It might surprise those outside AI that a video about vector databases would attract more than 100,000 views (and counting), but Patrick Löber’s video, Vector Databases Explained! (Embeddings & Indexes), has done just that. It’s a fantastic tutorial, but naturally, with only four minutes to explain a complex technology, he omitted several foundational ideas.
Let’s review Löber’s main points and explore vector data management in terms of its practical applications, design, essential features, and the demands we know of today.
“There is no doubt vector databases are fascinating and allow many great applications like similarity search, nearest neighbor, and recommendation engines.”
Yes! Similarity search, nearest neighbor, and recommendations are the essential queries vector databases are uniquely designed to solve. However, those aren’t applications; they’re query types. The applications you build on the back of these query patterns, such as customer support, predictive maintenance, and real-time predictive healthcare, are what provide business value.
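To ground those query types, here is a minimal, illustrative sketch of the nearest-neighbor pattern itself: brute-force cosine similarity over a small in-memory set of embeddings. The item IDs and vector values are made up for the example.

```python
import numpy as np

# Toy "index": each row is an embedding for one stored item (values are illustrative).
item_ids = ["ticket-101", "ticket-102", "ticket-103"]
embeddings = np.array([
    [0.12, 0.88, 0.33],
    [0.91, 0.05, 0.41],
    [0.15, 0.80, 0.30],
])

def top_k_similar(query: np.ndarray, k: int = 2):
    """Brute-force k-nearest-neighbor search using cosine similarity."""
    # Normalize rows and the query so a dot product equals cosine similarity.
    rows = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = rows @ q
    best = np.argsort(scores)[::-1][:k]
    return [(item_ids[i], float(scores[i])) for i in best]

# A query embedding that should rank ticket-101 and ticket-103 highest.
print(top_k_similar(np.array([0.10, 0.85, 0.35])))
```

Production vector databases replace this brute-force scan with approximate nearest-neighbor indexes such as HNSW or IVF so queries stay fast at millions of vectors, but the business value still comes from the applications layered on top.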
For example, a large telecommunications company uses vector database technology for customer call center conversational intelligence across dozens of locations and millions of monthly calls. The application extracts customer profile data when a support session is opened. As the conversation unfolds, an LLM performs similarity search in real time to uncover similar customer requests. The system augments the agent’s ability to respond with understanding, context, empathy, and awareness of problems other, similar customers have had. By combining customer metadata with generative AI, it generates more relevant, higher-quality responses grounded in similarity.
Any application that handles customer conversations and language can benefit from this use case pattern. And as it reveals, vector databases are insufficient by themselves. For example, before an application can find similar requests, it must first understand the profile of the customer who opened a specific call. That data may come from an IVR system, a CRM, a marketing database, or all three.
So, important AI applications require more than “just” a vector database, including:
1. The ability to easily combine data from disparate databases
2. The means to organize data by time
3. The ability to process data based on events
Together, these three capabilities provide a comprehensive platform for vector database-style applications, applying similarity search across structured data, unstructured data, and vector embeddings to enable new and disruptive use cases.
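To make the three capabilities concrete, here is a deliberately simplified sketch (not any particular product’s API): each record joins structured metadata, a timestamp, and an embedding; queries filter by attribute and time window; and an event handler both indexes and searches on every new call. All names and fields here are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
import numpy as np

# Hypothetical record joining structured metadata, a timestamp, and a vector embedding.
@dataclass
class CallRecord:
    customer_id: str      # might come from a CRM or IVR system (capability 1)
    region: str           # structured attribute used for filtering
    timestamp: datetime   # organizes data by time (capability 2)
    embedding: np.ndarray

records: list[CallRecord] = []   # stands in for the combined store

def similar_recent_calls(query_vec: np.ndarray, region: str,
                         window_days: int = 30, k: int = 3):
    """Similarity search restricted by metadata (region) and time (a recent window)."""
    cutoff = datetime.now() - timedelta(days=window_days)
    candidates = [r for r in records if r.region == region and r.timestamp >= cutoff]
    if not candidates:
        return []
    mat = np.stack([r.embedding for r in candidates])
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    scores = mat @ q
    order = np.argsort(scores)[::-1][:k]
    return [(candidates[i].customer_id, float(scores[i])) for i in order]

def on_new_call_event(event: dict):
    """Event-driven processing (capability 3): index the new call, then find similar recent ones."""
    records.append(CallRecord(event["customer_id"], event["region"],
                              event["timestamp"], event["embedding"]))
    return similar_recent_calls(event["embedding"], event["region"])
```

In a real system, the metadata would be joined in from the CRM or IVR store, and the event handler would hang off a message queue or change stream rather than a direct function call.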
“Vector databases are long-term memory for LLMs like GPT4”
“Long-term memory for LLMs” is an oversimplification of what vector databases do, and indeed it is the tagline of vector database startup Pinecone. Just as acceleration is only one thing a car does, long-term memory is only one of the things a vector database must do.
That is, Löber presumes just one database interaction model, query-response: you ask a question via ChatGPT and get an answer. This is the easiest use case to understand, and although AI models trained on static, historical stores are helpful and common, many important use cases are more dynamic and operational.
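For reference, the query-response pattern Löber has in mind typically looks like the sketch below: embed the question, retrieve the most similar stored context from the vector index, and hand both to the model. The `embed` and `llm_complete` functions are placeholders for whatever embedding model and LLM endpoint an application actually uses.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: a real application would call an embedding model here."""
    rng = np.random.default_rng(len(text))  # arbitrary vector so the example runs end to end
    return rng.random(384)

def llm_complete(prompt: str) -> str:
    """Placeholder: a real application would call an LLM API here."""
    return f"(answer generated from a prompt of {len(prompt)} characters)"

# "Long-term memory": previously stored snippets and their embeddings.
snippets = [
    "Our fiber plans include a 30-day trial.",
    "Router resets fix most intermittent outages.",
]
memory = [(text, embed(text)) for text in snippets]

def answer(question: str, k: int = 1) -> str:
    """Query-response: retrieve the most similar stored context, then ask the LLM."""
    q = embed(question)
    def cosine(v: np.ndarray) -> float:
        return float(v @ q / (np.linalg.norm(v) * np.linalg.norm(q)))
    ranked = sorted(memory, key=lambda item: cosine(item[1]), reverse=True)
    context = "\n".join(text for text, _ in ranked[:k])
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return llm_complete(prompt)

print(answer("My internet keeps dropping. What should I try?"))
```

The limitation is that this loop only reads from the store; it never reacts to new events or updates what is stored.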
In our customer call center example, it’s important to continually update the model with up-to-date conversations and customer requests. In a telecommunications call center, new pricing offers spark new customer questions, network outages are often unique and difficult to anticipate, and programming changes alter the nature of questions and answers.
Again, the combination of vector and time series data enables event-driven applications to identify patterns and emit observations like, “I’ve seen a similar pattern to this and predict that X will happen; therefore, I recommend that you Y.” Unstructured data helps applications ask the right question, and generative AI with temporal data helps stitch together what followed one minute, ten minutes, one month, or one year after similar incidents.
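As a purely illustrative sketch of that pattern (with made-up incident data, not any specific product’s API), the application might first retrieve past incidents whose embeddings are similar to the current one, then replay what was observed in the window after each match:

```python
from datetime import timedelta
import numpy as np

# Hypothetical historical store: incident embeddings plus a timeline of follow-up observations.
incidents = {
    "inc-001": {"embedding": np.array([0.2, 0.9, 0.1])},
    "inc-002": {"embedding": np.array([0.8, 0.1, 0.5])},
}
follow_ups = [  # (incident_id, time offset after the incident, observation)
    ("inc-001", timedelta(minutes=10), "call volume spiked in region A"),
    ("inc-001", timedelta(hours=2), "billing complaints followed the outage"),
    ("inc-002", timedelta(days=1), "no further customer contact needed"),
]

def what_followed_similar(query_vec: np.ndarray,
                          horizon: timedelta = timedelta(hours=3),
                          k: int = 1):
    """Find the most similar past incident(s), then report what was observed within `horizon` afterwards."""
    ids = list(incidents)
    mat = np.stack([incidents[i]["embedding"] for i in ids])
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    ranked = np.argsort(mat @ q)[::-1][:k]
    results = []
    for idx in ranked:
        inc_id = ids[idx]
        seen = [obs for (i, offset, obs) in follow_ups if i == inc_id and offset <= horizon]
        results.append((inc_id, seen))
    return results

# "I've seen a similar pattern to this" -> surface what happened next last time.
print(what_followed_similar(np.array([0.25, 0.85, 0.15])))
```

From those follow-ups, the recommendation step (the “therefore, I recommend that you Y” part) could then be generated by an LLM or by simple rules.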
“There are several vector databases available, including Pinecone, Weaviate, and Chroma; Redis also has a vector database, as do Milvus and Vespa AI.”
Except for Redis, the databases Löber mentions are specialized vector databases, optimized for vector embeddings. As described here, vector search is just one piece of the vector data management puzzle. Since the video aired, popular databases such as Elasticsearch 8.0, MongoDB Atlas, and KDB.AI have added vector capabilities. Each approach has advantages and limitations.
Specialized vector databases, unsurprisingly, are aimed directly at the newest, narrowest use cases of search-based generative AI. Most are under three years old and offer only minimum viable tooling, integration with other data sources, transaction support, and data operations.
Conventional databases that add similarity search are often relatively mature but, not being built for vector-oriented data, are unlikely to scale for serious generative AI applications. That said, for small data sets or simple use cases, they may be suitable.
Hybrid time-series / vector databases are designed around vector-oriented data, support real-time event-driven data operations, and robustly integrate with the database ecosystem from Snowflake to Azure to Oracle.
Beyond search: event-driven vector databases
A comprehensive enterprise vector database solution will make generative AI-driven decisions triggered by real-time events, pull from a diverse set of data stores, use similarity search to inform actions, and provide robust data operations at the scale, reliability, and security demanded by strategic enterprise applications.
Löber has set the table, capturing the essence of what vector databases are and how they work. But as vector databases enter mainstream use and power important enterprise use cases, firms will demand more, and many technology vendors are eager to help build a new vector of growth for the database market.