How Vector Databases Solve for AI Performance Degradation
Vector databases cover the start, finish, and middle ground of complex LLM searches without the performance degradation seen in traditional databases.
Imagine setting out on a long hike on a beautiful morning. You’ve got a day’s journey ahead of you, and you start strong. Even better, the blazes that mark your trail are freshly painted, and you’re confident you’ll reach your destination before sundown. But the trail markers start to fade, then disappear completely. Several paths seem to appear and confront you with a question that few hikers like: Which trail is the right one? The paths look similar, so the best direction is unclear. You take your best shot.
A similar choice-making phenomenon is at work in Large Language Models. As described in the recent paper “Lost in the Middle: How Language Models Use Long Contexts,” a research team at Stanford University found that “performance is often highest when relevant information occurs at the beginning or end of the input context, and significantly degrades when models must access relevant information in the middle of long contexts. Furthermore, performance substantially decreases as the input context grows longer, even for explicitly long-context models.”
What should we make of these findings, and what are the implications for businesses building enterprise AI applications?
Better context, better content
As a general rule, the context for a question in an AI setting is placed at the beginning of the input, while the question itself is posed at the end. Given this ordering, information in the middle is sometimes effectively dropped. And since similarity is the guiding principle of AI, models take “informed guesses” as to which output is best to generate, much like choosing between those branching trails in the woods.
Conventional databases are not designed for the similarity computations AI applications need to make these guesses, so they spend time and computing power finding answers through tags, metadata, and keyword searches. Their tabular formats are inefficient for storing and retrieving complex data, and considerable effort goes into finding the right path to a specific answer. Traditional databases also degrade as you add more data or run more analytics against them, which further increases latency and overhead costs.
How Vector Databases Help Make Path Choices Faster and Better
A vector database encodes all the information about every potential ‘path’ in vector embeddings. Whether the information is structured or unstructured, each entity in your model – person, place, or concept – is represented as a point in a vector space.
This is a far more efficient way of assembling data for analysis: it captures semantic relationships between entities and provides a much faster path to an accurate answer. Vector databases are also built for scale, so your LLM’s context and responses don’t degrade as the data grows.
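To make the idea concrete, here is a minimal sketch of semantic similarity search over embeddings. The toy 3-dimensional vectors and entity names are purely illustrative assumptions; a real system would use an embedding model that produces vectors with hundreds or thousands of dimensions.

```python
import numpy as np

# Each entity (person, place, concept) becomes a point in vector space.
# These toy vectors are illustrative only; real embeddings come from a model.
entities = {
    "firewall-042":       np.array([0.9, 0.1, 0.2]),
    "firewall-311":       np.array([0.8, 0.2, 0.1]),
    "marketing-brochure": np.array([0.1, 0.9, 0.7]),
}

def cosine_similarity(a, b):
    """Similarity of direction between two vectors, ranging from -1 to 1."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# The query is embedded into the same space; the closest entities win.
query = np.array([0.85, 0.15, 0.15])  # e.g. "malfunctioning firewall"
ranked = sorted(entities.items(),
                key=lambda kv: cosine_similarity(query, kv[1]),
                reverse=True)
print(ranked[0][0])  # the most semantically similar entity to the query
```

A production vector database does the same nearest-neighbor comparison, but with approximate indexing so it stays fast across millions or billions of vectors.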
Why does this matter? Because as LLMs continue to expand, they will need ever-greater and more specific context to generate useful content. If you want to know which of the 400 firewalls in your network is malfunctioning or which of the 25 million molecules in your dataset is most likely to activate a biological target, a generic answer is useless. The more specific the question you can answer, the higher the business value. Based on its unique properties, a vector database is like deploying a neodymium “super magnet” to find the one needle you need in an enormous haystack.
Without vectorized information, you must reduce the scope of your search to minimize cost, and therefore sacrifice accuracy. That is a weak magnet for finding those needles in the haystack.
For example, you may try to gather all the context you have and put it into an LLM. The problem is that access to most large, commercially available LLMs is charged on a per-token basis – tokens being the basic units of text or code that LLMs use to process language – which gets costly very quickly. There is also a hard feasibility limit: most LLMs cap how many tokens you can feed in as input.
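A quick back-of-the-envelope calculation shows how fast this adds up. The per-token price and the rough four-characters-per-token ratio below are illustrative assumptions, not any vendor’s actual pricing.

```python
# Rough cost sketch. The price per 1k tokens and the ~4 characters-per-token
# ratio are illustrative assumptions, not real pricing for any provider.
def estimate_prompt_cost(text, usd_per_1k_tokens=0.01, chars_per_token=4.0):
    """Estimate the token count and cost of sending `text` as LLM input."""
    tokens = int(len(text) / chars_per_token)
    return tokens, tokens / 1000 * usd_per_1k_tokens

# Stuffing 1,000 long documents (~20,000 characters each) into one prompt...
full_context = "x" * (20_000 * 1_000)
tokens, cost = estimate_prompt_cost(full_context)
print(f"{tokens:,} tokens, ~${cost:,.2f} per query")
# ...millions of tokens per query, far beyond most models' context limits.
```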
Preventing generative from becoming generic
Addressing the challenge of LLM performance degradation involves both the cost and effort of training your model and the task of providing it with the proper context for searches, so that it generates answers you can actually use.
On both counts, vector databases cut cost and latency by preventing that degradation. Because it works with vectorized data, a vector database acts more like Waze navigation than a topographical map, telling you when to turn and in which direction without showing you all the (mostly irrelevant) details of your trip or the overall shape of the information landscape. It’s a little like sequencing a genome through fragment analysis rather than counting every one of your 3.2 billion A, C, G, and T nucleotides.
This combination of vector embeddings and semantic similarity search allows vector databases to cover the start, finish, and middle ground of a complex LLM search. If what you’re looking for is hidden within a single sentence that lives somewhere in 1,000 long documents, a vector database can use semantic search to reduce the context and find the answer. Also, in the case of advanced vector databases, operations happen in memory, so expensive, hard-to-find GPUs aren’t required.
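In practice this is the retrieval pattern most LLM applications use: embed the question, pull only the closest chunks from the vector index, and hand just those to the model. The sketch below assumes hypothetical embed and ask_llm helpers standing in for whichever embedding model and LLM client you actually use, and a plain in-memory list standing in for a real vector database index.

```python
import numpy as np

def embed(text):
    # Placeholder only: deterministic random vectors carry no real semantics.
    # A real system would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

# One vector per chunk (sentence- or paragraph-sized) from your documents.
chunks = [
    "Firewall fw-042 has dropped 30% of its packets since Tuesday.",
    "The marketing brochure was last updated in Q2.",
    # ...one entry per chunk from the 1,000 long documents...
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(question, k=3):
    """Return the k chunks most semantically similar to the question."""
    q = embed(question)
    scored = sorted(index, key=lambda item: float(q @ item[1]), reverse=True)
    return [chunk for chunk, _ in scored[:k]]

question = "Which firewall is malfunctioning?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# answer = ask_llm(prompt)  # hypothetical client call: only the relevant
#                           # chunks are sent, not all 1,000 documents
```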
The problem with most generative models is that they become too general. At some point you’re forced to choose between very large general-purpose models and much smaller but very domain-specific ones. Getting to what some call an AI “golden use case” requires making meaningful productivity, efficiency, and innovation advances. Vector databases check all three boxes.
Vector Databases are here to stay
Guido Appenzeller, former CTO of Intel’s Data Center Group and a special advisor to Andreessen Horowitz, notes that “vector databases…retrieve only relevant chunks of text via search, and feed a smaller amount of data into the LLM. This is already the dominant architecture today for cost reasons. The result is that this architecture is here to stay.”
Vector databases are here to stay because they help developers faced with that daunting choice of which output a generative AI should produce. They turn to the vector database super magnet to find the right context needle in the haystack, and to find their way out of the woods.
Welcome, Samuel, to the Vector Database Central writing team! Samuel is a KDB+/Q Engineer at KX, living in the world of Vector Databases and Machine Learning.