If generative AI represents significant business potential for your future, don't let a relational database keep you stuck in the past.
Sometimes it’s healthy to zoom out of the current tech frenzy and remember that the first databases were developed in the mid-1960s, with relational databases following in the early 1970s. Today very few of us still use technology from a half-century ago. Imagine using a 1965-vintage computer or a dot matrix printer to do your work.
For all this progress, the relational database has shown incredible staying power and still dominates data management today. Its sweet spot remains what it was originally designed for – numbers and text organized in rows and columns.
Meanwhile, successive data management eras have flourished since the 1970s, each designed for a new kind of application of its day: columnar databases, NoSQL, databases designed for unstructured data, and the current craze over cloud-based data warehousing, to name just a few. Now there’s a new wave of databases pending, brought on by the swelling interest in generative AI: vector databases. Like any hot trend, it’s important to understand the vector database’s strengths and limitations, and above all to assess its innate ability to perform vector processing across multiple use cases.
Why we need yet another database “physics”
Vector databases and AI are an ideal pairing, since AI generally, and large language models (LLMs) specifically, rely on processing data in vectors and matrices. This isn’t a programming preference; it’s just how the fundamental mathematics of AI works, most notably but not exclusively in the deep neural networks that are at the heart of the LLM. The sequence is Prompt -> Vector in -> Search -> Vector out -> Output.
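That prompt-to-output sequence can be sketched in a few lines. This is a deliberately minimal illustration, not any product’s API: the embed function below is a toy bag-of-words stand-in (a real system would call an LLM or embedding model), and the vocabulary, corpus, and cosine helper are invented for the example.

```python
import numpy as np

# Toy stand-in for a real embedding model: a bag-of-words count vector
# over a tiny fixed vocabulary. Purely illustrative.
VOCAB = ["database", "vector", "relational", "rows", "columns", "search"]

def embed(text: str) -> np.ndarray:
    words = text.lower().split()
    return np.array([words.count(w) for w in VOCAB], dtype=float)

corpus = [
    "vector search over a vector database",
    "relational rows and columns",
    "database rows",
]
index = np.stack([embed(doc) for doc in corpus])   # documents stored as vectors

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else float(a @ b) / denom

prompt = "vector database search"                   # Prompt
query = embed(prompt)                               # Vector in
scores = [cosine(query, v) for v in index]          # Search
best = int(np.argmax(scores))                       # Vector out
print(corpus[best])                                 # Output: the most similar document
```

Swapping the toy embed for a real embedding model, and the linear scan for an index, gives you the shape of a production pipeline.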
Some sources estimate that 80 to 90 percent of the world’s data is unstructured – audio, images, video, social media posts – which relational databases and data warehouses were never designed to make sense of. Data lakes, which do store unstructured data well, are limited in their ability to support computation on those data types. Both RDBMSs and data lakes can play a role in AI – as a store of data to feed into an AI app, or as a storage location for the same app’s output – but they are at best a suboptimal fit for this kind of data.
Why are vector databases ideal for generative AI?
Here are five reasons why vector databases are so well suited for generative AI:
They are a perfect fit for AI applications. Vector databases were designed to ingest structured and unstructured data, process it, and run models against it. Advanced vector databases can hold raw data appropriate for AI analysis (for example, time series) and search it directly from prompts, far more than products that merely “fudge” vector support by storing structured vector embeddings.
They let you optimize temporal data. Because vector databases can organize data by time, they let you play back and inspect history. You can quickly understand how an automated system behaved, why a compliance system flagged an anomaly, or how a trade unraveled. In practical terms, this enables use cases such as real-time transaction management, risk modeling, reconciliation, and better-informed predictive maintenance and health workflows.
They are extremely efficient. Vector databases run search techniques such as approximate nearest neighbor (ANN) queries over indexes of vector embeddings, which are naturally vector-based, to accelerate retrieval. That is why vector databases can process vector-based workloads of all types as much as 100 times faster than traditional data stores, and at a fraction of the query cost. Vector databases also let you query more dimensions of your data: because most data coming off machines or out of software is time-stamped, you can apply more interesting searches and derive greater insight from your vectors. Finally, vector embeddings are most closely associated with large language models, but vectors should – and in some cases do – serve large data models as well.
They get better over time. Vector databases leverage native capabilities of – and breakthrough innovations in – modern processor architectures, such as multicore CPUs, neural processing units, and heterogeneous computing (GPU and CPU on the same chip). GPUs, for example, are natively vector-based, which is why vector databases use them for optimized search, but the best vector databases can also make optimal use of in-memory compute.
They don’t rely on GPUs. When vectorized programming and storage accompany a vector database, whether it is searching vector embeddings or other key data such as anomalies, it may not need GPUs at all. That saves cost and resources, a real consideration given the rising cost of compute and dependence on the availability of GPU chips.
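As a concrete illustration of the temporal playback described above, here is a minimal sketch in plain Python. The Event schema, field names, and playback function are hypothetical, invented for this example rather than taken from any product; a real vector database would index the timestamp rather than scan a list.

```python
from dataclasses import dataclass

# Records carry timestamps, so any window of history can be replayed
# to inspect how a system behaved. Schema and values are illustrative.
@dataclass
class Event:
    ts: float      # epoch seconds
    kind: str      # e.g. "trade", "alert"
    value: float

log = [
    Event(100.0, "trade", 10.5),
    Event(105.0, "alert", -1.0),
    Event(110.0, "trade", 11.0),
    Event(120.0, "trade", 10.8),
]

def playback(events, start, end):
    """Return events with start <= ts < end, in time order."""
    return sorted((e for e in events if start <= e.ts < end), key=lambda e: e.ts)

# Replay the window around the anomaly to see what led up to it.
window = playback(log, 100.0, 111.0)
kinds = [e.kind for e in window]   # ["trade", "alert", "trade"]
```

The same idea, applied at database scale with a time index, is what lets you reconstruct why a compliance system flagged an anomaly or how a trade unraveled.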
Scaling value with generative AI
So what kind of value can you realize by building generative AI apps – indeed all types of AI analytics apps – in a true vector database?
If you are looking to process time-sensitive (including real-time) data, there are few limits to the value you can realize, whether your goal is recommendation engines, IoT-based automation, or natural language understanding. In many scenarios, real-time data is highly confidential, high fidelity, and therefore highly valuable. A virtue of a vector database for real-time and historical data is that you can work with the originating data, without first generating vector embeddings, and apply faster, optimized search methods directly to it. This shrinks the gap between training, feature engineering, and inference, so you do not end up paying double, triple, 10x, or 100x for disjointed and often unnecessary data processing, aggregation, cleansing, and transformation.
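The kind of fast, optimized search described above can be illustrated with a single vectorized pass over pre-normalized vectors. This is a minimal, exact brute-force sketch over random data (the database size, dimensionality, and top_k helper are invented for the example); production vector databases typically go further with approximate nearest neighbor indexes that avoid scanning every row.

```python
import numpy as np

rng = np.random.default_rng(0)
db = rng.normal(size=(10_000, 64))                   # 10,000 vectors, 64 dims
db /= np.linalg.norm(db, axis=1, keepdims=True)      # normalize once, up front

def top_k(query: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k most similar rows, best first."""
    q = query / np.linalg.norm(query)
    scores = db @ q                                  # one vectorized pass: cosine for all rows
    idx = np.argpartition(scores, -k)[-k:]           # O(n) top-k selection, unordered
    return idx[np.argsort(scores[idx])[::-1]]        # sort only the k winners

query = rng.normal(size=64)
hits = top_k(query)
```

Pre-normalizing turns cosine similarity into a single matrix-vector product, and np.argpartition selects the top k in linear time instead of sorting all 10,000 scores.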
Given their high scalability, ability to perform similarity-based search, and integration with machine learning models, vector databases that can mine both LLMs and Language Data Models (those that work directly on source data) can deliver:
Vastly improved customer experience. AI powered by vector processing can deliver more personalized recommendations faster, and eliminate the ‘now-what’ moment when customers reach a service rep who has little to no awareness of the customer-vendor relationship.
More innovative products delivered to market faster. Pattern identification is a key benefit of all AI – not just generative AI. When it is backed by historical and real-time data, companies can act on insights straight from a natural-language prompt: confidently predict market trends, make better strategic decisions, and develop product innovations that meet customer needs.
Higher productivity. When insights are right and delivered in a timely fashion, organizations such as manufacturers and medical providers can reduce waste and improve output. Individual operators on shop floors are empowered to take corrective action in the moment rather than waiting until the end of the day, while managers can continually monitor processes and quality, catch product failures early, and cut waste in ways that augment jobs rather than eliminate them.
From POC to ROI
Introducing any new technology into your ecosystem calls for a proof of concept. Vector databases are no exception. Given the clear business advantages of vector processing, you will realize greater benefits if you treat your proof of concept as a pilot. In other words, view your POC less as an experiment than as something to move quickly into production once it succeeds. Focus on an application with broad possibilities rather than a highly specific use case, and ideally anticipate second, third, and fourth use cases. Build levels of abstraction into the pilot that will become important when you roll out data and analytics and scale the application across your enterprise.
When choosing a vector database vendor, look not just for versatility across use cases beyond searching vector embeddings, but also for organizations with proven customer success, ideally with a proper customer success team. Such vendors bring deep experience in data modeling and preparation, and can apply the lessons of other customers’ successes to yours.
Prioritize proven vendors with track records, great support, and responsive customer service. Flashy technology presented at meetups doesn’t equate to enterprise reach, resilience, and success.
Successful AI deployments do require serious planning, architecture, thought, advice, and governance. Look at all of these factors when considering a vendor. Of course, if you’re building an application for generative AI, save yourself the headache of relational databases and start with vectors, but look beyond vector embeddings and generated content into golden use cases that make a difference. You’ll be glad you did.