Finally, we’ve got time to show you some of the work we’ve been doing. Check the previous posts (three parts) to catch up:
Bedrock is growing fast, as AWS keeps adding features. One of the latest is the Knowledge Bases functionality:
“Amazon Bedrock’s knowledge bases allow you to gather data sources into a central repository of information. This enables you to create applications that utilize retrieval augmented generation (RAG), a technique where retrieving information from data sources enhances the generation of model responses”.
We’ve updated the Tweets example, replacing LangChain with this new feature to see the difference. We have to say that the experience was relatively seamless: we found no issues integrating and creating the Knowledge Base, or ingesting the data and generating the vector embeddings.
Architecture
We use what some would call “naive RAG” for this example. We’ll enrich a model with data retrieved directly from X (formerly Twitter) so the model can tell us about the latest features AWS has released for some of its services.
A bit about RAG Architecture or framework:
The RAG framework supplies the LLM with external data sources to augment the context of the source prompt. In this architecture, we provide the model with up-to-date information about AWS features using Tweets from AWS.
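As a minimal sketch of that flow (a toy in-memory retriever stands in for the vector store and embeddings model; this is an illustration, not the actual pipeline):

```python
# Toy sketch of the naive RAG flow: retrieve relevant context, then build an
# augmented prompt. The "embedding" here is plain word overlap; the real
# pipeline uses OpenSearch Serverless and Titan Embeddings instead.

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query."""
    q_words = set(query.lower().replace("?", "").split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_augmented_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved context to the user question."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

# Hypothetical mini-dataset standing in for the ingested Tweets.
tweets = [
    "AWS EKS now supports Kubernetes version 1.29",
    "Amazon S3 announces a new storage class",
]
prompt = build_augmented_prompt("What is new in EKS?", tweets)
```

The model then answers from the augmented prompt instead of relying only on its training data.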

We are not creating a new diagram, as it is virtually the same as this one:

The architecture adds only one new component; the rest is just configuration. That component is the Knowledge Base (and its data source) connected to Amazon OpenSearch Serverless. The source data lives in the S3 Training-datasets bucket.
OpenSearch Serverless
Now, let’s get to the cool stuff. 🙂 As the RAG database, we’ve chosen Amazon OpenSearch Serverless with the vector engine: a database fully integrated with Knowledge Bases that is relatively easy to provision (setting it up properly in production is another matter, so don’t be fooled). Its design is much more streamlined than its managed counterpart, which can be a pain until you get it effectively up and running. This one is nicely designed, with individual policies for security and networking and minimal configuration components.
A word to the wise: you pay for the resources you use (compute and storage), but you are billed by the hour for consumed OCUs, so leaving this infrastructure up and running can be expensive, especially for some home testing. There are cheaper alternative databases. Ideally, use Terraform or CDK to provision and delete this.
We need to provision a few components: the security and network policies, a collection to store our vectors and indexes (please check the documentation for more information), and the Knowledge Base and its data source.
Collection
We define a collection with the VECTORSEARCH engine type and, to save resources, disable the standby replicas option:
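A sketch of that request via boto3’s `opensearchserverless` client (the collection name is an example, not the one from our setup):

```python
import json

# Request payload for the OpenSearch Serverless collection
# (boto3 `opensearchserverless.create_collection`).
create_collection_request = {
    "name": "tweets-rag-collection",   # hypothetical name
    "type": "VECTORSEARCH",            # vector engine for embeddings
    "standbyReplicas": "DISABLED",     # saves OCUs for dev/testing
}

# With boto3 this would be:
#   aoss = boto3.client("opensearchserverless")
#   aoss.create_collection(**create_collection_request)
print(json.dumps(create_collection_request, indent=2))
```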

We need to create the different permissions and policies: security policies, role …
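The two security policies look roughly like this (collection name is an example; the network policy allows public access here for brevity, which you should lock down in production):

```python
collection = "tweets-rag-collection"  # example name, must match the collection

# Encryption policy: use an AWS-owned KMS key for this collection.
encryption_policy = {
    "Rules": [{"ResourceType": "collection",
               "Resource": [f"collection/{collection}"]}],
    "AWSOwnedKey": True,
}

# Network policy: public access for testing only; use VPC endpoints
# in production.
network_policy = [{
    "Rules": [{"ResourceType": "collection",
               "Resource": [f"collection/{collection}"]}],
    "AllowFromPublic": True,
}]

# boto3 sketch:
#   aoss.create_security_policy(name="...", type="encryption",
#                               policy=json.dumps(encryption_policy))
#   aoss.create_security_policy(name="...", type="network",
#                               policy=json.dumps(network_policy))
```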

Permissions for the index:
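A data access policy along these lines grants the Knowledge Base service role the index permissions it needs (the account id and role name are placeholders):

```python
collection = "tweets-rag-collection"  # example name

# Data access policy: grants the Bedrock Knowledge Base service role
# collection- and index-level permissions. Principal ARN is a placeholder.
data_access_policy = [{
    "Description": "Access for the Bedrock Knowledge Base role",
    "Principal": ["arn:aws:iam::123456789012:role/bedrock-kb-role"],
    "Rules": [
        {"ResourceType": "collection",
         "Resource": [f"collection/{collection}"],
         "Permission": ["aoss:DescribeCollectionItems"]},
        {"ResourceType": "index",
         "Resource": [f"index/{collection}/*"],
         "Permission": ["aoss:CreateIndex", "aoss:DescribeIndex",
                        "aoss:ReadDocument", "aoss:WriteDocument"]},
    ],
}]

# Created with:
#   aoss.create_access_policy(name="...", type="data",
#                             policy=json.dumps(data_access_policy))
```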

Once provisioned, we will get an entry in the console that represents our collection:

The vector configuration and metadata:
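The index mapping would look roughly like this. The field names (`vector`, `text`, `metadata`) are our choice and must match the Knowledge Base field mapping later; 1536 is the output dimension of Titan Embeddings G1 – Text:

```python
# k-NN index mapping for the collection (created via the OpenSearch API).
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "vector": {
                "type": "knn_vector",
                "dimension": 1536,  # Titan Embeddings G1 - Text output size
                "method": {"engine": "faiss", "name": "hnsw",
                           "space_type": "l2"},
            },
            "text": {"type": "text"},      # raw chunk text
            "metadata": {"type": "text"},  # Bedrock-managed metadata
        }
    },
}
```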

Ingesting the data is a different matter. In this architecture, we ingest the data with ECS and a Python script that retrieves Tweets from X and stores them in our vector database. Here we’ll only cover the Knowledge Base part:

Amazon OpenSearch Serverless is fully integrated with Knowledge Bases, so the configuration is pretty straightforward:

Once the Knowledge Base is created and configured, we can upload the data to our vector store, creating vector embeddings with the embeddings model. For this architecture, we’ve selected Titan Embeddings G1.
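A sketch of the `bedrock-agent` create request; the ARNs, ids, and index/field names below are placeholders, and the field mapping must match the index defined in the collection:

```python
# Sketch of `bedrock-agent.create_knowledge_base`. All ARNs/ids are
# placeholders; amazon.titan-embed-text-v1 is Titan Embeddings G1 - Text.
create_kb_request = {
    "name": "tweets-knowledge-base",  # example
    "roleArn": "arn:aws:iam::123456789012:role/bedrock-kb-role",  # placeholder
    "knowledgeBaseConfiguration": {
        "type": "VECTOR",
        "vectorKnowledgeBaseConfiguration": {
            "embeddingModelArn":
                "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v1",
        },
    },
    "storageConfiguration": {
        "type": "OPENSEARCH_SERVERLESS",
        "opensearchServerlessConfiguration": {
            # placeholder collection ARN
            "collectionArn": "arn:aws:aoss:us-east-1:123456789012:collection/abc123",
            "vectorIndexName": "tweets-index",
            "fieldMapping": {  # must match the index mapping
                "vectorField": "vector",
                "textField": "text",
                "metadataField": "metadata",
            },
        },
    },
}
# bedrock_agent = boto3.client("bedrock-agent")
# bedrock_agent.create_knowledge_base(**create_kb_request)
```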

We are ingesting Tweets, so after some testing we’ve selected the following values for the fixed-size chunking strategy, which work well, at least for this dataset:

We also need to create a data source for the Knowledge Base with some parameters:
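A sketch of the data source pointing the Knowledge Base at the S3 bucket with the Tweets. The ids and bucket ARN are placeholders, and the chunking values shown are hypothetical examples to illustrate the shape of the request, not the ones we tuned for our dataset:

```python
# Sketch of `bedrock-agent.create_data_source` with a fixed-size chunking
# strategy. Ids, bucket ARN, and chunking values are placeholders.
create_data_source_request = {
    "knowledgeBaseId": "KBID123456",  # placeholder
    "name": "tweets-s3-source",       # example
    "dataSourceConfiguration": {
        "type": "S3",
        "s3Configuration": {
            "bucketArn": "arn:aws:s3:::training-datasets-bucket",  # placeholder
        },
    },
    "vectorIngestionConfiguration": {
        "chunkingConfiguration": {
            "chunkingStrategy": "FIXED_SIZE",
            "fixedSizeChunkingConfiguration": {
                "maxTokens": 300,         # hypothetical value, tune per dataset
                "overlapPercentage": 20,  # hypothetical value
            },
        },
    },
}
# bedrock_agent.create_data_source(**create_data_source_request)
```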

Then, the ingestion can be started:
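Via the API, that would roughly be (ids are placeholders):

```python
# Kicking off the ingestion/sync job via `bedrock-agent`. The job reads from
# S3, chunks the documents, embeds them, and writes to the collection.
start_ingestion_request = {
    "knowledgeBaseId": "KBID123456",  # placeholder
    "dataSourceId": "DSID123456",     # placeholder
}
# bedrock_agent = boto3.client("bedrock-agent")
# job = bedrock_agent.start_ingestion_job(**start_ingestion_request)
# Poll with get_ingestion_job until the job status is COMPLETE.
```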

Finally, our data and vector embeddings are ingested and synchronized:

Chatting with your dataset
Using the AWS console, we can quickly test our architecture. We’ve selected the Claude v2.1 model for this example, but you can choose others. We ask the model about the latest features of AWS EKS:
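The same query can be run programmatically with `bedrock-agent-runtime.retrieve_and_generate`; a sketch, with the Knowledge Base id as a placeholder:

```python
# Sketch of the console test via the API. The Knowledge Base id is a
# placeholder; anthropic.claude-v2:1 is the Claude v2.1 model id.
rag_request = {
    "input": {"text": "What are the latest features of AWS EKS?"},
    "retrieveAndGenerateConfiguration": {
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KBID123456",  # placeholder
            "modelArn":
                "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2:1",
        },
    },
}
# runtime = boto3.client("bedrock-agent-runtime")
# response = runtime.retrieve_and_generate(**rag_request)
# response["output"]["text"] holds the answer; response["citations"]
# holds the retrieved sources.
```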

We can browse the citations [1] and [2] and their sources in our Tweets dataset:

So, our embedding and model seem to work pretty well with our external data source 🙂



