SnapLogic Technical Blog

Simplify Your LLM Workflows: Integrating Vertex AI RAG with SnapLogic

pawit_roy

This document explores the integration of Google Cloud's Vertex AI Retrieval Augmented Generation (RAG) capabilities with SnapLogic. We will look at how Vertex AI RAG works, its benefits over traditional vector databases, and practical applications within the SnapLogic platform. The guide covers setting up and using Vertex AI RAG, automating knowledge feeds, and integrating with SnapLogic's Generate Snaps for enhanced LLM performance.

Vertex AI RAG Engine

The Vertex AI RAG Engine streamlines the retrieval-augmented generation (RAG) process through two primary steps:

  • Knowledge Management: The Vertex AI RAG Engine establishes and maintains a knowledge base by creating a corpus, which serves as an index for storing source files.
  • Retrieval Query: Upon receiving a prompt, the Vertex AI RAG Engine efficiently searches this knowledge base to identify and retrieve information most relevant to the request.

The Vertex AI RAG Engine combines Google Cloud's Vertex AI with the RAG architecture to produce accurate, contextually relevant LLM responses: when a prompt arrives, it retrieves the most relevant information from the corpus and hands that context to the LLM, which generates a response grounded in the retrieved material.
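
To make these two steps concrete, here is a minimal sketch using the Vertex AI Python SDK. It is illustrative only: module paths and signatures vary across SDK versions (older releases expose the module as vertexai.preview.rag), and the project ID, corpus name, and file path are placeholders.

```python
import vertexai
from vertexai import rag  # older SDKs: from vertexai.preview import rag

vertexai.init(project="your-project-id", location="us-central1")

# Knowledge management: create a corpus and ingest a source file into it.
corpus = rag.create_corpus(display_name="my-corpus")
rag.upload_file(
    corpus_name=corpus.name,
    path="Basics of SnapLogic.pdf",
    display_name="Basics of SnapLogic.pdf",
)

# Retrieval query: fetch the chunks most relevant to a prompt.
response = rag.retrieval_query(
    rag_resources=[rag.RagResource(rag_corpus=corpus.name)],
    text="What snap types does SnapLogic have?",
)
print(response)
```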

How It Differs from a Vector Database

While both traditional vector databases and the Vertex AI RAG Engine are designed to enhance LLM responses by providing external knowledge, they differ significantly in their approach and capabilities.

Vector Databases

Vector databases primarily focus on storing and querying vector embeddings. To use one with an LLM for RAG, you typically need to do the following (sketched in code after this list):

  • Manually manage embedding generation: You are responsible for generating vector embeddings for your source data using an embedding model.
  • Handle retrieval logic: You need to implement the logic for querying the vector database, retrieving relevant embeddings, and then mapping them back to the original source text.
  • Integrate with LLM: The retrieved text then needs to be explicitly passed to the LLM as part of the prompt.
  • No built-in LLM integration: They are agnostic to the LLM and require manual integration for RAG workflows.
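
To make that orchestration burden concrete, here is a minimal, self-contained sketch of a do-it-yourself RAG loop over a plain vector store. The embed() stub stands in for a real embedding model, the in-memory list stands in for the vector database, and the sample chunks are invented for illustration.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a real embedding model: deterministic dummy vectors."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(8)

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# 1. Manually generate and store an embedding for every source chunk.
chunks = [
    "SnapLogic groups Snaps into types such as Read, Parse, and Write.",
    "Pipelines are built by connecting Snaps on the designer canvas.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# 2. Implement the retrieval logic yourself: embed the query, rank chunks.
query = "What snap types does SnapLogic have?"
query_vec = embed(query)
ranked = sorted(index, key=lambda item: cosine_distance(item[1], query_vec))
context = "\n".join(chunk for chunk, _ in ranked[:2])

# 3. Explicitly splice the retrieved text into the LLM prompt yourself.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this string would then be sent to whichever LLM you use
```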

Vertex AI RAG Engine

The Vertex AI RAG Engine offers a more integrated and streamlined solution, abstracting away much of the complexity. Key differences include:

  • Integrated knowledge management: It handles the entire lifecycle of your knowledge base, from ingesting raw source files to indexing and managing the corpus. You don't need to manually generate embeddings or manage vector storage.
  • Automated retrieval: The engine automatically performs the retrieval of relevant information from its corpus based on the user's prompt.
  • Seamless LLM integration: It's designed to work directly with Vertex AI's LLMs, handling the contextualization of the prompt with retrieved information before passing it to the LLM.
  • End-to-end solution: It provides a more comprehensive solution for RAG, simplifying the development and deployment of RAG-powered applications.

In essence, a traditional vector database is a component that requires significant orchestration to implement RAG. In contrast, the Vertex AI RAG Engine is a more complete, managed service that simplifies the entire RAG workflow by providing integrated knowledge management, retrieval, and LLM integration.

This fundamental benefit significantly simplifies the often complex RAG processing pipeline. Fewer moving parts mean greater efficiency, fewer potential points of failure, and more accurate, relevant results when leveraging large language models (LLMs) for tasks that require external knowledge. The simplification also improves the manageability and scalability of RAG-based systems, making them accessible and effective for a wider range of applications.

Compare the indexing pipeline between the old and the new.

Compare the RAG pipeline between the old and the new.

Using the Vertex AI RAG Engine through Vertex AI's generative models (rather than calling the Gemini API directly) offers a further advantage: retrieval is exposed as a built-in tool, streamlining data access for the model. This native integration within Vertex AI optimizes information flow, reduces complexity, and leads to a more robust system for retrieval-augmented generation.

 

Vertex AI RAG Engine in SnapLogic

SnapLogic now includes a set of Snaps for utilizing the Vertex AI RAG Engine.

Corpus Management

The following Snaps are available for managing RAG corpora (the corresponding Vertex AI API calls are sketched after the list):

  1. Google Vertex AI RAG Create Corpus
  2. Google Vertex AI RAG List Corpus
  3. Google Vertex AI RAG Get Corpus
  4. Google Vertex AI RAG Delete Corpus
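
For orientation, these four Snaps map onto the corpus lifecycle operations of the Vertex AI RAG API, roughly as sketched below (signatures vary by SDK version; the project ID is a placeholder):

```python
import vertexai
from vertexai import rag

vertexai.init(project="your-project-id", location="us-central1")

corpus = rag.create_corpus(display_name="my-corpus")  # Create Corpus
for c in rag.list_corpora():                          # List Corpus
    print(c.name, c.display_name)
corpus = rag.get_corpus(name=corpus.name)             # Get Corpus
rag.delete_corpus(name=corpus.name)                   # Delete Corpus
```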

 

File Management in Corpus

The following Snaps enable file management within a RAG corpus (again with an equivalent API sketch after the list):

  1. Google Vertex AI RAG Corpus Add File
  2. Google Vertex AI RAG Corpus List File
  3. Google Vertex AI RAG Corpus Get File
  4. Google Vertex AI RAG Corpus Remove File
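
Likewise, these Snaps correspond to the per-file operations on a corpus, roughly as below (a sketch; the corpus resource name is a placeholder):

```python
import vertexai
from vertexai import rag

vertexai.init(project="your-project-id", location="us-central1")
corpus_name = "projects/your-project-id/locations/us-central1/ragCorpora/1234"

rag_file = rag.upload_file(                        # Corpus Add File
    corpus_name=corpus_name,
    path="Basics of SnapLogic.pdf",
    display_name="Basics of SnapLogic.pdf",
)
for f in rag.list_files(corpus_name=corpus_name):  # Corpus List File
    print(f.name, f.display_name)
rag_file = rag.get_file(name=rag_file.name)        # Corpus Get File
rag.delete_file(name=rag_file.name)                # Corpus Remove File
```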

Retrieval

For performing retrieval operations, use the following Snap:

  1. Google Vertex AI RAG Retrieval Query

Try using Vertex AI RAG

Let's walk through a practical example of how to leverage the Vertex AI RAG Engine within SnapLogic. This section will guide you through setting up a corpus, adding files, performing retrieval queries, and integrating the results into your LLM applications.

Preparation steps

Before integration, two key steps are required. First, set up a Google Cloud project with the required APIs enabled, billing linked, and the necessary permissions granted.

List of the required Google APIs to enable

Second, configure how SnapLogic authenticates to Google Cloud. SnapLogic offers three methods for connecting to Google Cloud APIs (a token-exchange sketch for the recommended method follows the list):

  1. Service Account (recommended): SnapLogic can utilize an existing Service Account that possesses the necessary permissions.
  2. OAuth2: This method requires configuring an OAuth2 client in Google Cloud and completing the authorization flow in SnapLogic.
  3. Access Token: An Access Token is a temporary security credential for accessing Google Cloud APIs; it must be refreshed manually when it expires.
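
All three methods ultimately produce an OAuth2 bearer token with the cloud-platform scope. As a rough illustration of the token exchange behind the recommended Service Account method, here is how the google-auth library mints a short-lived access token from a JSON key (the key file path is a placeholder):

```python
import google.auth.transport.requests
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    "service-account-key.json",  # JSON key downloaded from Google Cloud
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
# Exchange the long-lived key for a short-lived OAuth2 access token.
credentials.refresh(google.auth.transport.requests.Request())
print(credentials.token)  # bearer token usable against Vertex AI endpoints
```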

Create the corpus

To build the corpus, use the Google Vertex AI RAG Create Corpus Snap.

  1. Place the Google Vertex AI RAG Create Corpus Snap.
  2. Create a Google GenAI Service Account account:

    Upload the Service account JSON key file that you obtained from Google Cloud Platform, and then select the project and resource location you want to use. We recommend using the “us-central1” location.

  3. Edit the configuration: set the display name, and set the Snap execution to "Validate & Execute."
  4. Validate the pipeline to obtain the corpus result in the output.

    If the result is similar to the image above, your corpus is ready and you can start adding documents to it.

Upload the document

To upload documents to Google Vertex AI RAG, build a SnapLogic pipeline that connects the File Reader and Google Vertex AI RAG Corpus Add File Snaps. The File Reader accesses the document and passes its content to the Google Vertex AI RAG Corpus Add File Snap, which uploads it to the specified Vertex AI RAG corpus.

Example

  1. Download the example document. Example file: Basics of SnapLogic.pdf
  2. Configure the File Reader Snap as follows:
  3. Configure the Corpus Add File Snap as follows:

These steps add Basics of SnapLogic.pdf to the corpus created in the previous section. If the pipeline runs successfully, the output will appear as follows.

Retrieve query

This section demonstrates how to use the Google Vertex AI RAG Retrieval Query Snap to fetch relevant information from the corpus. This Snap takes a user query and returns the most pertinent documents or text snippets.

Example

From the existing corpus, we will query "What snap types does SnapLogic have?" and configure the Snap accordingly.

The result displays a list of text chunks related to the question, ordered by score. The score is calculated from the similarity or distance between the query and each text chunk; the exact measure depends on the vector database backing the corpus. By default, the score is COSINE_DISTANCE, so lower values indicate closer matches.
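
For comparison, the same retrieval performed directly against the API looks roughly like this (a sketch; the similarity_top_k parameter and the score field have moved or been renamed, e.g. to distance, across SDK versions):

```python
import vertexai
from vertexai import rag

vertexai.init(project="your-project-id", location="us-central1")
corpus_name = "projects/your-project-id/locations/us-central1/ragCorpora/1234"

response = rag.retrieval_query(
    rag_resources=[rag.RagResource(rag_corpus=corpus_name)],
    text="What snap types does SnapLogic have?",
    similarity_top_k=5,  # return at most five chunks
)
# Each context carries the chunk text and its score (COSINE_DISTANCE by
# default, so lower means more similar); older SDKs call the field "distance".
for ctx in response.contexts.contexts:
    print(round(ctx.score, 4), ctx.text[:80])
```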

Generate the result

Now that we have successfully retrieved relevant information from our corpus, the next crucial step is to leverage this retrieved context to generate a coherent and accurate response using an LLM. This section will demonstrate how to integrate the results from the Google Vertex AI RAG Retrieval Query Snap with a generative AI model, such as the Google Gemini Generate Snap, to produce a final answer based on the augmented information.

Here's an example prompt to use in the prompt generator:
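
One illustrative template is shown below; the {{context}} and {{question}} variables are placeholders, and the actual names depend on how the upstream Snaps map their output fields:

```
Answer the question using only the context below. If the context does not
contain the answer, say that you don't know.

Context:
{{context}}

Question: {{question}}
```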

The final answer will appear as follows:

Additionally, the integration between Vertex AI RAG and SnapLogic provides the significant benefit of cross-model compatibility. This means that the established RAG workflows and data retrieval processes can be seamlessly adapted and utilized with different large language models beyond just Vertex AI, such as open-source models or other commercial LLMs. This flexibility allows organizations to leverage their investment in RAG infrastructure across a diverse ecosystem of AI models, enabling greater adaptability, future-proofing of applications, and the ability to choose the best-suited LLM for specific tasks without rebuilding the entire information retrieval pipeline. This cross-model benefit ensures that the RAG solution remains versatile and valuable, regardless of evolving LLM landscapes.

Auto-retrieve query with the Vertex AI built-in tool

Using the built-in tool in the Vertex AI Gemini Generate Snap for auto-retrieval significantly simplifies the RAG pipeline. Instead of manually performing a retrieval query and then passing the results to a separate generation step, this integrated approach allows the Gemini model to automatically consult the configured RAG corpus based on the input prompt. This reduces the number of steps and the complexity of the pipeline, as the retrieval and generation processes are seamlessly handled within a single Snap. It ensures that the LLM always has access to the most relevant contextual information from your knowledge base without requiring explicit orchestration, leading to more efficient and accurate content generation.

The Snap configuration example below demonstrates how to configure the Built-in tools section: select the vertexRagStore type and designate the target corpus.
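
Outside of SnapLogic, the equivalent setup with the Vertex AI SDK looks roughly like this (a sketch; the model name and corpus resource name are placeholders, and class locations vary across SDK versions):

```python
import vertexai
from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool

vertexai.init(project="your-project-id", location="us-central1")
corpus_name = "projects/your-project-id/locations/us-central1/ragCorpora/1234"

# Wrap the corpus as a retrieval tool (the vertexRagStore built-in tool).
rag_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[rag.RagResource(rag_corpus=corpus_name)],
        ),
    )
)

# The model consults the corpus automatically; no explicit retrieval step.
model = GenerativeModel("gemini-1.5-flash", tools=[rag_tool])
response = model.generate_content("What snap types does SnapLogic have?")
print(response.text)
print(response.candidates[0].grounding_metadata)  # source-tracking metadata
```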

The final answer generated using the auto-retrieval process will be displayed below.

The response includes grounding metadata for source tracking, allowing users to trace where each piece of information originated. This enhances transparency, supports fact verification, and builds trust in the accuracy and reliability of the generated content: users can inspect the source material, cross-reference facts, and gain a fuller understanding, boosting the system's utility and trustworthiness.

Summary

This document demonstrates how to integrate Google Cloud's Vertex AI Retrieval Augmented Generation (RAG) capabilities with SnapLogic to enhance LLM workflows. Key takeaways include:

  • Streamlined RAG Process: Vertex AI RAG simplifies knowledge management and retrieval, abstracting away complexities like manual embedding generation and retrieval logic, which are typically required with traditional vector databases.
  • Integrated Solution: Unlike standalone vector databases, Vertex AI RAG offers an end-to-end solution for RAG, handling everything from ingesting raw files to integrating with LLMs.
  • SnapLogic Integration: SnapLogic provides dedicated Snaps for managing Vertex AI RAG corpora (creating, listing, getting, deleting), managing files within corpora (adding, listing, getting, removing), and performing retrieval queries.
  • Practical Application: The guide provided a step-by-step example of setting up a corpus, uploading documents, performing retrieval queries using the Google Vertex AI RAG Retrieval Query Snap, and integrating the results with generative AI models like the Google Gemini Generate Snap for contextually accurate responses.
  • Cross-Model Compatibility: A significant benefit of this integration is the ability to adapt established RAG workflows and data retrieval processes with various LLMs beyond just Vertex AI, including open-source and other commercial models, ensuring flexibility and future-proofing.
  • Automated Retrieval with Built-in Tools: The integration allows for automated retrieval via built-in tools in the Vertex AI Gemini Generate Snap, simplifying the RAG pipeline by handling retrieval and generation seamlessly within a single step.

By leveraging Vertex AI RAG with SnapLogic, organizations can simplify the development and deployment of RAG-powered applications, leading to more accurate, contextually relevant, and efficient LLM responses.
