Recent Content
How to get filename from file reader
I need to get the name of the file read by the file reader snap and use it as part of the data downstream. Really the goal to save the file name as part of the data pulled from a file. Screen snippet attached here. I have spent some time looking into this but there is no obvious method to me. Please I will appreciate any input and recommendations. Thanks.33Views0likes2CommentsGenerate expression file from database query
For some data transformations I would like to use an expression file that is generated each night, instead of querying a SQL database everytime the pipeline is started. I already have data available in the database and now I need to get the data transformed in the expression file JSON format, but I am stuck on getting the right ouput. Coming from a XML oriented environment (with extensive knowledge in XSL but not so much JSON) I have quite some issues with switching to snaps and JSON... Data sample (JSON) from the database [ { "code": "ARTICLEGROUP", "source": "JLG", "target": "10" }, { "code": "COMMODITYCODE", "source": "31251501", "target": "0" }, { "code": "COUNTRYCODE", "source": "AF", "target": "AF" }, { "code": "COUNTRYCODE", "source": "AL", "target": "AL" }, { "code": "COUNTRYCODE", "source": "DZ", "target": "DZ" }, { "code": "COUNTRYCODE", "source": "AS", "target": "AS" }, { "code": "COUNTRYCODE", "source": "AD", "target": "AD" }, { "code": "COUNTRY_ISOCODE", "source": "ARE", "target": "AE" }, { "code": "COUNTRY_ISOCODE", "source": "AFG", "target": "AF" }, { "code": "COUNTRY_ISOCODE", "source": "ALA", "target": "AX" }, { "code": "COUNTRY_ISOCODE", "source": "ALB", "target": "AL" }, { "code": "UOM", "source": "EA", "target": "pi" }, { "code": "UOM", "source": "M", "target": "me" }, { "code": "UOM", "source": "BG", "target": "za" } ] Desired output { "ARTICLEGROUP" : { "JLG": "10" }, "COMMODITYCODE" : { "31251501": "0" }, "COUNTRYCODE" : { "AF": "AF", "AL": "AL", "DZ": "DZ", "AS": "AS", "AD": "AD" }, "COUNTRY_ISOCODE" : { "ARE": "AE", "AFG": "AF", "ALA": "AX", "ALB": "AL" }, "UOM" : { "EA": "pi", "M": "me", "BG": "za" } , getValue : (type, source) => this[type][source] } Anyone can point me in the right direction? Have tried multiple things already, but I can't get the "arrays" right for some reason.Can we generate XML file in pretty print format using native snapLogic snaps?
Hi Team, I was curious to know if anybody has worked on a use case where they are generating an XML file in pretty print format? We do have "pretty-print" option in JSON formatter however the same is not available in XML formatter snap. Any suggestions? Thanking in advance. Best Regards, DarshSolved691Views0likes3Comments401 error with HTTP Client and NTLM
Hello, I'm trying to connect to an API with NTLM authentication using the Snap HTTP Client. Problem: I'm getting a 401 - Unauthorized response from the endpoint. The same request responds successfully on Postman. I think the problem comes from the Linux Groundplex. Did anyone had the same issue? How did you solve it? Thank you.SolvedIngesting Data into Veeva Vault CRM via SnapLogic – Alternatives to SFDC Snaps
We are currently in the process of migrating from our existing Veeva CRM (Salesforce-based) platform to Veeva Vault CRM. In our current integration landscape, we use SnapLogic to ingest data from our Specialty Pharma SFTP source into Veeva CRM, leveraging the Salesforce (SFDC) snaps for data ingestion and transformation. However, as we transition to Vault CRM, we’ve identified a gap—SnapLogic does not currently provide a native Snap pack for Veeva Vault CRM. We understand that support for Vault CRM is on SnapLogic’s product roadmap, but it is not expected in the immediate future. As part of our integration planning, we are reaching out to the SnapLogic community and experts to explore the following: Are there any existing Snap packs (e.g., REST, HTTP Client, SOAP, or JDBC snaps) that can be configured to support integration with Vault CRM? Has anyone implemented custom pipelines or reusable components for Vault CRM ingestion using generic SnapLogic snaps? Any known limitations, authentication considerations or Vault-specific constraints we should be aware of when building these integrations? We greatly appreciate any insights, lessons learned, or recommendations from those who have explored similar integration use cases. Thank you in advance for your time and input.26Views0likes2CommentsSnapLogic MCP Support
8 MIN READ Introduction Since the inception of the Model Context Protocol (MCP), we've been envisioning and designing how it can be integrated into the SnapLogic platform. We've recently received a significant number of inquiries about MCP, and we're excited to share our progress, the features we'll be supporting, our release timeline, and how you can get started creating MCP servers and clients within SnapLogic. If you're interested, we encourage you to reach out! Understanding the MCP Protocol The MCP protocol allows tools, data resources, and prompts to be published by an MCP server in a way that Large Language Models (LLMs) can understand. This empowers LLMs to autonomously interact with these resources via an MCP client, expanding their capabilities to perform actions, retrieve information, and execute complex workflows. MCP Protocol primarily supports: Tools: Functions an LLM can invoke (e.g., data lookups, operational tasks). Resources: File-like data an LLM can read (e.g., API responses, file contents). Prompts: Pre-written templates to guide LLM interaction with the server. Sampling (not widely used): Allows client-hosted LLMs to be used by remote MCP servers. An MCP client can, therefore, request to list available tools, call specific tools, list resources, or read resource content from a server. Transport and Authentication MCP protocol offers flexible transport options, including STDIO or HTTP (SSE or Streamable-HTTP) for local deployments, and HTTP (SSE or Streamable-HTTP) for remote deployments. While the protocol proposes OAuth 2.1 for authentication, an MCP server can also use custom headers for security. Release Timeline We're excited to bring MCP support to SnapLogic with two key releases: August Release: MCP Client Support We'll be releasing two new snaps: the MCP Function Generator Snap and the MCP Invoke Snap. These will be available in the AgentCreator Experimental (Beta) Snap Pack. With these Snaps, your SnapLogic agent can access the services and resources available on the public MCP server. Late Q3 Release: MCP Server Support Our initial MCP server support will focus on tool operations, including the ability to list tools and call tools. For authentication, it will support custom header-based authentication. Users will be able to leverage the MCP Server functionality by subscribing to this feature. If you're eager to be among the first to test these new capabilities and provide feedback, please reach out to the Project Manager Team, at pm-team@snaplogic.com. We're looking forward to seeing what you build with SnapLogic MCP. SnapLogic MCP Client MCP Clients in SnapLogic enable users to connect to MCP servers as part of their Agent. An example can be connecting to the Firecrawl MCP server for a data scraping Agent, or other use cases that can leverage the created MCP servers. The MCP Client support in SnapLogic consists of two Snaps, the MCP Function Generator Snap and the MCP Invoke Snap. From a high-level perspective, the MCP Function generator Snap allows users to list available tools from an MCP server, and the MCP Invoke Snap allows users to perform operations such as call tools, list resources, and read resources from an MCP server. Let’s dive into the individual pieces. MCP SSE Account To connect to an MCP Server, we will need an account to specify the URI of the server to connect to. Properties URI The URI of the server to connect to. Don’t need to include the /sse path Additional headers Additional HTTP headers to be sent to the server Timeout The timeout value in seconds, if the result is not returned within the timeout, the Snap will return an error. MCP Function Generator Snap The MCP Function Generator Snap enables users to retrieve the list of tools as SnapLogic function definitions to be used in a Tool Calling Snap. Properties Account An MCP SSE account is required to connect to an MCP Server. Expose Tools List all available tools from an MCP server as SnapLogic function definitions Expose Resources Add list_resources, read_resource as SnapLogic function definitions to allow LLMs to use resources/read and resources/list (MCP Resources). Definitions for list resource and read resource [ { "sl_type": "function", "name": "list_resources", "description": "This function lists all available resources on the MCP server. Return a list of resources with their URIs.", "strict": false, "sl_tool_metadata": { "operation": "resources/list" } }, { "sl_type": "function", "name": "read_resource", "description": "This function returns the content of the resource from the MCP server given the URI of the resource.", "strict": false, "sl_tool_metadata": { "operation": "resources/read" }, "parameters": [ { "name": "uri", "type": "STRING", "description": "Unique identifier for the resource", "required": true } ] } ] MCP Invoke Snap The MCP Invoke Snap enables users to perform operations such as tools/call, resources/list, and resources/read to an MCP server. Properties Account An account is required to use the MCP Invoke Snap Operation The operation to perform on the MCP server. The operation must be one of tools/call, resources/list, or resources/read Tool Name The name of the tool to call. Only enabled and required when the operation is tools/call Parameters The parameters to be added to the operation. Only enabled for resources/read and tools/call. Required for resources/read, and optional for tools/call, based on the tool. MCP Agents in pipeline action MCP Agent Driver pipeline An MCP Agent Driver pipeline is like any other MCP Agent pipeline; we’ll need to provide the system prompt, user prompt, and run it with the PipeLoop Snap. MCP Agent Worker pipeline Here’s an example of an MCP Agent with a single MCP Server connection. The MCP Agent Worker is connected to one MCP Server. MCP Client Snaps can be used together with AgentCreator Snaps, such as the Multi-Pipeline Function Generator and Pipeline Execute Snap, as SnapLogic Functions, tools. This allows users to use tools provided by MCP servers and internal tools, without sacrificing safety and freedom when building an Agent. Agent Worker with MCP Client Snaps SnapLogic MCP Server In SnapLogic, an MCP Server allows you to expose SnapLogic pipelines as dynamic tools that can be discovered and invoked by language models or external systems. By registering an MCP Server, you effectively provide a API that language models and other clients can use to perform operations such as data retrieval, transformation, enrichment, or automation, all orchestrated through SnapLogic pipelines. For the initial phase, we'll support connections to the server via HTTP + SSE. Core Capabilities The MCP Server provides two core capabilities. The first is listing tools, which returns structured metadata that describes the available pipelines. This metadata includes the tool name, a description, the input schema in JSON Schema format, and any additional relevant information. This allows clients to dynamically discover which operations are available for invocation. The second capability is calling tools, where a specific pipeline is executed as a tool using structured input parameters, and the output is returned. Both of these operations—tool listing and tool calling—are exposed through standard JSON-RPC methods, specifically tools/list and tools/call, accessible over HTTP. Prerequisite You'll need to prepare your tool pipelines in advance. During the server creation process, these can be added and exposed as tools for external LLMs to use. MCP Server Pipeline Components A typical MCP server pipeline consists of four Snaps, each with a dedicated role: 1. Router What it does: Routes incoming JSON requests—which differ from direct JSON-RPC requests sent by an MCP client—to either the list tools branch or the call tool branch. How: Examines the request payload (typically the method field) to determine which action to perform. 2. Multi-Pipeline Function Generator (Listing Tools) What it does: Converts a list of pipeline references into tool metadata. This is where you define the pipelines you want the server to expose as tools. Output: For each pipeline, generates: Tool name Description Parameters (as JSON Schema) Other metadata Purpose: Allows clients (e.g., an LLM) to query what tools are available without prior knowledge. 3. Pipeline Execute (Calling Tools) What it does: Dynamically invokes the selected SnapLogic pipeline and returns structured outputs. How: Accepts parameters encoded in the request body, maps them to the pipeline’s expected inputs, and executes the pipeline. Purpose: Provides flexible runtime execution of tools based on user or model requests. 4. Union What it does: Merges the result streams from both branches (list and call) into a single output stream for consistent response formatting. Request Flows Below are example flows showing how requests are processed: 🟢 tools/list Client sends a JSON-RPC request with method = "tools/list". Router directs the request to the Multi-Pipeline Function Generator. Tool metadata is generated and returned in the response. Union Snap merges and outputs the content. ✅ Result: The client receives a JSON list describing all available tools. �� tools/call Client sends a JSON-RPC request with method = "tools/call" and the tool name + parameters. Router sends this to the Pipeline Execute Snap. The selected pipeline is invoked with the given parameters. Output is collected and merged via Union. ✅ Result: The client gets the execution result of the selected tool. Registering an MCP Server Once your MCP server pipeline is created: Create a Trigger Task and Register as an MCP Server Navigate to the Designer > Create Trigger Task Choose a Groundplex. (Note: This capability currently requires a Groundplex, not a Cloudplex.) Select your MCP pipeline. Click Register as MCP server Configure node and authentication. Find your MCP Server URL Navigate to the Manager > Tasks The Task Details page exposes a unique HTTP endpoint. This endpoint is treated as your MCP Server URL. After registration, clients such as AI models or orchestration engines can interact with the MCP Server by calling the /tools/list endpoint to discover the available tools, and the /tools/call endpoint to invoke a specific tool using a structured JSON payload. Connect to a SnapLogic MCP Server from a Client After the MCP server is successfully published, using the SnapLogic MCP server is no different from using other MCP servers running in SSE mode. It can be connected to by any MCP client that supports SSE mode; all you need is the MCP Server URL (and the Bearer Token if authentication is enabled during server registration). Configuration First, you need to add your MCP server in the settings of the MCP client. Taking Claude Desktop as an example, you'll need to modify your Claude Desktop configuration file. The configuration file is typically located at: macOS: ~/Library/Application Support/Claude/claude_desktop_config.json Windows: %APPDATA%\Claude\claude_desktop_config.json Add your remote MCP server configuration to the mcpServers section: { "mcpServers": { "SL_MCP_server": { "command": "npx", "args": [ "mcp-remote", "http://devhost9000.example.com:9000/mcp/6873ff343a91cab6b00014a5/sse", "--header", "Authorization: Bearer your_token_here" ] } } } Key Components Server Name: SL_MCP_server - A unique identifier for your MCP server Command: npx - Uses the Node.js package runner to execute the mcp-remote package URL: The SSE endpoint URL of your remote MCP server (note the /sse suffix) Authentication: Use the --header flag to include authorization tokens if the server enabled authentication Requirements Ensure you have Node.js installed on your system, as the configuration uses npx to run the mcp-remote package. Replace the example URL and authorization token with your actual server details before saving the configuration. After updating the configuration file, restart Claude Desktop for the changes to take effect. To conclude, the MCP Server in SnapLogic is a framework that allows you to expose pipelines as dynamic tools accessible through a single HTTP endpoint. This capability is designed for integration with language models and external systems that need to discover and invoke SnapLogic workflows at runtime. MCP Servers make it possible to build flexible, composable APIs that return structured results, supporting use cases such as conversational AI, automated data orchestration, and intelligent application workflows. Conclusion SnapLogic's integration of the MCP protocol marks a significant leap forward in empowering LLMs to dynamically discover and invoke SnapLogic pipelines as sophisticated tools, transforming how you build conversational AI, automate complex data orchestrations, and create truly intelligent applications. We're excited to see the innovative solutions you'll develop with these powerful new capabilities.OpenAI Responses API
7 MIN READ Introduction OpenAI announced the Responses API, their most advanced and versatile interface for building intelligent AI applications. Supporting both text and image inputs with rich text outputs, this API enables dynamic, stateful conversations that remember and build on previous interactions, making AI experiences more natural and context-aware. It also unlocks powerful capabilities through built-in tools such as web search, file search, code interpreter, and more, while enabling seamless integration with external systems via function calling. Its event-driven design delivers clear, structured updates at every step, making it easier than ever to create sophisticated, multi-step AI workflows. Key features include: Stateful conversations via the previous response ID Built-in tools like web search, file search, code interpreter, MCP, and others Access to advanced models available exclusively, such as o1-pro Enhanced support for reasoning models with reasoning summaries and efficient context management through previous response ID or encrypted reasoning items Clear, event-based outputs that simplify integration and control While the Chat Completions API remains fully supported and widely used, OpenAI plans to retire the Assistants API in the first half of 2026. To support the adoption of the Responses API, two new Snaps have been introduced: OpenAI Chat Completions ⇒ OpenAI Responses API Generation OpenAI Tool Calling ⇒ OpenAI Responses API Tool Calling Both Snaps are fully compatible with existing upstream and downstream utility Snaps, including the OpenAI Prompt Generator, OpenAI Multimodal Content Generator, all Function Generators (Multi-Pipeline, OpenAPI, and APIM), the Function Result Generator, and the Message Appender. This allows existing pipelines and familiar development patterns to be reused while gaining access to the advanced features of the Responses API. OpenAI Responses API Generation The OpenAI Responses API Generation Snap is designed to support OpenAI’s newest Responses API, enabling more structured, stateful, and tool-augmented interactions. While it builds upon the familiar interface of the Chat Completions Snap, several new properties and behavioral updates have been introduced to align with the Responses API’s capabilities. New properties Message: The input sent to the LLM. This field replaces the previous Use message payload, Message payload, and Prompt properties in the OpenAI Chat Completions Snap, consolidating them into a single input. It removes ambiguity between "prompt" as raw text and as a template, and supports both string and list formats. Previous response ID: The unique ID of the previous response to the model. Use this to create multi-turn conversations. Model parameters Reasoning summary: For reasoning models, provides a summary of the model’s reasoning process, aiding in debugging and understanding the model's reasoning process. The property can be none, auto, or detailed. Advanced prompt configurations Instructions: Applied only to the current response, making them useful for dynamically swapping instructions between turns. To persist instructions across turns when using previous_response_id, the developer message in the OpenAI Prompt Generator Snap should be used. Advanced response configurations Truncation: Defines how to handle input that exceeds the model’s context window. auto allows the model to truncate the middle of the conversation to fit, while disabled (default) causes the request to fail with a 400 error if the context limit is exceeded. Include reasoning encrypted content: Includes an encrypted version of reasoning tokens in the output, allowing reasoning items to persist when the store is disabled. Built-in tools Web search: Enables the model to access up-to-date information from the internet to answer queries beyond its training data. Web search type Search context size User location: an approximate user location including city, region, country, and timezone to deliver more relevant search results. File search: Allows the model to retrieve information from documents or files. Vector store IDs Maximum number of results Include search results: Determines whether raw search results are included in the response for transparency or debugging. Ranker Score threshold Filters: Additional metadata-based filters to refine search results. For more details on using filters, see Metadata Filtering. Advanced tool configuration Tool choice: A new option, SPECIFY A BUILT-IN TOOL, allows specifying that the model should use a built-in tool to generate a response. Note that the OpenAI Responses API Generation Snap does not support the response count or stop sequences properties, as these are not available in the Responses API. Additionally, the message user name, which may be specified in the Prompt Generator Snap, is not supported and will be ignored if included. Model response of Chat Completions vs Responses API Chat Completions API Responses API The Responses API introduces an event-driven output structure that significantly enhances how developers build and manage AI-powered applications compared to the traditional Chat Completions API. While the Chat Completions API returns a single, plain-text response within the choices array, the Responses API provides an output array containing a sequence of semantic event items—such as reasoning, message, function_call, web_search_call, and more—that clearly delineate each step in the model's reasoning and actions. This structured approach allows developers to easily track and interpret the model's behavior, facilitating more robust error handling and smoother integration with external tools. Moreover, the response from the Responses API includes the model parameters settings, providing additional context for developers. Pipeline examples Built-in tool: web search This example demonstrates how to use the built-in web search tool. In this pipeline, the user’s location is specified to ensure the web search targets relevant geographic results. System prompt: You are a friendly and helpful assistant. Please use your judge to decide whether to use the appropriate tools or not to answer questions from the user. Prompt: Can you recommend 2 good sushi restaurants near me? Output: As a result, the output contains both a web search call and a message. The model uses the web search to find and provide recommendations based on current data, tailored to the specified location. Built-in tool: File search This example demonstrates how the built-in file search tool enables the model to retrieve information from documents stored in a vector store during response generation. In this case, the file wildfire_stats.pdf has been uploaded. You can create and manage vector stores through the Vector Store management page. Prompt: What is the number of Federal wildfires in 2018 Output: The output array contains a file_search_call event, which includes search results in its results field. These results provide matched text, metadata, and relevance scores from the vector store. This is followed by a message event, where the model uses the retrieved information to generate a grounded response. The presence of detailed results in the file_search_call is enabled by selecting the Include file search results option. OpenAI Responses API Tool Calling The OpenAI Responses API Tool Calling Snap is designed to support function calling using OpenAI’s Responses API. It works similarly to the OpenAI Tool Calling Snap (which uses the Chat Completions API), but is adapted to the event-driven response structure of the Responses API and supports stateful interactions via the previous response ID. While it shares much of its configuration with the Responses API Generation Snap, it is purpose-built for workflows involving function calls. Existing LLM agent pipeline patterns and utility Snaps—such as the Function Generator and Function Result Generator—can continue to be used with this Snap, just as with the original OpenAI Tool Calling Snap. The primary difference lies in adapting the Snap configuration to accommodate the Responses API’s event-driven output, particularly the structured function_call event item in the output array. The Responses API Tool Calling Snap provides two output views, similar to the OpenAI Tool Calling Snap, with enhancements to simplify building agent pipelines and support stateful interactions using the previous response ID: Model response view: The complete API response, including extra fields: messages: an empty list if store is enabled, or the full message history—including messages payload and model response—if disabled (similar to the OpenAI Tool Calling Snap). When using stateful workflows, message history isn’t needed because the previous response ID is used to maintain context. has_tool_call: a boolean indicating whether the response includes a tool call. Since the Responses API no longer includes the finish_reason: "tool_calls" field, this new field makes it easier to create stop conditions in the pipeloop Snap within the agent driver pipeline. Tool call view: Displays the list of function calls made by the model during the interaction. Tool Call View of Chat Completions vs Responses API Uses id as the function call identifier when sending back the function result. Tool call properties (name, arguments) are nested inside the function field. Each tool call includes: • id: the unique event ID • call_id: used to reference the function call when returning the result The tool call structure is flat — name and arguments are top-level fields. Building LLM Agent Pipelines To build LLM agent pipelines with the OpenAI Responses API Tool Calling Snap, you can reuse the same agent pipeline pattern described in Introducing Tool Calling Snaps and LLM Agent Pipelines. Only minor configuration changes are needed to support the Responses API. Agent Driver Pipeline The primary change is in the PipeLoop Snap configuration, where the stop condition should now check the has_tool_call field, since the Responses API no longer includes the finish_reason:"tool_calls". Agent Worker Pipeline Fields mapping A Mapper Snap is used to prepare the related fields for the OpenAI Responses API Tool Calling Snap. OpenAI Responses API Tool Calling The key changes are in this Snap’s configuration to support the Responses API’s stateful interactions. There are two supported approaches: Option 1: Use Store (Recommended) Leverages the built-in state management of the Responses API. Enable Store Use Previous Response ID Send only the function call results as the input messages for the next round. (messages field in the Snap’s output will be an empty array, so you can still use it in the Message Appender Snap to collect tool results.) Option 2: Maintain Conversation History in Pipeline Similar to the approach used in the Chat Completions API. Disable Store Include the full message history in the input (messages field in the Snap’s output contains message history) (Optional) Enable Include Reasoning Encrypted Content (for reasoning models) to preserve reasoning context efficiently OpenAI Function Result Generator As explained in Tool Call View of Chat Completions vs Responses API section, the Responses API includes both an id and a call_id. You must use the call_id to construct the function call result when sending it back to the model. Conclusion The OpenAI Responses API makes AI workflows smarter and more adaptable, with stateful interactions and built-in tools. SnapLogic’s OpenAI Responses API Generation and Tool Calling Snaps bring these capabilities directly into your pipelines, letting you take advantage of advanced features like built-in tools and event-based outputs with only minimal adjustments. By integrating these Snaps, you can seamlessly enhance your workflows and fully unlock the potential of the Responses API.Simplify Your LLM Workflows: Integrating Vertex AI RAG with SnapLogic
8 MIN READ This document explores the integration of Google Cloud's Vertex AI Retrieval Augmented Generation (RAG) capabilities with SnapLogic. We will delve into how Vertex AI RAG functions, its benefits over traditional vector databases, and practical applications within the SnapLogic platform. The guide will cover setting up and utilizing Vertex AI RAG, automating knowledge feeds, and integrating with SnapLogic's Generate snaps for enhanced LLM performance. Vertex AI RAG Engine The Vertex AI RAG Engine streamlines the retrieval-augmented generation (RAG) process through two primary steps: Knowledge Management: The Vertex AI RAG Engine establishes and maintains a knowledge base by creating a corpus, which serves as an index for storing source files. Retrieval Query: Upon receiving a prompt, the Vertex AI RAG Engine efficiently searches this knowledge base to identify and retrieve information most relevant to the request. The Vertex AI RAG Engine integrates Google Cloud's Vertex AI with the RAG architecture to produce accurate and contextually relevant LLM responses. It covers tasks related to managing knowledge by creating a corpus as an index for source files. For processing, it efficiently retrieves relevant information from this knowledge base when a prompt is received, then leverages the LLM to generate a response based on the retrieved context. Difference between Vector Database While both traditional vector databases and the Vertex AI RAG Engine are designed to enhance LLM responses by providing external knowledge, they differ significantly in their approach and capabilities. Vector Databases Vector databases primarily focus on storing and querying vector embeddings. To use them with an LLM for RAG, you typically need to: Manually manage embedding generation: You are responsible for generating vector embeddings for your source data using an embedding model. Handle retrieval logic: You need to implement the logic for querying the vector database, retrieving relevant embeddings, and then mapping them back to the original source text. Integrate with LLM: The retrieved text then needs to be explicitly passed to the LLM as part of the prompt. No built-in LLM integration: They are agnostic to the LLM and require manual integration for RAG workflows. Vertex AI RAG Engine The Vertex AI RAG Engine offers a more integrated and streamlined solution, abstracting away much of the complexity. Key differences include: Integrated knowledge management: It handles the entire lifecycle of your knowledge base, from ingesting raw source files to indexing and managing the corpus. You don't need to manually generate embeddings or manage vector storage. Automated retrieval: The engine automatically performs the retrieval of relevant information from its corpus based on the user's prompt. Seamless LLM integration: It's designed to work directly with Vertex AI's LLMs, handling the contextualization of the prompt with retrieved information before passing it to the LLM. End-to-end solution: It provides a more comprehensive solution for RAG, simplifying the development and deployment of RAG-powered applications. In essence, a traditional vector database is a component that requires significant orchestration to implement RAG. In contrast, the Vertex AI RAG Engine is a more complete, managed service that simplifies the entire RAG workflow by providing integrated knowledge management, retrieval, and LLM integration. This fundamental benefit allows for a significant simplification of the often complex RAG processing pipeline. By streamlining this process, we can achieve greater efficiency, reduce potential points of failure, and ultimately deliver more accurate and relevant results when leveraging large language models (LLMs) for tasks that require external knowledge. This simplification not only improves performance but also enhances the overall manageability and scalability of RAG-based systems, making them more accessible and effective for a wider range of applications. Using Vertex AI's RAG Engine with Generative AI (instead of directly via the Gemini API) offers advantages. It enhances query-related information retrieval through built-in tools, streamlining data access for generative AI models. This native integration within Vertex AI optimizes information flow, reduces complexity, and leads to a more robust system for retrieval-augmented generation. Vertex AI RAG Engine in SnapLogic SnapLogic now includes a set of Snaps for utilizing the Vertex AI RAG Engine. Corpus Management The following Snaps are available for managing RAG corpora: Google Vertex AI RAG Create Corpus Google Vertex AI RAG List Corpus Google Vertex AI RAG Get Corpus Google Vertex AI RAG Delete Corpus File Management in Corpus The following Snaps enable file management within a RAG corpus: Google Vertex AI RAG Corpus Add File Google Vertex AI RAG Corpus List File Google Vertex AI RAG Corpus Get File Google Vertex AI RAG Corpus Remove File Retrieval For performing retrieval operations, use the following Snap: Google Vertex AI RAG Retrieval Query Try using Vertex AI RAG Let's walk through a practical example of how to leverage the Vertex AI RAG Engine within SnapLogic. This section will guide you through setting up a corpus, adding files, performing retrieval queries, and integrating the results into your LLM applications. Preparing step Before integration, two key steps are required: First, set up a Google Cloud project with enabled APIs, linked billing, and necessary permissions. List of required enabled Google API https://console.cloud.google.com/apis/library/cloudresourcemanager.googleapis.com https://console.cloud.google.com/apis/library/aiplatform.googleapis.com SnapLogic offers two primary methods for connecting to Google Cloud APIs: Service Account (recommended): SnapLogic can utilize an existing Service Account that possesses the necessary permissions. OAuth2: This method requires configuring OAuth2. Access Token: An Access Token is a temporary security credential to access Google Cloud APIs. It requires manual refreshing of the token when it expires. Create the corpus To build the corpus, use the Google Vertex AI RAG Create Corpus Snap. Place the Google Vertex AI RAG Create Corpus Snap. Create Google GenAI Service Account Upload the Service account JSON key file that you obtained from Google Cloud Platform, and then select the project and resource location you want to use. We recommend using the “us-central1” location. Edit the configuration by setting the display name and the Snap execution to "Validate & Execute." Validate the pipeline to obtain the corpus result in the output. If the result is similar to the image above, you now have the corpus ready to add the document. Upload the document To upload documents for Google Vertex AI RAG, integrate SnapLogic using a pipeline connecting the "Google Vertex AI RAG Corpus Add File" and "File Reader" Snaps. The "File Reader" accesses the document, passing its content to the "Google Vertex AI RAG Corpus Add File" Snap, which uploads it to a specified Vertex AI RAG corpus. Example Download the example document. Example file: Basics of SnapLogic.pdf Configure the File Reader Snap as follows: Configure the Corpus Add File Snap as follows: These steps will add the Basics of SnapLogic.pdf to the corpus in the previous section. If you run the pipeline successfully, the output will appear as follows. Retrieve query This section demonstrates how to use the Google Vertex AI RAG Retrieval Query Snap to fetch relevant information from the corpus. This snap takes a user query and returns the most pertinent documents or text snippets. Example From the existing corpus, we will query the question "What snap types does SnapLogic have?" and configure the snap accordingly. The result will display a list of text chunks related to the question, ordered by score value. The score value is calculated from the similarity or distance between the query and each text chunk. The similarity or distance depends on the vectorDB that you choose. By default, the score is the COSINE_DISTANCE. Generate the result Now that we have successfully retrieved relevant information from our corpus, the next crucial step is to leverage this retrieved context to generate a coherent and accurate response using an LLM. This section will demonstrate how to integrate the results from the Google Vertex AI RAG Retrieval Query Snap with a generative AI model, such as the Google Gemini Generate Snap, to produce a final answer based on the augmented information. Here's an example prompt to use in the prompt generator: The final answer will appear as follows: Additionally, the integration between Vertex AI RAG and SnapLogic provides the significant benefit of cross-model compatibility. This means that the established RAG workflows and data retrieval processes can be seamlessly adapted and utilized with different large language models beyond just Vertex AI, such as open-source models or other commercial LLMs. This flexibility allows organizations to leverage their investment in RAG infrastructure across a diverse ecosystem of AI models, enabling greater adaptability, future-proofing of applications, and the ability to choose the best-suited LLM for specific tasks without rebuilding the entire information retrieval pipeline. This cross-model benefit ensures that the RAG solution remains versatile and valuable, regardless of evolving LLM landscapes. Auto-retrieve query with the Vertex AI built-in tool Using the built-in tool in the Vertex AI Gemini Generate Snap for auto-retrieval significantly simplifies the RAG pipeline. Instead of manually performing a retrieval query and then passing the results to a separate generation step, this integrated approach allows the Gemini model to automatically consult the configured RAG corpus based on the input prompt. This reduces the number of steps and the complexity of the pipeline, as the retrieval and generation processes are seamlessly handled within a single Snap. It ensures that the LLM always has access to the most relevant contextual information from your knowledge base without requiring explicit orchestration, leading to more efficient and accurate content generation. The snap configuration example below demonstrates how to configure the Built-in tools section. Specifically, we select the vertexRagStore type and designate the target corpus. The final answer generated using the auto-retrieval process will be displayed below. The response includes grounding metadata for source tracking, allowing users to trace information origins. This feature enhances transparency, fact-verification, and builds trust in content accuracy and reliability. Users can delve into source material, cross-reference facts, and gain a complete understanding, boosting the system's utility and trustworthiness. Summary This document demonstrates how to integrate Google Cloud's Vertex AI Retrieval Augmented Generation (RAG) capabilities with SnapLogic to enhance LLM workflows. Key takeaways include: Streamlined RAG Process: Vertex AI RAG simplifies knowledge management and retrieval, abstracting away complexities like manual embedding generation and retrieval logic, which are typically required with traditional vector databases. Integrated Solution: Unlike standalone vector databases, Vertex AI RAG offers an end-to-end solution for RAG, handling everything from ingesting raw files to integrating with LLMs. SnapLogic Integration: SnapLogic provides dedicated Snaps for managing Vertex AI RAG corpora (creating, listing, getting, deleting), managing files within corpora (adding, listing, getting, removing), and performing retrieval queries. Practical Application: The guide provided a step-by-step example of setting up a corpus, uploading documents, performing retrieval queries using the Google Vertex AI RAG Retrieval Query Snap, and integrating the results with generative AI models like the Google Gemini Generate Snap for contextually accurate responses. Cross-Model Compatibility: A significant benefit of this integration is the ability to adapt established RAG workflows and data retrieval processes with various LLMs beyond just Vertex AI, including open-source and other commercial models, ensuring flexibility and future-proofing. Automated Retrieval with Built-in Tools: The integration allows for automated retrieval via built-in tools in the Vertex AI Gemini Generate Snap, simplifying the RAG pipeline by handling retrieval and generation seamlessly within a single step. By leveraging Vertex AI RAG with SnapLogic, organizations can simplify the development and deployment of RAG-powered applications, leading to more accurate, contextually relevant, and efficient LLM responses.Pagination and nextCursor in header
Hello all, I'm using a HTTP Client snap to retrieved a few thousands of records, and I need to use pagination. The system that I'm calling is using cursor based pagination. If the number of elements returned is higher than the limit defined, the response header will contain a "nextCursor" value that I need to use as parameter to the "cursor" key for the next call, and so on until no more "nextCursor". This should be working fine, however I can't seem to get the content of the response header for my next call. When I use Postman I can see that there is a header returned, and the value that I need is stored under the key "X-Pagination-Next-Cursor" and not "nextCursor" as I expected. How can I access the values of the header? In the Snap itself, in the Pagination section, there is a "Override headers" part that I tried to configure by mapping the "cursor" key with either $nextCursor, $headers.nextCursor or $headers.X-Pagination-Next-Cursor, but nothing works, I'm only getting the records from the first page, there is no failure and no pagination. Thanks in advance for any help! JFAgentic Builders Webinar Series - Integrated agentic workflows, built live, every week
Register Here>> The Agentic Builders webinar series is your step-by-step guide to designing powerful, AI-powered workflows that transform how work gets done. Across five live sessions, SnapLogic experts will show you how to connect your data, automate complex tasks, and empower teams to put AI to work across departments including: sales, finance, customer success, learning services, and revenue operations. What you’ll take away: See agentic workflows built live, integrating data sources and tools you already use. Learn how to automate high-value, high-effort tasks across your organization. Discover best practices for connecting CRM, support, LMS, and financial systems. Walk away with actionable steps to design your first (or next) agentic workflow. Starts August 28th and runs through September 25th. Explore the series!