Bridging Legacy OPC Classic Servers (DA, AE, HDA) to SnapLogic via OPC UA Wrapper
Despite significant advances in industrial automation, many critical devices still rely on legacy OPC Classic servers (DA, AE, HDA). Integrating these aging systems with modern platforms presents challenges such as protocol incompatibility and the absence of native OPC UA support. Meanwhile, modern integration and analytics platforms increasingly depend on OPC UA for secure, scalable connectivity. This post addresses these challenges by demonstrating how the OPC UA Wrapper can seamlessly bridge OPC Classic servers to SnapLogic. Through a practical use case—detecting missing reset anomalies in saw-toothed wave signals from an OPC Simulation DA Server—you’ll discover how to enable real-time monitoring and alerting without costly infrastructure upgrades.
Scalable Analytics Platform: A Data Engineering Journey - Explore SnapLogic's innovative Medallion Architecture approach for handling massive data, improving analytics with S3, Trino, and Amazon Neptune. Learn about cost reduction, scalability, data governance, and enhanced insights.
Industrial IoT – Turbine Lubrication Oil Level Monitoring & Alert Mechanism via OPC UA and SnapLogic
In the energy sector, turbine lubrication oil is mission-critical. A drop in oil level or pressure can silently escalate into major failures, unplanned shutdowns, and expensive maintenance windows. In this blog, we showcase a real-world implementation using SnapLogic and OPC UA, designed to:
🔧 Continuously monitor turbine lubrication oil levels
📥 Ingest real-time sensor data from industrial systems
📊 Store telemetry in data lakes for analytics and compliance
📣 Send real-time Slack alerts to engineers before failures strike
This IIoT-driven solution empowers energy providers to adopt predictive maintenance practices and reduce operational risk.
Industrial IoT – OPC UA Real-Time Motor Overheat Detection and Auto-Shutdown Using SnapLogic
Industrial motors are critical assets in manufacturing and process industries, where overheating can result in costly downtime or catastrophic failure. In this blog, we demonstrate how SnapLogic and OPC UA were used to build a real-time, event-driven pipeline that detects motor overheating, initiates an automated shutdown, logs events for auditing, and notifies the maintenance/engineering team.
APIM Function Generator - Integrate Your APIM Services as Tools for LLMs
As large language models (LLMs) continue to evolve, they are becoming powerful enablers of intelligent automation, capable of interpreting natural language and performing actions through external tool calling. In SnapLogic, this capability is supported through LLM agent pipelines, where function definitions—whether created manually, built from SnapLogic pipelines, or derived from OpenAPI specifications—can be transformed into callable tools that LLMs use to interact with external systems.
SnapLogic’s API Management (APIM) plays a key role in this process. APIM offers a modern and flexible way to design, secure, and publish APIs directly within the SnapLogic platform. With support for service versioning, policy configuration, and a DeveloperHub for API discovery, APIM simplifies the management and reuse of APIs across various projects and use cases. Many users already rely on APIM to expose critical business logic through well-structured, reusable APIs that support a wide range of integrations and workflows. However, connecting these APIs to LLM tool calling has traditionally involved significant manual effort, including:
Difficulty discovering APIM service versions and endpoints at the Snap level
Rewriting API descriptions and parameter definitions already defined in the APIM service
Configuring multiple Function Generator Snaps
Manually maintaining tool definitions to stay in sync with updates to the APIM service
The APIM Function Generator Snap addresses these challenges by automating the entire setup process. It connects directly to an APIM service version and generates function definitions in SnapLogic’s internal format from live metadata, ready for use with the Tool Calling Snap and related downstream Snaps. By removing the need for manual configuration and automatically reflecting any updates made to the APIM service, this Snap helps maintain consistency, reduce setup time, and simplify integration. It provides a fast and reliable way to turn an existing APIM service version into callable tools for LLMs, enabling more efficient automation.
APIM Function Generator Snap
Snap Properties
Project path: Select the project path that contains the target APIM service.
Service name: Choose the service name to use. Suggestions are provided based on the selected project path.
Version: Specify the version of the selected service. Suggestions are provided based on the selected service name.
Base URL: Specify the base URL used to call the API. Suggestions are provided based on the Plex selected in the APIM service version.
Preferred content type: Specify the desired content type for API calls.
Filter type: Determines how the APIs within the service version are selected.
Aggregate input: When enabled, all incoming documents are combined into a single list of function definitions.
Filter Type Options
The Snap supports three filter types for determining which APIs from the selected APIM service version are included in the function definitions:
Use all paths: This option includes every available API path and method from the selected service version. It is the simplest way to expose the entire API set without additional filtering.
Tags: This option allows you to specify one or more tags. Any API that includes at least one of the specified tags—whether at the endpoint, path, or method level—will be included in the generated function definitions. Tag-based filtering is useful for organizing APIs into meaningful groups.
For example, you can create separate tool sets for different agents, such as billing, ticketing, analytics, or read-only access. Paths and methods This option enables you to define specific paths and their associated HTTP methods. It offers precise control over which endpoints are exposed to the LLM agent. This filter type is especially valuable for enforcing permissions; for instance, by including only GET methods, you can restrict an agent to read-only operations. This flexible configuration allows you to tailor the function definitions to match your use case, ensuring the right APIs are available for your LLM agent workflows. Output of the Snap The Snap outputs a list of tool definitions. Each tool definition includes: sl_tool_metadata: Contains essential metadata used by the Tool Calling Snap and downstream Snaps to route and invoke the correct API. It enables dynamic resolution of the URL path based on parameters from the LLM output. The tool_type is set to APIM, and the metadata includes the service name, version name, endpoint name, tags(including endpoint tags, path tags, and method tags), and the full URL of the API. json_schema: defines the expected structure of input parameters for the API. This schema guides the LLM in formatting its output correctly and allows the Tool Calling Snap to resolve the complete URL path for calling the API. These fields are generated based on the live metadata from the APIM service version. Example Use Case: Product Recommendation To demonstrate how the APIM Function Generator Snap streamlines the integration of existing APIs into LLM workflows, consider a scenario based on a retail business use case. In this example, the SnapLogic APIM service named APIM Tool Demo, version v1, is used to expose APIs for a clothing store. This service includes three endpoints: /weather: A proxied external API used to retrieve real-time weather data. /inventory: A SnapLogic pipeline task exposed as an API to access product data from the store’s database. /users: Another SnapLogic task API that provides access to user-related information. These APIs might already be powering external applications or internal tools. With the APIM Function Generator Snap, they can now be easily transformed into callable tools for LLMs—eliminating the need for separate specifications or manual configuration. In this use case, the goal is to build an LLM agent capable of recommending clothing items based on the current weather. For example, the LLM may receive a prompt such as: “Please recommend at most 3 items from the women’s clothing category in the store based on the current weather in Paris and provide the reason.” To support this workflow, the /weather endpoint and some APIs in /inventory endpoint were tagged with recommendation in the APIM service. This tag is then used in the APIM Function Generator Snap to filter and include only the relevant APIs when generating tool definitions—making it easy to group related functionality. Agent Driver Pipeline System prompt: You are a friendly and helpful assistant that are equipped with some tools, please use your judge to decide whether to use the tools or not to answer questions from the user. User prompt: Please recommend at most 3 items from the women’s clothing category in the store based on the current weather in Paris and provide the reason. 
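To make the worker pipeline easier to follow, here is a rough sketch of what one generated tool definition for the /weather endpoint could look like. The tool_type, service_name, and endpoint_name keys are the ones the Router expression below relies on; the remaining field names and the parameter schema are illustrative assumptions rather than the Snap's exact output:

{
  "sl_tool_metadata": {
    "tool_type": "APIM",
    "service_name": "APIMToolsDemo",
    "version": "v1",
    "endpoint_name": "weather",
    "tags": ["recommendation"],
    "url": "https://<gateway-host>/APIMToolsDemo/v1/weather"
  },
  "json_schema": {
    "type": "object",
    "properties": {
      "city": { "type": "string", "description": "City to fetch the current weather for" }
    },
    "required": ["city"]
  }
}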
Agent Worker Pipeline
Function generation
The APIM Function Generator Snap provides the tool definitions to the Tool Calling Snap.
Snap settings:
Output:
Tool Calling
The Tool Calling Snap sends tool definitions to the LLM and outputs tool calls with the full URL and headers resolved using parameters returned by the LLM. This output is then passed to the Router Snap to route each call to the appropriate downstream Snap.
Router
The Router Snap is useful when different endpoints require specific handling, such as different authentication methods or custom configurations, or when using multiple services or types of Function Generator Snaps. It leverages fields from sl_tool_metadata, such as the endpoint name, to route each tool call to the appropriate path.
Example route expression:
$.get("sl_tool_metadata").tool_type == "APIM" && $.get("sl_tool_metadata").service_name == "APIMToolsDemo" && $.get("sl_tool_metadata").endpoint_name == "weather"
Calling API
The API is called using the HTTP Client Snap. The request method and URL are dynamically configured using sl_tool_metadata. The appropriate account should be selected according to the policies defined in the APIM service.
With this setup, the LLM can respond to queries by utilizing the APIM tools. Below is the message history, demonstrating that the tool can be called successfully.
Final Result
Example pipelines
Recommendation agent driver
Recommendation agent worker
Conclusion
The APIM Function Generator Snap simplifies the integration of APIM services as tools for LLMs by automating function generation and keeping definitions in sync with live API metadata. This streamlined approach eliminates manual setup, supports flexible filtering, and transforms existing APIs into callable tools, enabling dynamic and metadata-driven automation within LLM workflows.
A Guide to the Enhanced PipelineLoop
Introduction As integration demands grow increasingly latency-sensitive—particularly in pipelines leveraging large language model (LLM) Snaps—minimizing execution delays has become critical to maintaining performance at scale. In response, SnapLogic has released an enhanced version of the PipelineLoop Snap designed to address two key performance bottlenecks: startup overhead and document output latency. This article provides a detailed overview of the enhancements, explains their impact on real-world tool-calling pipelines, and offers practical guidance on how to configure and apply these features to unlock immediate performance improvements. What’s New in the Enhanced PipelineLoop? 1. Optimized Document-Output Time The PipelineLoop Snap has been enhanced with a shorter and more efficient polling interval, allowing it to detect completed child pipelines sooner and push output documents with less delay. While the mechanism remains polling-based rather than fully event-driven, the reduced wait time significantly lowers output latency—particularly in workloads with many iterations. 2. Pre-Spawned Pipeline In traditional PipelineLoop behavior, a new child pipeline is initialized for each incoming document. While this works well for lightweight tasks, it becomes inefficient for tool-calling workloads where initialization time can be significant. To address this, the enhanced PipelineLoop introduces the ability to maintain a pool of pre-spawned child pipelines that are initialized in advance and reused across iterations. The number of warm pipelines is controlled by the Pre-spawned pipelines property. As each child pipeline completes, a new one is automatically initialized in the background to keep the pool full, unless the total iteration count is fewer than the configured pool size. When the loop finishes, any idle pipelines are shut down gracefully. This feature is particularly useful in scenarios where child pipelines are large or involve time-consuming setup steps—such as opening multiple account connections or loading complex Snaps. By setting Pre-spawned pipelines to a value greater than one, the PipelineLoop can eliminate the cold-start delay for most documents, improving throughput under high-traffic conditions. The following steps provide a way to configure the pipeloop with pre-spawned pipeline. Pre-Spawned Pipeline Walkthrough This walkthrough demonstrates how the Pre-spawned Pipelines feature works in practice, using a simple parent-child pipeline configuration. 1. Parent Pipeline Setup The parent pipeline includes a Sequence Snap configured to emit one input document. It uses a PipelineLoop Snap with the following settings: Iteration limit: 10 Pre-spawned pipelines: 3 2. Child Pipeline Setup The child pipeline contains a Script Snap that introduces a 2-second delay before incrementing the incoming value by 1. This simulates a minimal processing task with a noticeable execution time, ideal for observing the impact of pre-spawning. Here is the script. This script processes each input document by adding one to the "value" field after a simulated 2-second delay. The purpose of the delay is not to simulate real processing latency, but rather to make the pre-spawned pipeline activity clearly observable in the SnapLogic Dashboard or runtime logs. 3. Pipeline Execution When the parent pipeline is executed, the system immediately initializes three child pipeline instances in advance—these are the pre-spawned workers. 
As each finishes processing a document, the PipelineLoop reuses or replenishes workers as needed, maintaining the pool up to the configured limit. 4. Controlled Execution Count Despite having pre-spawned workers, the PipelineLoop respects the iteration limit of 10, ensuring that no more than 10 child executions occur in total. Once all 10 iterations complete, the loop shuts down all idle child pipelines gracefully. This setup highlights the benefit of pre-initialized pipelines in reducing execution latency, particularly for scenarios where child pipeline startup time contributes significantly to overall performance. 3. Parallel Execution Support The enhanced PipelineLoop Snap introduces parallel execution, allowing multiple input documents to be processed simultaneously across separate child pipeline instances. This capability is especially beneficial when dealing with high-throughput workloads, where processing documents one at a time would create unnecessary bottlenecks. By configuring the Parallel executions property, users can define how many input documents should be handled concurrently. For example, setting the value to 3 enables the loop to initiate and manage up to three loop executions at once, significantly improving overall pipeline throughput. Importantly, this parallelism is implemented without compromising result consistency. The PipelineLoop maintains output order alignment, ensuring that results are delivered in the exact sequence that input documents were received—regardless of the order in which child pipelines complete their tasks. This makes the feature safe to use even in downstream flows that rely on ordered data. Parallel execution is designed to maximize resource utilization and minimize processing delays, providing a scalable solution for data-intensive and latency-sensitive integration scenarios. Parallel Execution Walkthrough This walkthrough illustrates how the Parallel Execution capability in the enhanced PipelineLoop Snap improves performance while preserving input order. A basic parent-child pipeline setup is used to demonstrate the behavior. 1. Parent Pipeline Configuration The parent pipeline begins with a Sequence Snap that generates three input documents. A PipelineLoop Snap is configured with the following parameters: Iteration limit: 3 Parallel executions: 3 This setup allows the PipelineLoop to process all three input documents concurrently. 2. Child Pipeline Configuration The child pipeline consists of a Script Snap that introduces a 2-second delay before appending the character "A" to the "value" field of each document. Below is the script used in the Script Snap: The delay is intentionally added to visualize the effect of parallel processing, making the performance gains more noticeable during monitoring or testing. 3. Execution Behavior Upon execution, the PipelineLoop initiates three child pipeline instances in parallel—one for each input document. Although each child pipeline includes a 2-second processing delay, the overall execution completed in approximately 9 seconds which is still significantly faster than the optimal serial execution time of 18 seconds. While the theoretical runtime under ideal parallel conditions would be around 6 seconds, real-world factors such as Snap initialization time and API latency can introduce minor overhead. Despite this, the result demonstrates effective concurrency and highlights the performance benefits of parallel execution in practical integration scenarios. 
Most importantly, even with concurrent processing, the output order remains consistent with the original input sequence. This confirms that the PipelineLoop’s internal queuing mechanism correctly aligns results, ensuring reliable downstream processing.
Basics of SnapLogic
Introduction SnapLogic is a cloud-based integration Platform-as-a-Service (iPaaS) that provides tools for connecting various applications, data sources, and APIs. It enables businesses to automate and streamline their data integration processes by offering pre-built connectors and a visual interface for designing integration workflows. The SnapLogic platform uses a SnapLogic pipeline, a series of connected "Snaps" (pre-built components) that define the flow and transformation of data between various systems and applications. In a SnapLogic pipeline, data flows from one Snap to another, with each Snap performing a specific function, such as data extraction, transformation, or loading (ETL). SnapLogic Designer The SnapLogic Designer is the user interface that enables you to develop pipelines. You can see the example page below. But in SnapLogic with a feature called “Asset Palette,” you may see the different styles of Side Panel view. But the features are the same as those of the side panel view. The designer page consists of three main parts: Canvas - The field for visualizing and editing the pipeline Side Panel / Asset Palette - The panel contains the menu list. (The left picture is the Side Panel view. The right picture is Asset Palette enabled) Snaps Catalog - lists all available Snaps. https://docs-snaplogic.atlassian.net/wiki/x/ePIV Pipelines Catalog - list all pipelines that you can access. https://docs-snaplogic.atlassian.net/wiki/x/w-IV Patterns Catalog - list all the patterns that you can access. https://docs-snaplogic.atlassian.net/wiki/spaces/SD/pages/3022160260/Patterns+Catalog Toolbar - The list of tools for the pipeline Icons Description Execute Pipeline Execute the pipeline. Validate Pipeline Validate the pipeline. Any unsaved changes will be saved before validation. Clicking the button while a validation is in process cancels that validation. Shift-clicking the button will clear the cache before validating. Edit Pipeline Properties You specify properties when creating a pipeline. Click this button to modify the properties. Check Pipeline Statistics As a pipeline executes, the statistics are updated periodically so that you can monitor its progress. Create Task Create a Task for the current pipeline. Save Pipeline Save the current pipeline. Export Pipeline Export the current pipeline. Copy Pipeline Copy the pipeline from one project to another. Move Pipeline Move the pipeline from one project to another. Delete Pipeline Delete the current pipeline Pipeline Versions Create versions of the pipeline. Compare Pipeline Compare the current pipeline with the target pipeline. Notes Add a note or delete an existing note. Notes are saved with the pipeline. Print Print the pipeline. Snaps Snaps are the building blocks of a pipeline. Each Snap performs a single function, such as reading, parsing, transforming, or writing data. You can view the Snaps available to you (or your account) in the Snaps Catalog on the left-hand side of the SnapLogic Designer. You can drag a Snap from the Snap Catalog onto the Canvas to use it in a pipeline. Snaps Type SnapLogic includes the following basic types of Snaps with distinct icons. Icon Snap Type Description Read Specifies data sources in the pipeline. Examples: File Reader, CSV Generator, Birst Query Parse Takes the input of unstructured data and generates an output of structured data. Examples: XML Parser, Sequence Parser, JSON Parser Transform Modifies data significantly. 
Examples: Mapper, Aggregate, Join Flow Changes the output or direction of data in a pipeline. Examples: Router, Gate, Union Format Changes the data format. Examples: CSV Formatter, JSON Formatter, Excel Formatter Write Specifies data destinations in a pipeline. Examples: File Writer, REST Post, Email Delete Connecting Snaps The key to creating a Pipeline in SnapLogic is connecting Snaps. There are a few things to consider when placing Snaps in a Pipeline. Connection Shapes Like puzzle pieces, only Snaps with matching connection pairs (circles or diamonds) can be connected between the input and output of two snaps. When you drag a snap and place it next to or in front of another snap, the snap will automatically connect both snaps, and the connection will change color, which means it connects successfully. If the color doesn’t change, you need to recheck that both connection shapes are the same and re-connect it again. Disconnect Linked Snaps Unlinked Snaps can be moved apart or placed next to each other. Make sure the circle or diamond connector is colored Blue, which indicates that the Snaps are linked. To disconnect linked Snaps, click on the Blue connector. This clears the color and allows you to rearrange the Snaps. Remote-Connect Link Snaps You can connect to Snaps, but not next to each other, using a remote-connect link. For example, click and hold on the Mapper Snap connector until it turns Yellow, then drag it to the Copy Snap connector. When both connections turn Blue, release the mouse button. A number is placed in both connectors to let you know they are connected. Note: The number is only temporary until the Pipeline is saved. At this point, a new, permanent number may be assigned. You can also click and hold on one connection, and both Snaps connected by this link will darken. This feature is helpful for large pipelines where it may take much work to visualize the connections quickly. Data model SnapLogic will pass the data between Snaps with two models: Document data The document data models will be represented by a circle shape. This data type uses the JSON format as a container of the data. The support data type in this model is similar to the JSON standard in that it includes string, boolean, number, array, object, and null. Binary data The document data models will be represented by a diamond shape. This data type will wrap the binary data in SnapLogic’s model. Mostly, this will be inputted to the file writer and parser and outputted from the file reader and formatter. Configuration Snaps You have two options to open the configuration dialog. First, left-click on the Snap that you want to configure. The dialog will show up immediately. The second way is right-clicking at the Snap, and the menu displays options available in all Snaps through a dropdown list will be shown. Then click “Edit” in the menu. Each Snap will have different configurations. You can learn more about the configuration of each snap by clicking the question mark icon on the top right of the dialog. Expression The SnapLogic expression language is a utility that is available to Snaps. You can use expressions (JavaScript syntax) to access functions and properties to set field values dynamically. You can also use the expression language to manipulate data. Example $text == "NFL" ? "foo" : "bar" $counter > 1 ? ($counter < 3 ? 50 : 100) : -1 Expressions are available across multiple Snaps. 
If the Snap exposes the functionality of the expression for a property, then the icon appears in front of the property's text box. You can toggle on or off by clicking on the icon. When the toggle is on, the down arrow within the field will appear. You can click to see the list of functions and properties available. Operations List of supported and unsupported operations available on (document https://docs-snaplogic.atlassian.net/wiki/spaces/SD/pages/1438042/Understand+Expressions+in+the+SnapLogic+Platform) Accessing Pipeline Parameters Parameters allow a pipeline to be reused in multiple situations. For example, a File Writer Snap can be configured to write to a file path specified by a parameter, which allows the same pipeline to write to different files. The parameters for a pipeline can be defined by using the Edit Pipeline properties dialog. The name of each parameter must only contain alpha-numeric characters, and the value will be converted to a string. The value for a parameter defined in the pipeline properties dialog is treated as the default when running the pipeline in Designer. Parameters can also be passed to the Pipeline Execute Snap. Any parameters not passed down from the Task or Snap will use the defaults specified in the properties dialog. To access a pipeline parameter from the expression language, you must prefix the parameter name with an underscore. For example, given the following parameters: Key Value firstName Bob numValue 12 path $.age The "firstName" parameter can then be accessed using _firstName, as in: "Hello, " + _firstName // result: Hello, Bob Since the value of a parameter is always a string, you'll need to convert any string to numeric values before operating on them. For example, simply adding two to the "numValue" parameter will append the character "2" to "12" and yield "122": _numValue + 2 // result: "122" Instead, you need to use the parseInt/parseFloat functions to parse the string into a value and then add two to it: parseInt(_numValue) + 2 // result: 14 You need to parameterize your pipeline with an expression. You can use the eval() function to evaluate an expression stored in a string. For example, to read the document field specified by the "path" parameter, you can use: eval(_path) // result: <the value of the "age" field in the current document> Accessing Input View Variables as Part of Expressions An input view schema attribute can be used as part of the expression using the dollar sign ($) prefix. Example The REST Put Snap provides a URL. The URL can be toggled into an expression, and the expressions could be created by dynamically substituting the variables from an input view, such as: 'http://someplace:someport/somepart/' + $inputvar + '/somemoreparts' Accessing secret value from the secrets manager Any expression-enabled authentication field in a Snap or Account can be used with Secrets Management. You can enter an expression that retrieves a secret stored in your secrets manager, such as an access token, a username, or a password. To use the values from the secrets manager, you must first create secrets myaccesskey and mysecretkey in the Secrets Manager vault. Then, create or modify the Account and enter an expression in the required fields. Learn more: Configure Accounts to use secrets. Account An account represents an object that encompasses details to connect to an endpoint. Accounts play a crucial role in integrating applications. 
Any Snap that communicates with an external endpoint needs an authenticated account to access the resources on the endpoint. For example, a MySQL Snap requires authenticated access to a MySQL database. In SnapLogic, you create an Account to store credentials and any other information necessary to connect, such as a URL, hostname, and port number. You can create an account from Designer or Manager. In Designer, when working on pipelines, every Snap needing an account prompts you to create a new account or use an existing one. To use an existing account, you can click the dropdown icon to show all the available accounts for the snaps. To create a new account, click the “Add Account” button below the property field and follow the steps. The account will be created in your selected location on the first step. You can manage the created account on the Manager page in that location. Note: You can learn more about account type and each property by clicking the icon question mark in the top right corner. Validation & Execute Pipeline Sometimes, we want to test the pipeline by dry-running it without running the write snaps. You can use the validate function on the toolbar menu. The difference between validate and execute is before each snap runs. It will check the property called “Snap execution.” There are three opinions on how to trigger the snaps. Validate & Execute - this option makes the snaps run on both the validation and execution steps. Execute only - this option makes the snaps run on only the execution step. The snap writer type uses this as a default value. Disabled - this option prevents the snaps from running. Note: By default, the validation will be triggered every time we change the configuration of the snaps in the pipeline. Preview Data After executing or validating the pipeline, they will have a preview icon in the connection joint. The preview dialog will appear when you click on it, showing the snaps' output data. For example, when we click the preview icon, the pipeline above will show output data from JSON Generator snaps. The preview dialog has three types: JSON, Table, and Raw. You can select the dropdown on Preview Type and choose the type you like. JSON Table Raw Create First Pipeline This section will show how you start creating the pipeline, from the requirement to checking the result and running the final pipeline. For the example scenario, we want to calculate the employees list to check who needs to be assigned marketing training. The list of our employees looks like the data below. [ { "Name": "Albert Maro", "Location": "Field", "Extension": 4357, "Email": "amaro@company.com", "Title": "Director, Eastern US", "Department": "Sales", "Dept ID": 1100 }, { "Name": "Anthony Dunn", "Location": "HQ", "Extension": 4387, "Email": "adunn@company.com", "Title": "Social Media Director", "Department": "Marketing", "Dept ID": 1200 }, { "Name": "Rich Harris", "Location": "CO", "Extension": 4368, "Email": "rharris@company.com", "Title": "Principal Developer", "Department": "Engineering", "Dept ID": 1300 } // more data ] The constraint of needing training is an employee in the marketing department working at “HQ.” We want the list of employees with Firstname, Lastname, Email, Title, and Training fields. The result should look like below. 
[ { "Firstname": "Albert", "Lastname": "Maro", "Email": "amaro@company.com", "Title": "Director, Eastern US", "Training": false }, { "Firstname": "Anthony", "Lastname": "Dunn", "Email": "adunn@company.com", "Title": "Social Media Director", "Training": true }, { "Firstname": "Rich", "Lastname": "Harris", "Email": "rharris@company.com", "Title": "Principal Developer", "Training": false } // more data ] Steps 1. Open the Designer page. 2. Click to create a new pipeline. 3. Change the label to “Employees training” and click save. 4. At this step, we already have a new empty pipeline. Then, find the “JSON Generator” snap from the side panel and drag it to the canvas screen. This snap generates a JSON document for the next snap in the pipeline. We will set it as an input source. 5. Click at the JSON Generator snap to open the configuration dialog and click “Edit JSON.” Then, replace all JSON with the value below. [ { "Name": "Albert Maro", "Location": "Field", "Extension": 4357, "Email": "amaro@company.com", "Title": "Director, Eastern US", "Department": "Sales", "Dept ID": 1100 }, { "Name": "Anthony Dunn", "Location": "HQ", "Extension": 4387, "Email": "adunn@company.com", "Title": "Social Media Director", "Department": "Marketing", "Dept ID": 1200 }, { "Name": "Rich Harris", "Location": "CO", "Extension": 4368, "Email": "rharris@company.com", "Title": "Principal Developer", "Department": "Engineering", "Dept ID": 1300 } // more data ] Click “Ok” and save button ( ) before close the dialog. 6. Wait for the validation to finish. If it doesn’t run validation, click the validation button to manually validate the pipeline. 7. Find the “Mapper” snap and drag it to after the JSON generator. The Mapper snap transforms incoming data with the specific mappings and produces new output data. 8. Click on the Mapper snap to open the configuration dialog. We focus on the five blocks at the bottom of the dialog. Input Schema - shows the schema of input data Mapping table - is the configuration to map from input data to new output data Target Schema -shows the schema of output data. But this snap hasn’t been validated yet, so it shows nothing. Input Preview - shows the current input data Output Preview - shows the current output data Next, set the mapping table with the information below. To add multiple mapping, click in the top right corner. Expression Target path $Name.split(' ')[0] $Firstname $Name.split(' ')[1] $Lastname $Email $Email $Title $Title $Location == "HQ" && $Department == "Marketing" $Trainging The finish configuration will look like this. Click save and close the dialog. 9. Click the preview button after the Mapper snap. The output should be like this. SnapGPT SnapGPT is an interactive tool inside SnapLogic Designer. It uses the power of LLMs to democratize integration by helping users create and manage integrations using natural language prompts. The SnapGPT can do six main functions in SnapLogic. Generate pipelines Describe pipelines Analyze pipelines Ask anything about the SnapLogic Intelligent Integration Platform (IIP) Generate SnapLogic expressions Create SQL queries Usage SnapGPT You can open the SnapGPT panel by clicking on the SnapGPT logo in the header bar. Then, the panel will be displayed with a welcome message. Next, we will show how to use each feature of SnapGPT on the SnapLogic platform. 
Generate pipelines
Prompt SnapGPT directly.
Example prompts:
Extract opportunity object records from Salesforce and add them to Snowflake
Create a Pipeline using Salesforce Read to fetch my Opportunities, Filter out any opportunities outside of the last fiscal quarter, then write them to Snowflake.
Extract opportunity object records from Salesforce closed before "2022-10-01" and add them to Snowflake.
Create a pipeline that fetches my SnapLogic Activity Logs from the SnapLogic API.
Describe pipelines
Open the pipeline you want to describe, then go to the SnapGPT panel and ask, "Describe the pipeline."
Example prompts:
Describe the pipeline
Analyze pipelines
Open the pipeline you want to analyze, then go to the SnapGPT panel and ask, "Analyze the pipeline."
Example prompts:
Analyze the pipeline
Result: SnapGPT identifies issues with the pipeline, makes suggestions for improvement, and offers suggestions for the Snaps in the pipeline.
Ask anything about the SnapLogic Intelligent Integration Platform (IIP)
Example prompts:
How do I build a pipeline?
When and how should I use the Salesforce SOQL snap?
How can one pipeline call another pipeline? Can pipelines use recursion?
How is an Ultra pipeline different from a regular pipeline?
Generate SnapLogic expressions
To begin, simply open a Snap and select the icon. This activates the expression generation feature, which lets SnapGPT assist you in creating expressions. SnapGPT can create expressions for you either in the chat or inside the expression-enabled field itself: type the prompt and then click the SnapGPT icon.
Example prompts:
Generate an expression to filter my closed lost opportunities.
Generate an expression to grab the current date and time.
Create SQL queries
Open a Snap that supports SQL or SOQL queries and open SnapGPT. For example, if you open the Salesforce SOQL Snap, the suggestion Create SQL query appears above the SnapGPT prompt. SnapGPT generates the query and displays it in the SQL Preview panel. You can review the generated SQL before applying it to the Snap.
Example prompt:
Generate a SQL query to get the total amount of opportunities closed within the last quarter grouped by the account's country and deal status.
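For a prompt like the one above, the query SnapGPT produces depends on your schema and the Snap you are using; against a generic SQL warehouse table it might look roughly like this (the table and column names here are assumed purely for illustration):

SELECT
  account_country,
  deal_status,
  SUM(amount) AS total_amount
FROM opportunities
WHERE close_date >= DATE_TRUNC('quarter', CURRENT_DATE) - INTERVAL '3 months'
  AND close_date <  DATE_TRUNC('quarter', CURRENT_DATE)
GROUP BY account_country, deal_status;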
Revolutionizing Software Testing: How LLMs are Powering Automated Test Case and Data Generation
Tired of writing endless test cases and crafting complex test data manually? Discover how Large Language Models (LLMs) are transforming the QA landscape by automating test case and test data generation with remarkable accuracy and speed. In this article, we explore how LLMs—when paired with tools like SnapLogic Agent Creator—can accelerate testing cycles, boost coverage, and reduce QA efforts by up to 90%. Step into the future of intelligent, AI-driven software testing.
Unlocking the Power of LLMs with OpenAPI Tool Integration
Large Language Models (LLMs) are revolutionizing the way we interact with digital systems, from conversational agents to intelligent automation. But to truly harness their capabilities, especially in enterprise and developer ecosystems, it’s essential to bridge the gap between LLMs and external systems through tools—specifically APIs. This is where OpenAPI plays a pivotal role. What is OpenAPI? OpenAPI (formerly Swagger) is an open-source specification that defines a standard, machine-readable format for describing RESTful APIs. It enables developers and automated systems to understand an API’s structure—including endpoints, request parameters, authentication methods, and response types—without relying on traditional documentation or access to source code. Its adoption spans industries such as technology, finance, and healthcare, thanks to its interoperability with a wide array of tools and frameworks. Why OpenAPI Matters for LLMs Integrating OpenAPI with LLMs enhances their ability to interact with real-world systems. Here's how: Universal Interface: OpenAPI acts as a universal bridge to RESTful APIs, making it possible for LLMs to interact with services ranging from cloud infrastructure to productivity apps. Standardized Format: The standardized schema helps LLMs accurately interpret API functionality—including expected inputs and outputs—without ambiguity. Accelerated Tool Creation: Developers can efficiently build LLM-compatible tools by parsing OpenAPI definitions directly. Seamless Integration: With broad support from API tooling ecosystems, OpenAPI enables quick embedding of LLM agents into existing workflows. Supports Tool Calling: Tool calling allows LLMs to autonomously select and invoke relevant APIs based on user prompts—a key feature unlocked by structured OpenAPI descriptions. Enabling LLM Tool Calling with SnapLogic To connect LLMs with OpenAPI-defined tools, the OpenAPI Function Generator Snap plays a crucial role. This component converts any OpenAPI spec into a tool object that LLMs can use through the Tool Calling pipeline in SnapLogic. Input Options for the Generator Snap The generator supports multiple input methods: URL: Directly fetch the OpenAPI spec from a provided URL. Text Editor: Paste the raw spec into a built-in editor. Input Document: Pass the OpenAPI string as part of an input document via expression. File Upload: Select a spec file stored in the SLDB. Output Structure The generated tool output includes: sl_tool_metadata: Metadata such as security parameters, headers, and base URLs. json_schema: A schema of the input parameters. These tools can be passed into the Tool Calling Snap, which then resolves runtime variables like headers and endpoint URLs dynamically. Developers can chain this with an HTTP Client Snap to perform real API calls based on LLM outputs. Passing Through the Tool Calling Snap When the tool is passed through the Tool Calling Snap, it dynamically processes and resolves several key components using the metadata and user input: Resolved URL: The base URL and path parameters from the OpenAPI spec are combined with user-supplied values to generate the final API endpoint. Headers: Custom headers, or content-type headers are filled in based on the OpenAPI security definitions or context provided by the LLM. This resolved output makes it simple for downstream snaps (like HTTP Client) to directly execute the API call. 
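Concretely, the document emitted by the Tool Calling Snap for one tool call carries the resolved request details alongside the arguments supplied by the LLM. The exact key names can vary, so the following is only a sketch of the shape, using the FakeStore /products call from the use case in the next section:

{
  "sl_tool_metadata": {
    "url": "https://fakestoreapi.com/products",
    "method": "GET",
    "headers": { "Content-Type": "application/json" }
  },
  "parameters": {}
}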
Action tools with HTTP Client Snap
Once the Tool Calling Snap generates the resolved tool data, this output can be piped directly into an HTTP Client Snap for execution:
This setup effectively turns a static OpenAPI definition into a fully dynamic and executable workflow, allowing LLMs to autonomously interact with real services.
Real-World Use Cases
With the right configuration, LLMs can interact with virtually any OpenAPI-compliant service. This opens up a wide range of practical applications across productivity tools, developer APIs, data services, and more.
Example Use Case: Load Products from FakeStore API and Save as CSV in GitHub Gist
This example shows how an LLM can orchestrate a two-step integration using OpenAPI specs and tool calling via SnapLogic:
Fetch Data: Retrieve product data from the FakeStore API.
Transform & Upload: Format the data as CSV and post it as a public GitHub Gist using GitHub's Gist API.
Main Pipeline (download)
Loop Pipeline (download, github openapi file, fake store openapi file)
Prompt to LLM: "Load all products from FakeStore API and upload them as a CSV file to GitHub Gist."
Pipeline Flow Breakdown
Step 1: FakeStore API Tool Call
OpenAPI Tool: FakeStore API spec (loaded via URL or file).
LLM Task: Recognize the available /products endpoint and trigger a GET request to retrieve the full list of products.
Tool Calling Snap Output: Resolved URL to https://fakestoreapi.com/products, method GET, no authentication needed.
Step 2: GitHub Gist API Tool Call
OpenAPI Tool: GitHub Gist API spec, with token-based authentication defined in sl_tool_metadata.
LLM Task: Use the POST /gists endpoint and construct the request body with:
description: e.g., "FakeStore Products Export"
public: true
files: A JSON object with one file (e.g., "products.csv": { content: "<csv data>" })
Step 3: Summarize the Result
LLM Task: Extract and present key details from the final Gist API response, such as:
Total number of products exported
Link to the created Gist (e.g., html_url)
Confirmation message for the user
Final Result
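For reference, the Gist-creation request body assembled in step 2 follows the structure listed above; with a couple of illustrative product rows (the actual CSV content depends on what the FakeStore API returns), it would look roughly like this:

{
  "description": "FakeStore Products Export",
  "public": true,
  "files": {
    "products.csv": {
      "content": "id,title,price,category\n1,Fjallraven Backpack,109.95,men's clothing\n2,Mens Casual Premium Slim Fit T-Shirts,22.3,men's clothing\n"
    }
  }
}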
SnapLogic deployment on Kubernetes - A reference guide
Overview
SnapLogic supports the deployment of Groundplexes on Kubernetes platforms, thus enabling the application to leverage the various capabilities of Kubernetes. This document explains a few best practice recommendations for the deployment of SnapLogic on Kubernetes along with a sample deployment example using GKE. The examples in this document are specific to the GKE platform; however, the concepts can be applied to other Kubernetes platforms such as AWS and Azure.
Author: Ram Bysani, SnapLogic Enterprise Architecture team
Helm Chart
A Helm chart is used to define the various deployment configurations for an application on Kubernetes. Additional information about Helm charts can be found here. The Helm chart package for a SnapLogic deployment can be downloaded from the Downloads section. It contains the following files:
Artifact Comments
values.yaml This file defines the default configuration for the SnapLogic Snaplex deployment. It includes variables like the number of JCC nodes, container image details, resource limits, and settings for Horizontal Pod Autoscaling (HPA). Reference: values.yaml
Chart.yaml This file defines the metadata and version information for the Helm chart.
templates folder This directory contains the Kubernetes manifest templates which define the resources to be deployed into the cluster. These templates are YAML files that specify Kubernetes resources with templating capabilities that allow for parameterization, flexibility, and reuse.
templates/deployment.yaml This file defines a Kubernetes Deployment resource for managing the deployment of JCC instances in a cluster. The deployment is created only if the value of jccCount is greater than 0, as specified in the Helm chart's values.yaml file.
templates/deployment-feed.yaml This file defines a Kubernetes Deployment resource for managing the deployment of Feedmaster instances. The deployment is conditionally created if the feedmasterCount value in the Helm chart's values.yaml file is greater than 0.
templates/hpa.yaml The hpa.yaml file defines a Horizontal Pod Autoscaler (HPA) resource for a Kubernetes application. The HPA automatically scales the number of pod replicas in a deployment or replica set based on observed metrics such as CPU utilization or custom metrics.
templates/service.yaml The service.yaml file describes a Kubernetes service that exposes the JCC component of your Snaplex. It creates a LoadBalancer type service, which allows external access to the JCC components through a public IP address. The service targets only pods labeled as 'jcc' within the specified Snaplex and Helm release, ensuring proper communication and management.
templates/service-feed.yaml The service-feed.yaml file describes a Kubernetes service that exposes the Feedmaster components. The service is only created if the value of feedmasterCount in the Helm chart's values.yaml file is > 0. It creates a LoadBalancer type service, which allows external access to the Feedmaster components through a public IP address.
templates/service-headless.yaml The service-headless.yaml file describes a Kubernetes service for IPv6 communication. The service is only created if the value of enableIPv6 in the Helm chart's values.yaml file is set to true.
Table 1.0 Helm Chart configurations
Desired State vs Current State
The configurations in the various yaml files (e.g. Deployment, HPA, values, etc.) represent the "Desired" state of a Kubernetes deployment.
The Kubernetes controllers constantly monitor the Current state of the deployment to bring it in alignment with the Desired state. Horizontal Pod Autoscaling (HPA) Horizontal Pod Autoscaling (HPA) is a feature in Kubernetes that automatically adjusts the number of replicas (pods) for your deployments based on resource metrics like CPU utilization and memory usage. SnapLogic supports HPA for deployments in a Kubernetes environment. The add-on Metrics server must be installed. Reference: Metrics-Server. Metrics collection is enabled by default in GKE as part of Cloud Monitoring. Note that Custom Metrics and External Metrics, and Vertical Pod Autoscaling (VPA) are not supported for SnapLogic deployments on Kubernetes. Groundplex deployment in a GKE environment - Example In this section, we will go over the various steps for a SnapLogic Groundplex deployment in a GKE environment. Groundplex creation Create a new Groundplex from the Admin Manager interface. Reference: Snaplex_creation. The nodes for this Snaplex will be updated when the application is deployed to the GKE environment. New Snaplex creation GKE Cluster creation Next, we create the GKE cluster on the Google Cloud console. We have created our cluster in Autopilot mode. In this mode, GKE manages the cluster and node configurations including scaling, load balancing, monitoring, metrics, and workload optimization. Reference: GKE Cluster GKE cluster Configure the SnapLogic platform Allowlist Add the SnapLogic platform IP addresses to the Allowlist. See Platform Allowlist. In GKE, this is usually done by configuring an Egress Firewall rule on the GKE cluster. Please refer to the GKE documentation for additional details. Firewall rule - Egress Helm configurations values.yaml The below table explains the configurations for some of the sections from the values.yaml file which we have used in our set up. The modified files are attached to this article for reference. Reference: Helm chart configuration Section Comments # Regular nodes count jccCount: 3 # Feedmaster nodes count feedmasterCount: 0 This defines the number of JCC pods. We have enabled HPA for our test scenario, so the jccCount will be picked from the HPA section. (i.e. minReplicas and maxReplicas). The pod count is the number of pods across all nodes of the cluster. No Feedmaster pods are configured in this example. Feedmaster count can be half of the JCC pod count. Feedmaster is used to distribute Ultra task requests to the JCC pods. HPA configuration is only applicable to the JCC pods and not to the Feedmaster pods. # Docker image of SnapLogic snaplex image: repository: snaplogic/snaplex tag: latest This specifies the latest and most recent release version of the repository image. You can specify a different tag if you need to update the version to a previous release for testing, etc. # SnapLogic configuration link snaplogic_config_link: https://uat.elastic.snaplogic.com/api/1/rest/plex/config/ org/proj_space/shared/project Retrieve the configuration link for the Snaplex by executing the Public API. The config link string is the portion before ?expires in the output value of the API. Example: snaplogic_config_link: https://uat.elastic.snaplogic.com/api/1/rest/plex/config/ QA/RB_Temp_Space/shared/RBGKE_node1 # SnapLogic Org admin credential snaplogic_secret: secret/mysecret Execute the kubectl command: kubectl apply -f snapSecret.yaml Please see the section To create the SnapLogic secret in this document: Org configurations. 
# CPU and memory limits/requests for the nodes limits: memory: 8Gi cpu: 2000m requests: memory: 8Gi cpu: 2000m Set requests and limits to the same values to ensure resource availability for the container processes. Avoid running other processes in the same container as the JCC so that the JCC can have the maximum amount of memory. # Default file ulimit and process ulimit sl_file_ulimit: 8192 sl_process_ulimit: 4096 The value should be more than the # of slots configured for the node. (Maximum Slots under Node properties of the Snaplex). If not set, then the node defaults will be used. (/etc/security/limits.conf). The JCC process is initialized with these values. # JCC HPA autoscaling: enabled: true minReplicas: 1 maxReplicas: 3 minReplicas defines the minimum number of Pods that must be running. maxReplicas defines the maximum number of Pods that can be scheduled on the node(s). The general guideline is to start with 1:2 or 1:3 Pods per node. The replica Pods are across all nodes of a deployment and not per node. targetAvgCPUUtilization: 60 targetAvgMemoryUtilization: 60 To enable these metrics, the Kubernetes Metrics Server installation is required. Metrics collection is enabled by default in GKE as part of Cloud Monitoring. targetAvgCPUUtilization: Average CPU utilization percentage (i.e. 60 = 60%) This is the average CPU utilization across all Pods. HPA will scale up or scale down Pods to maintain this average. targetAvgMemoryUtilization: Average memory utilization percentage. This parameter is used to specify the average memory utilization (as a percentage of the requested memory) that the HPA should maintain across all the replicas of a particular deployment or stateful set. scaleDownStabilizationWindowSeconds: 600 terminationGracePeriodSeconds: 900 # Enable IPv6 service for DNS routing to pods enableIPv6: false scaleDownStabilizationWindowSeconds is a parameter used in Kubernetes Horizontal Pod Autoscaler (HPA) It controls the amount of time the HPA waits (like a cool-down period) before scaling down the number of pods after a decrease in resource utilization. terminationGracePeriodSeconds defines the amount of time Kubernetes gives a pod to terminate before killing it. If the containers have not exited after terminationGracePeriodSeconds, then Kubernetes sends a SIGKILL signal to forcibly terminate the containers, and remove the pod from the cluster. Table 2.0 - values.yaml Load balancer configuration The service.yaml file contains a section for the Load balancer configuration. Autopilot mode in GKE supports the creation of a Load balancer service. Section Comments type: LoadBalancer ports: - port: 8081 protocol: TCP name: jcc selector: A Load balancer service will be created by GKE to route traffic to the application’s pods. The external IP address and port details must be configured on the Settings tab of the Snaplex. An example is included in the next section of this document. Table 3.0 service.yaml Deployment using Helm Upload the helm zip file package to the Cloud Shell instance by selecting the Upload option. The default Helm package for SnapLogic can be downloaded from here. It is recommended to download the latest package from the SnapLogic documentation link. The values.yaml file with additional custom configurations (as described in Tables 2.0 / 3.0 above) is attached to this article. Execute the command on the terminal to install and deploy the Snaplex release with a unique name such as snaplogic-snaplex using the configurations from the values.yaml file. 
The release name is a unique identifier, and can be different for multiple deployments such as Dev / Prod, etc. helm install snaplogic-snaplex . -f values.yaml <<Output>> NAME: snaplogic-snaplex NAMESPACE: default STATUS: deployed REVISION: 5 TEST SUITE: None NOTES: You can run this command to update an existing deployment with any new or updated Helm configurations. helm upgrade snaplogic-snaplex . -f values.yaml View the deployed application under the Workloads tab on the Google Cloud Console. Workloads This command returns the HPA details. $ kubectl describe hpa Name: snaplogic-snaplex-hpa Namespace: default Labels: app.kubernetes.io/instance=snaplogic-snaplex app.kubernetes.io/managed-by=Helm app.kubernetes.io/name=snaplogic-snaplex app.kubernetes.io/version=1.0 helm.sh/chart=snaplogic-snaplex-0.2.0 Annotations: meta.helm.sh/release-name: snaplogic-snaplex meta.helm.sh/release-namespace: default Deployment/snaplogic-snaplex-jcc Metrics: ( current / target ) resource cpu on pods (as a percentage of request): 8% (153m) / 60% resource memory on pods (as a percentage of request): 28% (1243540138666m) / 60% Min replicas: 1 Max replicas: 3 Run the kubectl command to list the services. You can see the external IP addresses for the Load balancer service. kubectl get services NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 34.118.224.1 <none> 443/TCP 16d kubectl get services NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 34.118.224.1 <none> 443/TCP 16d snaplogic-snaplex-regular LoadBalancer 34.118.227.164 34.45.230.213 8081:32526/TCP 25m Update Load balancer url on the Snaplex Note the external IP address for the LoadBalancer service, and update the host and port on the Load balancer field of the Snaplex. Example: http://1.3.4.5:8081 Load balancer Listing pods in GKE The following commands can be executed to view the pod statuses. The pod creation and maintenance is fully managed by GKE. $ kubectl top pods $ kubectl get pods kubectl get pods --field-selector=status.phase=Running NAME READY STATUS RESTARTS AGE snaplogic-snaplex-jcc-687d87994-crzw9 0/1 Running 0 2m snaplogic-snaplex-jcc-687d87994-kks7l 1/1 Running 0 2m38s snaplogic-snaplex-jcc-687d87994-pcfvp 1/1 Running 0 2m24s View node details in the SnapLogic Monitor application Each pod represents a JCC node. The maxReplica value is set to 3 so you would see a maximum of 3 nodes (pods) deployed. (Analyze -> Infrastructure tab). Snaplex nodes The below command uninstalls and deletes the deployment from the cluster. All deployed services, metadata, and associated resources are also removed. helm uninstall <deployment_name> Pod registration with the SnapLogic Control Plane Scenario Comments How are the Pod neighbors resolved and maintained by the SnapLogic Control Plane? When a JCC/FeedMaster node (Pod) starts, it registers with the SnapLogic Control Plane, and the Control Plane maintains the list of Pod neighbors. When a JCC/FeedMaster node (Pod) registers, it also publishes its IP address to the Control Plane. An internal list of Pod IP addresses is updated dynamically for neighbor to neighbor communication. DNS resolution is not used. How are the container repository versions updated? The latest Snaplex release build is updated in the docker repository version tagged ‘latest’. The pods will be deployed with this version on startup by referencing the tags from the values.yaml file. If the Snaplex version is updated on the Control Plane to a different version (e.g. 
main-2872), then the JCC nodes (pods) will be updated to match that version (i.e. main-2872).
Reference
Groundplex Deployment on Kubernetes
https://kubernetes.io/
GKE
HPA