SnapLogic MCP Support
Introduction Since the inception of the Model Context Protocol (MCP), we've been envisioning and designing how it can be integrated into the SnapLogic platform. We've recently received a significant number of inquiries about MCP, and we're excited to share our progress, the features we'll be supporting, our release timeline, and how you can get started creating MCP servers and clients within SnapLogic. If you're interested, we encourage you to reach out! Understanding the MCP Protocol The MCP protocol allows tools, data resources, and prompts to be published by an MCP server in a way that Large Language Models (LLMs) can understand. This empowers LLMs to autonomously interact with these resources via an MCP client, expanding their capabilities to perform actions, retrieve information, and execute complex workflows. MCP Protocol primarily supports: Tools: Functions an LLM can invoke (e.g., data lookups, operational tasks). Resources: File-like data an LLM can read (e.g., API responses, file contents). Prompts: Pre-written templates to guide LLM interaction with the server. Sampling (not widely used): Allows client-hosted LLMs to be used by remote MCP servers. An MCP client can, therefore, request to list available tools, call specific tools, list resources, or read resource content from a server. Transport and Authentication MCP protocol offers flexible transport options, including STDIO or HTTP (SSE or Streamable-HTTP) for local deployments, and HTTP (SSE or Streamable-HTTP) for remote deployments. While the protocol proposes OAuth 2.1 for authentication, an MCP server can also use custom headers for security. Release Timeline We're excited to bring MCP support to SnapLogic with two key releases: August Release: MCP Client Support We'll be releasing two new snaps: the MCP Function Generator Snap and the MCP Invoke Snap. These will be available in the AgentCreator Experimental (Beta) Snap Pack. With these Snaps, your SnapLogic agent can access the services and resources available on the public MCP server. Late Q3 Release: MCP Server Support Our initial MCP server support will focus on tool operations, including the ability to list tools and call tools. For authentication, it will support custom header-based authentication. Users will be able to leverage the MCP Server functionality by subscribing to this feature. If you're eager to be among the first to test these new capabilities and provide feedback, please reach out to the Project Manager Team, at pm-team@snaplogic.com. We're looking forward to seeing what you build with SnapLogic MCP. SnapLogic MCP Client MCP Clients in SnapLogic enable users to connect to MCP servers as part of their Agent. An example can be connecting to the Firecrawl MCP server for a data scraping Agent, or other use cases that can leverage the created MCP servers. The MCP Client support in SnapLogic consists of two Snaps, the MCP Function Generator Snap and the MCP Invoke Snap. From a high-level perspective, the MCP Function generator Snap allows users to list available tools from an MCP server, and the MCP Invoke Snap allows users to perform operations such as call tools, list resources, and read resources from an MCP server. Let’s dive into the individual pieces. MCP SSE Account To connect to an MCP Server, we will need an account to specify the URI of the server to connect to. Properties URI The URI of the server to connect to. Don’t need to include the /sse path Additional headers Additional HTTP headers to be sent to the server Timeout The timeout value in seconds, if the result is not returned within the timeout, the Snap will return an error. MCP Function Generator Snap The MCP Function Generator Snap enables users to retrieve the list of tools as SnapLogic function definitions to be used in a Tool Calling Snap. Properties Account An MCP SSE account is required to connect to an MCP Server. Expose Tools List all available tools from an MCP server as SnapLogic function definitions Expose Resources Add list_resources, read_resource as SnapLogic function definitions to allow LLMs to use resources/read and resources/list (MCP Resources). Definitions for list resource and read resource [ { "sl_type": "function", "name": "list_resources", "description": "This function lists all available resources on the MCP server. Return a list of resources with their URIs.", "strict": false, "sl_tool_metadata": { "operation": "resources/list" } }, { "sl_type": "function", "name": "read_resource", "description": "This function returns the content of the resource from the MCP server given the URI of the resource.", "strict": false, "sl_tool_metadata": { "operation": "resources/read" }, "parameters": [ { "name": "uri", "type": "STRING", "description": "Unique identifier for the resource", "required": true } ] } ] MCP Invoke Snap The MCP Invoke Snap enables users to perform operations such as tools/call, resources/list, and resources/read to an MCP server. Properties Account An account is required to use the MCP Invoke Snap Operation The operation to perform on the MCP server. The operation must be one of tools/call, resources/list, or resources/read Tool Name The name of the tool to call. Only enabled and required when the operation is tools/call Parameters The parameters to be added to the operation. Only enabled for resources/read and tools/call. Required for resources/read, and optional for tools/call, based on the tool. MCP Agents in pipeline action MCP Agent Driver pipeline An MCP Agent Driver pipeline is like any other MCP Agent pipeline; we’ll need to provide the system prompt, user prompt, and run it with the PipeLoop Snap. MCP Agent Worker pipeline Here’s an example of an MCP Agent with a single MCP Server connection. The MCP Agent Worker is connected to one MCP Server. MCP Client Snaps can be used together with AgentCreator Snaps, such as the Multi-Pipeline Function Generator and Pipeline Execute Snap, as SnapLogic Functions, tools. This allows users to use tools provided by MCP servers and internal tools, without sacrificing safety and freedom when building an Agent. Agent Worker with MCP Client Snaps SnapLogic MCP Server In SnapLogic, an MCP Server allows you to expose SnapLogic pipelines as dynamic tools that can be discovered and invoked by language models or external systems. By registering an MCP Server, you effectively provide a API that language models and other clients can use to perform operations such as data retrieval, transformation, enrichment, or automation, all orchestrated through SnapLogic pipelines. For the initial phase, we'll support connections to the server via HTTP + SSE. Core Capabilities The MCP Server provides two core capabilities. The first is listing tools, which returns structured metadata that describes the available pipelines. This metadata includes the tool name, a description, the input schema in JSON Schema format, and any additional relevant information. This allows clients to dynamically discover which operations are available for invocation. The second capability is calling tools, where a specific pipeline is executed as a tool using structured input parameters, and the output is returned. Both of these operations—tool listing and tool calling—are exposed through standard JSON-RPC methods, specifically tools/list and tools/call, accessible over HTTP. Prerequisite You'll need to prepare your tool pipelines in advance. During the server creation process, these can be added and exposed as tools for external LLMs to use. MCP Server Pipeline Components A typical MCP server pipeline consists of four Snaps, each with a dedicated role: 1. Router What it does: Routes incoming JSON requests—which differ from direct JSON-RPC requests sent by an MCP client—to either the list tools branch or the call tool branch. How: Examines the request payload (typically the method field) to determine which action to perform. 2. Multi-Pipeline Function Generator (Listing Tools) What it does: Converts a list of pipeline references into tool metadata. This is where you define the pipelines you want the server to expose as tools. Output: For each pipeline, generates: Tool name Description Parameters (as JSON Schema) Other metadata Purpose: Allows clients (e.g., an LLM) to query what tools are available without prior knowledge. 3. Pipeline Execute (Calling Tools) What it does: Dynamically invokes the selected SnapLogic pipeline and returns structured outputs. How: Accepts parameters encoded in the request body, maps them to the pipeline’s expected inputs, and executes the pipeline. Purpose: Provides flexible runtime execution of tools based on user or model requests. 4. Union What it does: Merges the result streams from both branches (list and call) into a single output stream for consistent response formatting. Request Flows Below are example flows showing how requests are processed: 🟢 tools/list Client sends a JSON-RPC request with method = "tools/list". Router directs the request to the Multi-Pipeline Function Generator. Tool metadata is generated and returned in the response. Union Snap merges and outputs the content. ✅ Result: The client receives a JSON list describing all available tools. �� tools/call Client sends a JSON-RPC request with method = "tools/call" and the tool name + parameters. Router sends this to the Pipeline Execute Snap. The selected pipeline is invoked with the given parameters. Output is collected and merged via Union. ✅ Result: The client gets the execution result of the selected tool. Registering an MCP Server Once your MCP server pipeline is created: Create a Trigger Task and Register as an MCP Server Navigate to the Designer > Create Trigger Task Choose a Groundplex. (Note: This capability currently requires a Groundplex, not a Cloudplex.) Select your MCP pipeline. Click Register as MCP server Configure node and authentication. Find your MCP Server URL Navigate to the Manager > Tasks The Task Details page exposes a unique HTTP endpoint. This endpoint is treated as your MCP Server URL. After registration, clients such as AI models or orchestration engines can interact with the MCP Server by calling the /tools/list endpoint to discover the available tools, and the /tools/call endpoint to invoke a specific tool using a structured JSON payload. Connect to a SnapLogic MCP Server from a Client After the MCP server is successfully published, using the SnapLogic MCP server is no different from using other MCP servers running in SSE mode. It can be connected to by any MCP client that supports SSE mode; all you need is the MCP Server URL (and the Bearer Token if authentication is enabled during server registration). Configuration First, you need to add your MCP server in the settings of the MCP client. Taking Claude Desktop as an example, you'll need to modify your Claude Desktop configuration file. The configuration file is typically located at: macOS: ~/Library/Application Support/Claude/claude_desktop_config.json Windows: %APPDATA%\Claude\claude_desktop_config.json Add your remote MCP server configuration to the mcpServers section: { "mcpServers": { "SL_MCP_server": { "command": "npx", "args": [ "mcp-remote", "http://devhost9000.example.com:9000/mcp/6873ff343a91cab6b00014a5/sse", "--header", "Authorization: Bearer your_token_here" ] } } } Key Components Server Name: SL_MCP_server - A unique identifier for your MCP server Command: npx - Uses the Node.js package runner to execute the mcp-remote package URL: The SSE endpoint URL of your remote MCP server (note the /sse suffix) Authentication: Use the --header flag to include authorization tokens if the server enabled authentication Requirements Ensure you have Node.js installed on your system, as the configuration uses npx to run the mcp-remote package. Replace the example URL and authorization token with your actual server details before saving the configuration. After updating the configuration file, restart Claude Desktop for the changes to take effect. To conclude, the MCP Server in SnapLogic is a framework that allows you to expose pipelines as dynamic tools accessible through a single HTTP endpoint. This capability is designed for integration with language models and external systems that need to discover and invoke SnapLogic workflows at runtime. MCP Servers make it possible to build flexible, composable APIs that return structured results, supporting use cases such as conversational AI, automated data orchestration, and intelligent application workflows. Conclusion SnapLogic's integration of the MCP protocol marks a significant leap forward in empowering LLMs to dynamically discover and invoke SnapLogic pipelines as sophisticated tools, transforming how you build conversational AI, automate complex data orchestrations, and create truly intelligent applications. We're excited to see the innovative solutions you'll develop with these powerful new capabilities.553Views0likes0CommentsOpenAI Responses API
Introduction OpenAI announced the Responses API, their most advanced and versatile interface for building intelligent AI applications. Supporting both text and image inputs with rich text outputs, this API enables dynamic, stateful conversations that remember and build on previous interactions, making AI experiences more natural and context-aware. It also unlocks powerful capabilities through built-in tools such as web search, file search, code interpreter, and more, while enabling seamless integration with external systems via function calling. Its event-driven design delivers clear, structured updates at every step, making it easier than ever to create sophisticated, multi-step AI workflows. Key features include: Stateful conversations via the previous response ID Built-in tools like web search, file search, code interpreter, MCP, and others Access to advanced models available exclusively, such as o1-pro Enhanced support for reasoning models with reasoning summaries and efficient context management through previous response ID or encrypted reasoning items Clear, event-based outputs that simplify integration and control While the Chat Completions API remains fully supported and widely used, OpenAI plans to retire the Assistants API in the first half of 2026. To support the adoption of the Responses API, two new Snaps have been introduced: OpenAI Chat Completions ⇒ OpenAI Responses API Generation OpenAI Tool Calling ⇒ OpenAI Responses API Tool Calling Both Snaps are fully compatible with existing upstream and downstream utility Snaps, including the OpenAI Prompt Generator, OpenAI Multimodal Content Generator, all Function Generators (Multi-Pipeline, OpenAPI, and APIM), the Function Result Generator, and the Message Appender. This allows existing pipelines and familiar development patterns to be reused while gaining access to the advanced features of the Responses API. OpenAI Responses API Generation The OpenAI Responses API Generation Snap is designed to support OpenAI’s newest Responses API, enabling more structured, stateful, and tool-augmented interactions. While it builds upon the familiar interface of the Chat Completions Snap, several new properties and behavioral updates have been introduced to align with the Responses API’s capabilities. New properties Message: The input sent to the LLM. This field replaces the previous Use message payload, Message payload, and Prompt properties in the OpenAI Chat Completions Snap, consolidating them into a single input. It removes ambiguity between "prompt" as raw text and as a template, and supports both string and list formats. Previous response ID: The unique ID of the previous response to the model. Use this to create multi-turn conversations. Model parameters Reasoning summary: For reasoning models, provides a summary of the model’s reasoning process, aiding in debugging and understanding the model's reasoning process. The property can be none, auto, or detailed. Advanced prompt configurations Instructions: Applied only to the current response, making them useful for dynamically swapping instructions between turns. To persist instructions across turns when using previous_response_id, the developer message in the OpenAI Prompt Generator Snap should be used. Advanced response configurations Truncation: Defines how to handle input that exceeds the model’s context window. auto allows the model to truncate the middle of the conversation to fit, while disabled (default) causes the request to fail with a 400 error if the context limit is exceeded. Include reasoning encrypted content: Includes an encrypted version of reasoning tokens in the output, allowing reasoning items to persist when the store is disabled. Built-in tools Web search: Enables the model to access up-to-date information from the internet to answer queries beyond its training data. Web search type Search context size User location: an approximate user location including city, region, country, and timezone to deliver more relevant search results. File search: Allows the model to retrieve information from documents or files. Vector store IDs Maximum number of results Include search results: Determines whether raw search results are included in the response for transparency or debugging. Ranker Score threshold Filters: Additional metadata-based filters to refine search results. For more details on using filters, see Metadata Filtering. Advanced tool configuration Tool choice: A new option, SPECIFY A BUILT-IN TOOL, allows specifying that the model should use a built-in tool to generate a response. Note that the OpenAI Responses API Generation Snap does not support the response count or stop sequences properties, as these are not available in the Responses API. Additionally, the message user name, which may be specified in the Prompt Generator Snap, is not supported and will be ignored if included. Model response of Chat Completions vs Responses API Chat Completions API Responses API The Responses API introduces an event-driven output structure that significantly enhances how developers build and manage AI-powered applications compared to the traditional Chat Completions API. While the Chat Completions API returns a single, plain-text response within the choices array, the Responses API provides an output array containing a sequence of semantic event items—such as reasoning, message, function_call, web_search_call, and more—that clearly delineate each step in the model's reasoning and actions. This structured approach allows developers to easily track and interpret the model's behavior, facilitating more robust error handling and smoother integration with external tools. Moreover, the response from the Responses API includes the model parameters settings, providing additional context for developers. Pipeline examples Built-in tool: web search This example demonstrates how to use the built-in web search tool. In this pipeline, the user’s location is specified to ensure the web search targets relevant geographic results. System prompt: You are a friendly and helpful assistant. Please use your judge to decide whether to use the appropriate tools or not to answer questions from the user. Prompt: Can you recommend 2 good sushi restaurants near me? Output: As a result, the output contains both a web search call and a message. The model uses the web search to find and provide recommendations based on current data, tailored to the specified location. Built-in tool: File search This example demonstrates how the built-in file search tool enables the model to retrieve information from documents stored in a vector store during response generation. In this case, the file wildfire_stats.pdf has been uploaded. You can create and manage vector stores through the Vector Store management page. Prompt: What is the number of Federal wildfires in 2018 Output: The output array contains a file_search_call event, which includes search results in its results field. These results provide matched text, metadata, and relevance scores from the vector store. This is followed by a message event, where the model uses the retrieved information to generate a grounded response. The presence of detailed results in the file_search_call is enabled by selecting the Include file search results option. OpenAI Responses API Tool Calling The OpenAI Responses API Tool Calling Snap is designed to support function calling using OpenAI’s Responses API. It works similarly to the OpenAI Tool Calling Snap (which uses the Chat Completions API), but is adapted to the event-driven response structure of the Responses API and supports stateful interactions via the previous response ID. While it shares much of its configuration with the Responses API Generation Snap, it is purpose-built for workflows involving function calls. Existing LLM agent pipeline patterns and utility Snaps—such as the Function Generator and Function Result Generator—can continue to be used with this Snap, just as with the original OpenAI Tool Calling Snap. The primary difference lies in adapting the Snap configuration to accommodate the Responses API’s event-driven output, particularly the structured function_call event item in the output array. The Responses API Tool Calling Snap provides two output views, similar to the OpenAI Tool Calling Snap, with enhancements to simplify building agent pipelines and support stateful interactions using the previous response ID: Model response view: The complete API response, including extra fields: messages: an empty list if store is enabled, or the full message history—including messages payload and model response—if disabled (similar to the OpenAI Tool Calling Snap). When using stateful workflows, message history isn’t needed because the previous response ID is used to maintain context. has_tool_call: a boolean indicating whether the response includes a tool call. Since the Responses API no longer includes the finish_reason: "tool_calls" field, this new field makes it easier to create stop conditions in the pipeloop Snap within the agent driver pipeline. Tool call view: Displays the list of function calls made by the model during the interaction. Tool Call View of Chat Completions vs Responses API Uses id as the function call identifier when sending back the function result. Tool call properties (name, arguments) are nested inside the function field. Each tool call includes: • id: the unique event ID • call_id: used to reference the function call when returning the result The tool call structure is flat — name and arguments are top-level fields. Building LLM Agent Pipelines To build LLM agent pipelines with the OpenAI Responses API Tool Calling Snap, you can reuse the same agent pipeline pattern described in Introducing Tool Calling Snaps and LLM Agent Pipelines. Only minor configuration changes are needed to support the Responses API. Agent Driver Pipeline The primary change is in the PipeLoop Snap configuration, where the stop condition should now check the has_tool_call field, since the Responses API no longer includes the finish_reason:"tool_calls". Agent Worker Pipeline Fields mapping A Mapper Snap is used to prepare the related fields for the OpenAI Responses API Tool Calling Snap. OpenAI Responses API Tool Calling The key changes are in this Snap’s configuration to support the Responses API’s stateful interactions. There are two supported approaches: Option 1: Use Store (Recommended) Leverages the built-in state management of the Responses API. Enable Store Use Previous Response ID Send only the function call results as the input messages for the next round. (messages field in the Snap’s output will be an empty array, so you can still use it in the Message Appender Snap to collect tool results.) Option 2: Maintain Conversation History in Pipeline Similar to the approach used in the Chat Completions API. Disable Store Include the full message history in the input (messages field in the Snap’s output contains message history) (Optional) Enable Include Reasoning Encrypted Content (for reasoning models) to preserve reasoning context efficiently OpenAI Function Result Generator As explained in Tool Call View of Chat Completions vs Responses API section, the Responses API includes both an id and a call_id. You must use the call_id to construct the function call result when sending it back to the model. Conclusion The OpenAI Responses API makes AI workflows smarter and more adaptable, with stateful interactions and built-in tools. SnapLogic’s OpenAI Responses API Generation and Tool Calling Snaps bring these capabilities directly into your pipelines, letting you take advantage of advanced features like built-in tools and event-based outputs with only minimal adjustments. By integrating these Snaps, you can seamlessly enhance your workflows and fully unlock the potential of the Responses API.38Views0likes0CommentsOpenAI Responses API
Introduction OpenAI announced the Responses API, their most advanced and versatile interface for building intelligent AI applications. Supporting both text and image inputs with rich text outputs, this API enables dynamic, stateful conversations that remember and build on previous interactions, making AI experiences more natural and context-aware. It also unlocks powerful capabilities through built-in tools such as web search, file search, code interpreter, and more, while enabling seamless integration with external systems via function calling. Its event-driven design delivers clear, structured updates at every step, making it easier than ever to create sophisticated, multi-step AI workflows. Key features include: Stateful conversations via the previous response ID Built-in tools like web search, file search, code interpreter, MCP, and others Access to advanced models available exclusively, such as o1-pro Enhanced support for reasoning models with reasoning summaries and efficient context management through previous response ID or encrypted reasoning items Clear, event-based outputs that simplify integration and control While the Chat Completions API remains fully supported and widely used, OpenAI plans to retire the Assistants API in the first half of 2026. To support the adoption of the Responses API, two new Snaps have been introduced: OpenAI Chat Completions ⇒ OpenAI Responses API Generation OpenAI Tool Calling ⇒ OpenAI Responses API Tool Calling Both Snaps are fully compatible with existing upstream and downstream utility Snaps, including the OpenAI Prompt Generator, OpenAI Multimodal Content Generator, all Function Generators (Multi-Pipeline, OpenAPI, and APIM), the Function Result Generator, and the Message Appender. This allows existing pipelines and familiar development patterns to be reused while gaining access to the advanced features of the Responses API. OpenAI Responses API Generation The OpenAI Responses API Generation Snap is designed to support OpenAI’s newest Responses API, enabling more structured, stateful, and tool-augmented interactions. While it builds upon the familiar interface of the Chat Completions Snap, several new properties and behavioral updates have been introduced to align with the Responses API’s capabilities. New properties Message: The input sent to the LLM. This field replaces the previous Use message payload, Message payload, and Prompt properties in the OpenAI Chat Completions Snap, consolidating them into a single input. It removes ambiguity between "prompt" as raw text and as a template, and supports both string and list formats. Previous response ID: The unique ID of the previous response to the model. Use this to create multi-turn conversations. Model parameters Reasoning summary: For reasoning models, provides a summary of the model’s reasoning process, aiding in debugging and understanding the model's reasoning process. The property can be none, auto, or detailed. Advanced prompt configurations Instructions: Applied only to the current response, making them useful for dynamically swapping instructions between turns. To persist instructions across turns when using previous_response_id, the developer message in the OpenAI Prompt Generator Snap should be used. Advanced response configurations Truncation: Defines how to handle input that exceeds the model’s context window. auto allows the model to truncate the middle of the conversation to fit, while disabled (default) causes the request to fail with a 400 error if the context limit is exceeded. Include reasoning encrypted content: Includes an encrypted version of reasoning tokens in the output, allowing reasoning items to persist when the store is disabled. Built-in tools Web search: Enables the model to access up-to-date information from the internet to answer queries beyond its training data. Web search type Search context size User location: an approximate user location including city, region, country, and timezone to deliver more relevant search results. File search: Allows the model to retrieve information from documents or files. Vector store IDs Maximum number of results Include search results: Determines whether raw search results are included in the response for transparency or debugging. Ranker Score threshold Filters: Additional metadata-based filters to refine search results. For more details on using filters, see Metadata Filtering. Advanced tool configuration Tool choice: A new option, SPECIFY A BUILT-IN TOOL, allows specifying that the model should use a built-in tool to generate a response. Note that the OpenAI Responses API Generation Snap does not support the response count or stop sequences properties, as these are not available in the Responses API. Additionally, the message user name, which may be specified in the Prompt Generator Snap, is not supported and will be ignored if included. Model response of Chat Completions vs Responses API Chat Completions API Responses API The Responses API introduces an event-driven output structure that significantly enhances how developers build and manage AI-powered applications compared to the traditional Chat Completions API. While the Chat Completions API returns a single, plain-text response within the choices array, the Responses API provides an output array containing a sequence of semantic event items—such as reasoning, message, function_call, web_search_call, and more—that clearly delineate each step in the model's reasoning and actions. This structured approach allows developers to easily track and interpret the model's behavior, facilitating more robust error handling and smoother integration with external tools. Moreover, the response from the Responses API includes the model parameters settings, providing additional context for developers. Pipeline examples Built-in tool: web search This example demonstrates how to use the built-in web search tool. In this pipeline, the user’s location is specified to ensure the web search targets relevant geographic results. System prompt: You are a friendly and helpful assistant. Please use your judge to decide whether to use the appropriate tools or not to answer questions from the user. Prompt: Can you recommend 2 good sushi restaurants near me? Output: As a result, the output contains both a web search call and a message. The model uses the web search to find and provide recommendations based on current data, tailored to the specified location. Built-in tool: File search This example demonstrates how the built-in file search tool enables the model to retrieve information from documents stored in a vector store during response generation. In this case, the file wildfire_stats.pdf has been uploaded. You can create and manage vector stores through the Vector Store management page. Prompt: What is the number of Federal wildfires in 2018 Output: The output array contains a file_search_call event, which includes search results in its results field. These results provide matched text, metadata, and relevance scores from the vector store. This is followed by a message event, where the model uses the retrieved information to generate a grounded response. The presence of detailed results in the file_search_call is enabled by selecting the Include file search results option. OpenAI Responses API Tool Calling The OpenAI Responses API Tool Calling Snap is designed to support function calling using OpenAI’s Responses API. It works similarly to the OpenAI Tool Calling Snap (which uses the Chat Completions API), but is adapted to the event-driven response structure of the Responses API and supports stateful interactions via the previous response ID. While it shares much of its configuration with the Responses API Generation Snap, it is purpose-built for workflows involving function calls. Existing LLM agent pipeline patterns and utility Snaps—such as the Function Generator and Function Result Generator—can continue to be used with this Snap, just as with the original OpenAI Tool Calling Snap. The primary difference lies in adapting the Snap configuration to accommodate the Responses API’s event-driven output, particularly the structured function_call event item in the output array. The Responses API Tool Calling Snap provides two output views, similar to the OpenAI Tool Calling Snap, with enhancements to simplify building agent pipelines and support stateful interactions using the previous response ID: Model response view: The complete API response, including extra fields: messages: an empty list if store is enabled, or the full message history—including messages payload and model response—if disabled (similar to the OpenAI Tool Calling Snap). When using stateful workflows, message history isn’t needed because the previous response ID is used to maintain context. has_tool_call: a boolean indicating whether the response includes a tool call. Since the Responses API no longer includes the finish_reason: "tool_calls" field, this new field makes it easier to create stop conditions in the pipeloop Snap within the agent driver pipeline. Tool call view: Displays the list of function calls made by the model during the interaction. Tool Call View of Chat Completions vs Responses API Uses id as the function call identifier when sending back the function result. Tool call properties (name, arguments) are nested inside the function field. Each tool call includes: • id: the unique event ID • call_id: used to reference the function call when returning the result The tool call structure is flat — name and arguments are top-level fields. Building LLM Agent Pipelines To build LLM agent pipelines with the OpenAI Responses API Tool Calling Snap, you can reuse the same agent pipeline pattern described in Introducing Tool Calling Snaps and LLM Agent Pipelines. Only minor configuration changes are needed to support the Responses API. Agent Driver Pipeline The primary change is in the PipeLoop Snap configuration, where the stop condition should now check the has_tool_call field, since the Responses API no longer includes the finish_reason:"tool_calls". Agent Worker Pipeline Fields mapping A Mapper Snap is used to prepare the related fields for the OpenAI Responses API Tool Calling Snap. OpenAI Responses API Tool Calling The key changes are in this Snap’s configuration to support the Responses API’s stateful interactions. There are two supported approaches: Option 1: Use Store (Recommended) Leverages the built-in state management of the Responses API. Enable Store Use Previous Response ID Send only the function call results as the input messages for the next round. (messages field in the Snap’s output will be an empty array, so you can still use it in the Message Appender Snap to collect tool results.) Option 2: Maintain Conversation History in Pipeline Similar to the approach used in the Chat Completions API. Disable Store Include the full message history in the input (messages field in the Snap’s output contains message history) (Optional) Enable Include Reasoning Encrypted Content (for reasoning models) to preserve reasoning context efficiently OpenAI Function Result Generator As explained in Tool Call View of Chat Completions vs Responses API section, the Responses API includes both an id and a call_id. You must use the call_id to construct the function call result when sending it back to the model. Conclusion The OpenAI Responses API makes AI workflows smarter and more adaptable, with stateful interactions and built-in tools. SnapLogic’s OpenAI Responses API Generation and Tool Calling Snaps bring these capabilities directly into your pipelines, letting you take advantage of advanced features like built-in tools and event-based outputs with only minimal adjustments. By integrating these Snaps, you can seamlessly enhance your workflows and fully unlock the potential of the Responses API.72Views0likes0CommentsMore Than Just Fast: A Holistic Guide to High-Performance AI Agents
At SnapLogic, while building and refining an AI Agent for a large customer in the healthcare industry, we embarked on a journey of holistic performance optimization. We didn't just want to make it faster. We tried to make it better across the board. This journey taught us that significant gains are found by looking at the entire system, from the back-end data sources to the pixels on the user's screen. Here’s our playbook for building a truly high-performing AI agent, backed by real-world metrics. The Foundation: Data and Architecture Before you can tune an engine, you have to build it on a solid chassis. For an AI Agent, that chassis is its core architecture and its relationship with data. Choose the Right Brain for the Job: Not all LLMs are created equal. The "best" model depends entirely on the nature of the tasks your agent needs to perform. A simple agent with one or two tools has very different requirements from a complex agent that needs to reason, plan, and execute dynamic operations. Matching the model to the task complexity is key to balancing cost, speed, and capability. Task Complexity Model Type Characteristics & Best For Simple, Single-Tool Tasks Fast & Cost-Effective Goal: Executing a well-defined task with a limited toolset (e.g., simple data lookups, classification). These models are fast and cheap, perfect for high-volume, low-complexity actions. Multi-Tool Orchestration Balanced Goal: Reliably choosing the correct tool from several options and handling moderately complex user requests. These models offer a great blend of speed, cost, and improved instruction-following for a good user experience. Complex Reasoning & Dynamic Tasks High-Performance / Sophisticated Goal: Handling ambiguous requests that require multi-step reasoning, planning, and advanced tool use like dynamic SQL query generation. These are the most powerful (and expensive) models, essential for tasks where deep understanding and accuracy are critical. Deconstruct Complexity with a Multi-Agent Approach: A single, monolithic agent designed to do everything can become slow and unwieldy. A more advanced approach is to break down a highly complex agent into a team of smaller, specialized agents. This strategy offers two powerful benefits: It enables the use of faster, cheaper models. Each specialized agent has a narrower, more defined task, which often means you can use a less powerful (and faster) LLM for that specific job, reserving your most sophisticated model for the "manager" agent that orchestrates the others. It dramatically increases reusability. These smaller, function-specific agents and their underlying tools are modular. They can be easily repurposed and reused in the next AI Agent you build, accelerating future development cycles. Set the Stage for Success with Data: An AI Agent is only as good as the data it can access. We learned that optimizing data access is a critical first step. This involved: Implementing Dynamic Text-to-SQL: Instead of relying on rigid, pre-defined queries, we empowered the agent to build its own SQL queries dynamically from natural language. This flexibility required a deep initial investment in analyzing and understanding the critical columns and data formats our agent would need from sources like Snowflake. Generating Dedicated Database Views: To support the agent, we generated dedicated views on top of our source tables. This strategy serves two key purposes: it dramatically reduces query times by pre-joining and simplifying complex data, and it allows us to remove sensitive or unnecessary data from the source, ensuring the agent only has access to what it needs. Pre-loading the Schema for Agility: Making the database schema available to the agent is critical for accurate dynamic SQL generation. To optimize this, we pre-load the relevant schemas at startup. This simple step saves precious time on every single query the agent generates, contributing significantly to the overall responsiveness. The Engine: Tuning the Agent’s Logic and Retrieval Our Diagnostic Toolkit: Using AI to Analyze AI Before we could optimize the engine, we needed to know exactly where the friction was. Our diagnostic process followed a two-step approach: High-Level Analysis: We started in the SnapLogic Monitor, which provides a high-level, tabular view of all pipeline executions. This dashboard is the starting point for any performance investigation. As you can see below, it gives a list of all runs, their status, and their total duration. By clicking the Download table button, you can export this summary data as a CSV. This allows for a quick, high-level analysis to spot outliers and trends without immediately diving into verbose log files. AI-Powered Deep Dive: Once we identified a bottleneck from the dashboard—a pipeline that was taking longer than expected—we downloaded the detailed, verbose log files for those specific pipeline runs. We then fed these complex logs into an AI tool of our choice. This "AI analyzing AI" approach helped us instantly pinpoint key issues that would have taken hours to find manually. For example, this process uncovered an unnecessary error loop caused by duplicate JDBC driver versions, which significantly extended the execution time of our Snowflake Snaps. Fixing this single issue was a key factor in the 68% performance improvement we saw when querying our technical knowledge base. With a precise diagnosis in hand, we turned our attention to the agent's "thinking" process. This is where we saw some of our most dramatic performance gains. How We Achieved This: Crafting the Perfect Instructions (System Prompts): We transitioned from generic prompts to highly customized system prompts, optimized for both the specific task and the chosen LLM. A simpler model gets a simpler, more direct prompt, while a sophisticated model can be instructed to "think step-by-step" to improve its reasoning. A Simple Switch for Production Speed: One of the most impactful, low-effort optimizations came from how we use a key development tool: the Record Replay Snap. During the creation and testing of our agent's pipelines, this Snap is invaluable for capturing and replaying data, but it adds about 2.5 seconds of overhead to each execution. For a simple agent run involving a driver, a worker, and one tool, this adds up to 7.5 seconds of unnecessary latency in a production environment. Once our pipelines were successfully tested, we switched these Snaps to "Replay Only" mode. This simple change instantly removed the recording overhead, providing a significant speed boost across all agent interactions. Smarter, Faster Data Retrieval (RAG Optimization): For our Retrieval-Augmented Generation (RAG) tools, we focused on two key levers: Finding the Sweet Spot (k value): We tuned the k value—the number of documents retrieved for context. For our product information retrieval use case, adjusting this value was the key to our 63% speed improvement. It’s the art of getting just enough context for an accurate answer without creating unnecessary work for the LLM. Surgical Precision with Metadata: Instead of always performing a broad vector search, we enabled the agent to use metadata. If it knows a document's unique_ID, it can fetch that exact document. This is the difference between browsing a library and using a call number. It's swift and precise. Ensuring Consistency: We set the temperature to a low value during the data extraction and indexing process. This ensures that the data chunks are created consistently, leading to more reliable and repeatable search results. The Results: A Data-Driven Transformation Our optimization efforts led to significant, measurable improvements across several key use cases for the AI Agent. Use Case Before Optimization After Optimization Speed Improvement Querying Technical Knowledge Base 92 seconds 29 seconds ~68% Faster Processing Sales Order Data 32 seconds 10.7 seconds ~66% Faster RAG Retrieval 5.8 seconds 2.1 seconds ~63% Faster Production Optimization (Replay Only) 20 seconds 17.5 seconds ~12% Faster* (*This improvement came from switching development Snaps to a production-ready "Replay Only" mode, removing the latency inherent to the testing phase.) The Experience: Focusing on the User Ultimately, all the back-end optimization in the world is irrelevant if the user experience is poor. The final layer of our strategy was to focus on the front-end application. Engage, Don't Just Wait: A simple "running..." message can cause user anxiety and make any wait feel longer. Our next iteration will provide a real-time status of the agent's thinking process (e.g., "Querying product database...", "Synthesizing answer..."). This transparency keeps the user engaged and builds trust. Guide the User to Success: We learned that a blank text box can be intimidating. By providing predefined example prompts and clearly explaining the agent's capabilities, we guide the user toward successful interactions. Deliver a Clear Result: The final output must be easy to consume. We format our results cleanly, using tables, lists, and clear language to ensure the user can understand and act on the information instantly. By taking this holistic approach, we optimized the foundation, the engine, and the user experience to build an AI Agent that doesn't just feel fast. It feels intelligent, reliable, and genuinely helpful.37Views0likes0CommentsMulti Pipeline Function Generator - Simplifies Agent Worker Pipeline
This article introduces a new Snap called the “Multi Pipeline Function Generator”. The Multi Pipeline Function Generator is designed to take existing Pipelines in your SnapLogic Project and turn their configurations into function definitions for LLM-based tool calling. It achieves the following: It replaces the existing chain of function generators, therefore reduces the length of the worker pipeline. Combined with our updates to the tool calling snaps, this snap allows multiple tool calling branches to be merged into a single branch, simplifying the pipeline structure. With it, users can directly select the desired pipeline to be used as a tool from a dropdown menu. The snap will automatically retrieve the tool name, purpose, and parameters from the pipeline properties to generate a function definition in the required format. Problem Statement Currently, the complexity of the agent worker pipeline increases linearly with the number of tools it has. The image below shows a worker pipeline with three tools. It requires three function generators and has three tool calling branches to execute different tools. This becomes problematic when the number of tools is large, as the pipeline becomes very long both horizontally and vertically. Current Agent Worker Pipeline With Three Tools Solution Overview One Multi Pipeline Function Generator snap can replace multiple function generators (as long as the tool is a pipeline; it's not applicable if the tool is of another type, such as OpenAPI or APIM service). New Agent Worker Pipeline Using “Multi Pipeline Function Generator” Additionally, for each outputted tool definition, it includes the corresponding pipeline's path. This allows downstream components (the Pipeline Execute snap) to directly call the respective tool pipeline with the path, as shown below. The Multi Pipeline Function Generator snap allows users to select multiple tool pipelines at once through dropdown menus. It reads the necessary data for generating function definition from the pipeline properties. Of course, this requires that the data has been set up in the pipeline properties beforehand (will be explained later). The image below shows the settings for this snap. Snap Settings How to Use the Snap To use this snap, you need to: Fill in the necessary information for generating the function definition in the properties of your tool pipeline. The pipeline's name will become the function name The information under 'info -> purpose' will become the function description. Each key in your OpenAPI specification will be treated as a parameter, so you will ALSO need to add the expected input parameters to the list of pipeline parameters. Please note that in the current design, the pipeline parameters specified here are solely used for generating the function definition. When utilizing parameters within the pipeline, you do not need to retrieve their values using pipeline parameters. Instead, you can directly access the argument values from the input document, as determined by the model based on the function definition. Then, you can select this pipeline as a tool from the dropdown menu in the Multi Pipeline Function Generator snap. In the second output of the tool calling snap, we only need to keep one branch. In the pipeline execute snap, we can directly use the expression $sl_tool_metadata.path to dynamically retrieve the path of the tool pipeline being called. See image below. Below is an example of the pipeline properties for the tool 'CRM_insight' for your reference. Below is the settings page of the original function generator snap for comparison. As you can see, the information required is the same. The difference is that now we directly fill this information into the pipeline's properties. Step 3 - reduce the number of branches More Design Details The tool calling snap has also been updated to support $sl_tool_metadata.path , since the model's initial response doesn't include the pipeline path which is needed. After the tool calling snap receives the tools the model needs to call, it adds the sl_tool_metadata containing the pipeline path to the model's response and outputs it to the snap's second output view. This allows us to use it in the pipeline execute snap later. This feature is supported for tool calling with Amazon Bedrock, OpenAI, Azure OpenAI, and Google GenAI snap packs. The pipeline path can accept either a string or a list as input. By turning on the 'Aggregate input' mode, multiple input documents can be combined into a single function definition document for output, similar to that of a gate snap. This can be useful in scenarios like this: you use a SnapLogic list snap to enumerate all pipelines within a project, then use a filter snap to select the desired tool pipelines, and finally use the multi pipeline function generator to convert this series of pipelines into function definitions. Example Pipelines Download here. Conclusion In summary, the Multi Pipeline Function Generator snap streamlines the creation of function definitions for pipeline as tool in agent worker pipelines. This significantly reduces pipeline length in scenarios with numerous tools, and by associating pipeline information directly with the pipeline, it enhances overall manageability. Furthermore, its applicability extends across various providers.686Views0likes1CommentA Comparison of Assistant and Non-Assistant Tool Calling Pipelines
Introduction At a high level, the logic behind assistant tool calling and non-assistant tool calling is fundamentally the same: the model instructs the user to call specific function(s) in order to answer the user's query. The user then executes the function and returns the result to the model, which uses it to generate an answer. This process is identical for both. However, since the assistant specifies the function definitions and access to tools as part of the Assistant configuration within the OpenAI or Azure OpenAI dashboard rather than within your pipelines, there will be major differences in the pipeline configuration. Additionally submitting tool responses to an Assistant comes with significant changes and challenges since the Assistant owns the conversational history rather than the pipeline. This article focuses on contrasting these differences. For a detailed understanding of assistant pipelines and non-assistant pipelines, please refer to the following article: Non-assistant pipelines: Introducing Tool Calling Snaps and LLM Agent Pipelines Assistant pipelines: Introducing Assistant Tool Calling Pipelines Part 1: Which System to Use: Non-Assistant or Assistant? When to Use Non-Assistant Tool Calling Pipelines: Non-Assistant Tool Calling Pipelines offer greater flexibility and control over the tool calling process, making them suitable for the following specific scenarios. When preferring a “run-time“ approach: Non-Assistant pipelines exhibit greater flexibility in function definition, offering a more "runtime" approach. You can dynamically adjust the available functions by simply adding or removing Function Generator snaps within the pipeline. In contrast, Assistant Tool Calling Pipelines necessitate a "design-time" approach. All available functions must be pre-defined within the Assistant configuration, requiring modifications to the Assistant definition in the OpenAI/Azure OpenAI dashboard. When wanting detailed chat history: Non-Assistant pipelines provide a comprehensive history of the interaction between the model and the tools in the output message list. The message list within the Non-Assistant pipeline preserves every model response and the results of each function execution. This detailed logging allows for thorough debugging, analysis, and auditing of the tool calling process. In contrast, Assistant pipelines maintain a more concise message history, focusing on key steps and omitting some intermediate details. While this can simplify the overall view of the message list, it can also make it more difficult to trace the exact sequence of events or diagnose issues that may arise during tool execution in child pipelines. When needing easier debugging and iterative development: Non-Assistant pipelines facilitate more granular debugging and iterative development. You can easily simulate individual steps of the agent by making calls to the model with specific function call histories. This allows for more precise control and experimentation during development, enabling you to isolate and address issues more effectively. For example, by providing three messages, we can "force" the model to call the second tool, allowing us to inspect the tool calling process and its result against our expectations. In contrast, debugging and iterating with Assistant pipelines can be more cumbersome. Since Assistants manage the conversation history internally, to simulate a specific step, you often need to replay the entire interaction from the beginning, potentially requiring multiple iterations to reach the desired state. This internal management of history makes it less straightforward to isolate and debug specific parts of the interaction. To simulate calling the third tool, we need to start a new thread from scratch and then call tool1 and tool2, repeating the preceding process. The current thread cannot be reused. When to Use Assistant Tool Calling Pipelines: Assistant Tool Calling Pipelines also offer a streamlined approach to integrating LLMs with external tools, prioritizing ease of use and built-in functionalities. Consider using Assistant pipelines in the following situations: For simplified pipeline design: Assistant pipelines reduce pipeline complexity by eliminating the need for Tool Generator snaps. In Non-Assistant pipelines, these snaps are essential for dynamically generating tool definitions within the pipeline itself. With Assistant pipelines, tool definitions are configured beforehand within the Assistant settings in the OpenAI/Azure OpenAI dashboard. This pre-configuration results in shorter, more manageable pipelines, simplifying development and maintenance. When leveraging built-in tools is required: If your use case requires functionalities like searching external files or executing code, Assistant pipelines offer these capabilities out-of-the-box through their built-in File Search and Code Interpreter tools (see Part 5 for more details). These tools provide a convenient and efficient way to extend the LLM's capabilities without requiring custom implementation within the pipeline. Part 2: A brief introduction to two pipelines Non-assistant tool calling pipelines Key points: Functions are defined in the worker. The worker pipeline's Tool Calling snap manages all model interactions. Function results are collected and sent to the model in the next iteration via the Tool Calling snap. Assistant tool calling pipelines Key points: No need to define functions in any pipeline. Functions are pre-defined in the assistant. Two snaps : interact with the model: Create and Run Thread, and Submit Tool Outputs. Function results are collected and sent to the model immediately during the current iteration. Part 3: Comparison between two pipelines Here are two primary reasons why the assistant and non-assistant pipelines differ, listed in decreasing order of importance: Distinct methods of submitting tool results: For non-assistant pipelines, tool results are appended to the message history list and subsequently forwarded to the model during the next iteration. Non-assistant pipelines exhibit a "while-loop" behavior, where the worker interacts with the model at the beginning of the iteration, and while any tools need to be called, the worker executes those tool(s). In contrast, for assistants, tool results are specifically sent to a dedicated endpoint designed to handle tool call results within the current iteration. The assistant pipelines operate more like a "do-while-loop." The driver initiates the interaction by sending the prompt to the model. Subsequently, the worker execute the tool(s) first and interacts with the model at the end of the iteration to deliver tool results. Predefined and stored tool definitions for assistants: Unlike non-assistant pipelines, assistants have the capability to predefine and store function definitions. This eliminates the need for the three Function Generator snaps to repeatedly transmit tool definitions to the model with each request. Consequently, the worker pipeline for assistants appears shorter. Due to the aforementioned differences, non-assistant pipelines have only one interaction point with the model, located in the worker. In contrast, assistant pipelines involve two interaction points: the driver sends the initial prompt to the model, while the worker sends tool results back to the model. Part 4: Differences in snap settings Stop condition of Pipeloop A key difference in snap settings lies in the stop condition of the pipeloop. Assistant pipeline’s stop condition: $run.required_action == null . Non-assistant pipeline’s stop condition: $finish_reason != "tool_calls" . Assistant’s output Example when tool calls are required: Example when tool calls are NOT required: Non-assistant’s output Example when tool calls are required: Example when tool calls are NOT required: Part 5: Assistant’s two built-in tools The assistant not only supports all functions that can be defined in non-assistant pipelines but also provides two special built-in functions, file search and code interpreter, for user convenience. If the model determines that either of these tools is required, it will automatically call and execute the tool within the assistant without requiring manual user intervention. You don't need a tool call pipeline to experiment with file search and code interpreter. A simple create and run thread snap is sufficient. File search File Search augments the Assistant with knowledge from outside its model, such as proprietary product information or documents provided by your users. OpenAI automatically parses and chunks your documents, creates and stores the embeddings, and use both vector and keyword search to retrieve relevant content to answer user queries. Example Prompt: What is the number of federal fires between 2018 and 2022? The assistant’s response is as below: The assistant’s response is correct. As the answer to the prompt is in the first row of a table on the first page of wildfire_stats.pdf, a document accessible to the assistant via a vector store. Answer to the prompt: The file is stored in a vector store used by the assistant: Code Interpreter Code Interpreter allows Assistants to write and run Python code in a sandboxed execution environment. This tool can process files with diverse data and formatting, and generate files with data and images of graphs. Code Interpreter allows your Assistant to run code iteratively to solve challenging code and math problems. When your Assistant writes code that fails to run, it can iterate on this code by attempting to run different code until the code execution succeeds. Example Prompt: Find the number of federal fires between 2018 and 2022 and use Matplotlib to draw a line chart. * Matplotlib is a python library for creating plots. The assistant’s response is as below: From the response, we can see that the assistant indicated it used file search to find 5 years of data and then generated an image file. This file can be downloaded from the assistant's dashboard under storage-files. Simply add a file extension like .png to see the image. Image file generated by assistant: Part 6: Key Differences Summarized Feature Non-Assistant Tool Calling Pipelines Assistant Tool Calling Pipelines Function Definition Defined within the worker pipeline using Function Generator snaps. Pre-defined and stored within the Assistant configuration in the OpenAI/Azure OpenAI dashboard. Tool Result Submission Appended to the message history and sent to the model in the next iteration. Sent to a dedicated endpoint within the current iteration. Model Interaction Points One (in the worker pipeline). Two (driver sends initial prompt, worker sends tool results). Built-in Tools None. File Search and Code Interpreter. Pipeline Complexity More complex pipeline structure due to function definition within the pipeline. Simpler pipeline structure as functions are defined externally.804Views4likes0CommentsIntroducing Tool Calling Snaps and LLM Agent Pipelines
Introduction In this article, we will be introducing the following. Part 1: Four new classes of snaps for LLM function calling: Function Generator, Tool Calling, Function Result Generator, and Message Appender, which have been developed specifically for tool calling. Part 2: The Function Calling pipeline to demonstrate how the new Function calling snaps work together to perform LLM function calling. Part 3: Using PipeLoop snap to orchestrate Agent pipelines: iteratively call the Function Calling pipeline until the model generates a final result or meets other termination conditions to perform agentic workflows. Part 1: Introducing 4 new snap classes for tool calling Function Generator Snap: create a function definition. Tool Calling Snap: sends tool calling request to the model to retrieve LLM model response. Function Result Snap: formats the result of tool run to be sent back to the LLM. Message Appender Snap: append the tool results to the messages payload array. Function Generator Snap The Function Generator Snap facilitates the creation of a Tool definition, enabling the model to understand and utilize the available tools. Sample Output: Tool Calling Snap The Tool Calling Snap forwards user input and tool specifications to the model's API, receiving the model's generated output in return. This snap has 2 output views: The first view outputs the full response from the model the complete message payload, including the model's current response The second view outputs the list of tools to call In the OpenAI and Azure OpenAI Tool Calling Snap, a JSON argument field is added by SnapLogic, whose value is a JSON object derived from converting the string-formatted argument of the model's response tool call. Sample Input: Sample Output - LLM Response View: Sample Output - Tool Calls View: Function Result Generator Snap The Function Result Generator Snap formats the results generated by user-invoked functions into a custom data output structure defined within SnapLogic. Different models have different requirements for the data type of the Content field. For example, Bedrock Converse requires Content to be a string or a json , OpenAI requires Content to be string only. The Snap will stringify the content from the user if the format in the field is not supported. Sample Input: Sample Output: Message Appender Snap The message appender snap adds the results of tool runs to the message list, serving as input for subsequent tool calls. Sample Input - First Input View - Messages Sample Input - Second Input View - Tool Result Sample Output By leveraging the four new Snaps, we will be able to create pipelines that are capable of LLM function calling, which we will refer as Function Calling Pipelines. Part 2: Function Calling Pipeline Example This pipeline demonstrates how to use the new snaps to perform LLM function calling. Setup We will be using the following four snaps for LLM Function calling: Function Generator Snap Tool Calling Snap Function Result Generator Snap Message Appender Snap The function calling pipeline incorporates two tools as pipelines: get_current_weather (Using PipeExec) This pipeline retrieves weather information for a given location. Pipeline setup An HTTP Client that connects to the weatherapi endpoint A mapper that passes the JSON output to the content field foo_tool A toy tool that outputs “foo“ as the result, used to demonstrate multi-tool calling capabilities. Pipeline setup A mapper that outputs “foo” in the content output Execution Flow The execution flow of this pipeline follows the following steps: The user provides the prompt (wrapped in a messages payload) in a JSON Generator Snap, creates tool definitions using the Function Generator Snap, which is then sent to the LLM through the Tool Calling Snap. The Chat completions view of the Tool Calling Snap outputs the response from the LLM and adds the current response from the LLM into the messages payload, and is connected to the first input of the Message Appender Snap for processing, the Tool calls view is connected to a router to pass tool calls to the individual tools. The tools are invoked, then results are formatted by the Function Result Generator Snap The Message Appender Snap collects and appends all tool invocation results to the messages array from the Chat completions view output from the Tool Calling Snap and outputs the modified messages array. The output of the Message Appender contains the message history from the User prompt, LLM model respones, and the tool calling output, which marks the end of this round of tool calling. Part 3: Agent Pipelines To orchestrate LLM function calling pipelines or LLM Agent Pipelines, we introduce two patterns as pipelines to enable this functionality. Agent Driver Pipeline The Agent Driver Pipeline Leverages PipeLoop Snap to allow iterative executions on a single pipeline. The prompt input is defined then sent to the Agent Worker Pipeline (a Function Calling pipeline). The output of the Function Calling pipeline is then collected and sent again as the second iteration input of the Function Calling pipeline, the iteration will continue until the stop condition set in PipeLoop is reached or when the iteration limit is reached. Agent Worker Pipeline The Agent Worker Pipeline is similar to a Function Calling pipeline, the only difference is that the message payload is sent from the Agent Driver Pipeline through PipeLoop Snap instead of a JSON Generator snap. Agent Pipeline Example - get_weather This example demonstrates a weather checking assistant. This agent is equipped with a single tool - get_weather, which retrieves the weather data of a given location. Agent Driver Pipeline In this example, the user will provide a payload like below, which is to ask about the weather of a given location. (Which is mocked using a JSON generator Snap) { "prompt": "What's the weather in San Francisco?" } The system prompt for this weather assistant is then defined in the first Prompt Generator "You are a helpful weather assistant that will answer questions about the weather of a given location. You will be assigned with a tool to check the weather of the location." The user prompt for this case is simply the prompt payload from the user, which we will pass to the Agent Worker Pipeline through the PipeLoop Snap. We will stop the PipeLoop Execution when the finish reason of the LLM is stop or end_turn (depending on the LLM model) Agent Worker Pipeline In the Agent Worker Pipeline, the flow follows the following steps First Iteration: Create function definitions for the tools to be called. In this case, the get_weather function. Pass the message payload (system and user prompts), and tools payload (function definitions) to the Tool Calling Snap. The Tool Calling Snap will then decide to either call a tool or generate a result. In the first case, it will return a tool call decision for the pipeline to process. [ { "toolUse": { "toolUseId": "tooluse_YOLmGccxRGWPmCKqxAKvgw", "name": "get_current_weather", "input": { "location": "San Francisco, CA", "unit": "fahrenheit" } } } ] The Worker pipeline will then diverge into two branches. The first branch will pass the messages payload of this round to the Message Appender Snap, and the second branch will pass the tool call request to the tool to invoke a call and get the weather of San Francisco. The result of the tool call will be collected and formatted by the Function Result Generator Snap, then passed to the Message Appender Snap so that the the Tool Call result can be added into the Message Payload. For this round, the finish reason of the LLM is tool_use , which means the execution should continue, and the output of the Message Appender will be sent directly to the input of the Agent Worker Pipeline. Message Appender Output Second Iteration: The updated message payload is then sent again with the function definitions to the Tool Calling Snap, the Tool Calling Snap for this round will then generate a result since it has retrieved the weather of San Francisco. The Tool Call output of the Tool Calling Snap will be empty for this round since no tool calls are required for this iteration. The message payload is sent directly to the Message Appender Snap, and the finish reason of the LLM is end_turn , which means the LLM has successfully carried out the request. PipeLoop execution will stop and the result will be sent to the output of the PipeLoop Snap in the Agent Driver Pipeline. And the execution is finished. Summary In this article, we have introduced the new Snaps for Tool calling - Function Generator, Tool Calling, Function Result Generator, and Message Appender. We have also talked about how to create tool-calling pipelines and Agent Pipeline patterns. Happy building!1.2KViews2likes0CommentsIntroducing Assistant Tool Calling Pipelines
Introduction OpenAI and Azure OpenAI assistants can invoke models and utilize tools to accomplish tasks. This article primarily focuses on constructing pipelines to leverage the tool-calling capabilities of an existing assistant. Given the substantial similarity in assistant tool calling between OpenAI and Azure versions, the examples provided in this article are applicable to both platforms. In part 1, we'll provide a simple introduction to creating an assistant in OpenAI Dashboard and adding user-defined tools for subsequent pipeline use. We'll provide all the necessary data and files. In part 2, we'll demonstrate two questions and their corresponding assistant responses to illustrate the types of tools the assistant can call, or requires users to call, upon to answer queries. In part 3, we’ll introduce two new snaps: tool call router and submit tool outputs , along with upgrades to the existing two snaps: run thread and create and run thread . In part 4, we'll delve into the pipeline workflow and the specific configurations required for setting up snaps. Part 1: Prerequisite - Set Up An Assistant in OpenAI Dashboard OpenAI and Azure OpenAI assistants manage the system prompt, the model used to generate response, tools (including file search, code interpreter, and other user-defined tools), and model configuration such as temperature and response format. Here we will only introduce the most basic settings, and you can adjust them according to your needs. Please refer to OpenAI and Azure OpenAI documentations for more information. Navigate to the OpenAI Dashboard: Go to the OpenAI dashboard - assistants and click the " Create " button in the top right corner to initiate the process of creating a new assistant. Name Your Assistant: Provide a name for your new assistant. You can choose any name you prefer, such as " Test Assistant ". System Instruction (Optional): You can optionally provide a system instruction to guide the assistant's behavior. For now, let's skip this step. Select a Model: Choose the model you want to use for your assistant. In this case, we'll select " gpt-4o-mini ". Enable Tools: Enable the " file search " File search is an OpenAI-provided managed RAG service. Using this tool allows the model to retrieve information relevant to the query from the vector store and use it to answer. In this case, please create a new vector store, upload the wildfire_stats.pdf file to the vector store, and add the vector store to the assistant. Enable the" code interpreter " tools The code interpreter is also a built-in tool within the OpenAI assistant. It can run the code produced by the model directly and provide the output. Create three custom functions with the following schema: By providing these definitions, we are enabling the model to identify which user-defined functions it can call. While the model can suggest the necessary function, the responsibility of executing the function lies with the user. Function definition: get_weather Function definition: get_wiki_url Function definition: get_webpage In this way, we've successfully created the assistant we'll be using. It should look similar to the image below. Now you can directly go to the playground and ask some questions to see how the assistant responds. Up to this point, you should have created an assistant with three user-defined functions. The file search tool should have access to a vector store that contains a file. Part 2: Two Examples of Assistant Tool Calling To help you understand how the assistant works, we will use the following pipeline to ask the newly created assistants two questions in this section and examine their responses. You can find the construction details for this pipeline in part 4. For now, let's focus on the pipeline's execution results. Pipeline Overview The Driver Pipeline The Worker Pipeline Prompt One Our first question to the assistant is: "What is the weather and the wiki url of San Francisco? And what is the content of the wiki page?" Through this query, we're evaluating the assistant's capability to: 1) identify the necessary tools for a task - in this case, all three: get_weather, get_wiki_url, and get_webpage should be called; 2) understand the sequential dependencies between tools. For example, the assistant should recognize that get_wiki_url must be called before get_webpage to acquire the necessary URL. As shown below, the model's response is both reasonable and correct. Prompt Two Our second question to the assistant is: What is the number of federal fires from 2018 to 2022, and can you write a Python code to sort the years based on the number of fires in ascending order and tell me the weather in San Francisco? The question might seem a bit odd on its own, but our goal is to evaluate how the assistant handles built-in tools such as file search and code interpreter. Specifically, we want to determine if it can effectively combine these built-in tools with user-defined functions in providing an answer. To answer this question, the model needs to first invoke the file search tool to retrieve the first row of data from the first table on the first page of the Wildfire PDF. Then, it generates a Python code snippet for sorting and calls the second tool, the code interpreter, to execute this code. Finally, it calls the third user-defined tool, get_weather, to obtain the weather in San Francisco. Expected Data in Wildfire PDF: As shown below, the model responses as expected. Up to this point, you should understand that the assistant could utilize three different categories of tools to answer user questions. Part 3: Introduction of New Snaps We'll start by focusing on the new elements of the pipeline: two newly introduced snaps and the added attributes to the existing ones, before delving into the overall pipeline details. 1. Tool Call Router (new) The tool call router snap simplifies the assistant's response (the run object) for easier downstream processing. It combines the functionalities of copy , mapper , and JSON splitter . The first output view contains: the original assistant's response an empty list named tool_outputs to collect the results of all function executions in the subsequent message appender snap. The second output view provides a list of tools to call, extracted from the required actions section of the assistant's response 2. Submit Tool Outputs (new) This snap submits a list of function execution results to the assistant. The assistant will then generate the final response or request further tool calls. 3. Create and Run Thread (upgraded) We've added a new section to the Create And Run Thread configuration to specify detailed parameters for tool calls. The Tool choice option allows you to instruct the assistant to: automatically select tools ( AUTO ) use no tools ( NONE ) require at least one tool ( REQUIRED ) use a specific user-defined tool ( SPECIFY A FUNCTION , providing the function name ). The Parallel tool call option determines whether the assistant can call multiple tools simultaneously. 4. Run Thread (upgraded) Same configuration is added to the Run Thread snap as well. Part 4: Hands-on Pipeline Construction Pipeline workflow overview There are a total of 5 pipelines. Driver pipeline : Sends the initial prompt to the assistant. Receives a response containing tool call requests. Passes the response to the "pipeloop" snap to trigger the worker pipelines to execute the tools. Worker pipeline: Executes the function calls specified in the tool call requests. Collects the results of the function calls. Sends the results back to the assistant. This pipeline is executed repeatedly until there are no more tools to call. get_weather pipeline: Takes a city name as input. Queries a weather API to get the current weather for the specified city. Outputs the retrieved weather information. get_wiki_url pipeline: Takes a city name as input. Searches for the Wikipedia page URL for the specified city. Outputs the found URL. get_webpage pipeline: Takes a URL as input Fetch the webpage by visiting the URL Use a model to summarize the content of the webpage Outputs the summary The Driver Pipeline The driver pipeline can be constructed in two ways: either using a combined "create and run" operation or by performing the creation and running steps sequentially. Both methods achieve the same result in this scenario. The Worker Pipeline The get_weather Pipeline You can get a free API key by signing up on Free Weather API - WeatherAPI.com. The get_wiki_url Pipeline The get_webpage Pipeline Get Client: Access the webpage pointed to by the URL and retrieve the HTML content. HTML Parser: Parse the HTML content into text format. Summarize: Generate a user prompt and concatenate it with the webpage text. OpenAI Summarize: Use the model to generate a summary of the webpage content. Input and output of key snaps We'll illustrate the essential inputs and outputs of the intermediate process through a single tool call interaction. 1. Create and Run Thread This snap forwards the user's initial prompt to the assistant and returns a run object. The highlight of this run object is the required action , which outlines the necessary tool calls. Output of Create and Run Thread - a run object 2. Tool Call Router It's important to note that the first output view not only holds the assistant's response but also an empty "tool_outputs" list. This list serves as a container for storing function results as they are gathered in subsequent message appenders. Tool Call Router - 1st output view The second output view extracts the tool calls from the required actions and converts the argument values into JSON format, storing them in json_arguments . This eliminates the need for subsequent argument conversion by each tool. Tool Call Router - 2nd output view 3. Pipeline Execute Snap - Get Weather Function Get Weather Function - Input Get Weather Function - Output The tool's output provides a full HTTP response, however, we're solely interested in the "entity" content which will serve as the tool's output. This extraction will occur in the subsequent snap, "Function Result Generator". 4. Pipeline Execute Snap - Get Wiki URL Function Get Wiki URL Function - Input Get Wiki URL Function - Output The tool's output provides a full HTTP response, however, we're solely interested in the "entity" content which will serve as the tool's output. This extraction will occur in the subsequent snap, "Function Result Generator". 5. Message Appender The Message Appender’s output contains a run object from upstream, however, we're solely interested in the tool_outputs field which is a list of function results. Thus in the subsequent snap, "Submit Tool Outputs", we will only use the tool_outputs field. Message Appender - Output 6. Submit Tool Outputs This snap forwards function results to the assistant and receives a run object as a response. This object can either provide the final answer or dictate subsequent tool calls. In this example, the assistant's output specifies the next tool to be called, as indicated by the "required action". Submit Tool Outputs - Output - subsequent tool calls example In the following example, the assistant outputs the final result. There's an extra message list in the output which contains the result itself as well as the original user prompt. Submit Tool Outputs - Output - final answer example Snap settings This article particularly emphasizes the loop condition settings in the pipeloop . We've configured the loop to terminate when the assistant's response indicates no further tool calls are required (i.e., " required_action " is null). This is because if there's no need for additional tool calls, there's no reason to continue executing the worker using Pipeloop. Edge Case - When no tool call is needed The previous driver pipeline had a limitation: it couldn't handle cases where the model could directly answer the user's query without calling any user-defined functions. This was because the output of Create and Run Thread wouldn't contain the required_action field. Since the pipeloop snap follows a do-while logic, it would always run at least once before checking the stop condition. Consequently, when the assistant didn't require a tool call, submitting the tool call output to the assistant in the worker pipeline would result in an error. The following driver pipeline offers a simple solution to this problem by using a router to bypass the pipeloop for requests that can be answered directly.951Views2likes0CommentsLLM response logging for analytics
Why do we need LLM Observability? GenAI applications are great, they answer like how a human does. But how do you know if GPT isn’t being “too creative” to you when results from the LLM shows “Company finances are facing issues due to insufficient sun coverage”? As the scope of GenAI apps broaden, the vulnerability expands, and since LLM outputs are non-deterministic, a setup that once worked isn’t guaranteed to always work. Here’s an example of comparing the reasons why an LLM prompt fails vs why a RAG application fails. What could go wrong in the configuration? LLM prompts Suboptimal model parameters Temperature too high / tokens too small Uninformative System prompts RAG Indexing The data wasn’t chunked with the right size, information is sparse yet the window is small. Wrong distance was used. Used Euclidean distance instead of cosine Dimension was too small / too large Retrieval Top K too big, too much irrelevant context fetched Top K too small, not enough relevant context to generate result Filter misused And everything in LLM Prompts Although observability does not magically solve all problems, it gives us a good chance to figure out what might have gone wrong. LLM Observability provides methodologies to help developers better understand LLM applications, model performances, biases, and can help resolve issues before they reach the end users. What are common issues and how observability helps? Observability helps understanding in many ways, from performance bottlenecks to error detection, security and debugging. Here’s a list of common questions we might ask ourselves and how observability may come in handy. How long does it take to generate an answer? Monitor LLM response times and database query times helps identify potential bottlenecks of the application. Is the context retrieved from the Vector Database relevant? Logging database query and results retrieved helps identify better performing queries. Can assist on chunk size configuration based on retrieved results. How many tokens are used in a call? Monitor token usage can help determine the cost of each LLM call. How much better/worse is my new configuration setup doing? Parameter monitoring and response logging helps compare the performance of different models and model configurations. How is the GenAI application performing overall? Tracing stages of the application and evaluation helps identify the performance of the application What are users asking? Logging and analyzing user prompts help understand user needs and can help evaluate if optimizations can be introduced to reduce costs. Helps identify security vulnerabilities by monitoring malicious attempts and help proactively respond to mitigate threats. What should be tracked? GenAI applications involve components chained together. Depending on the use case, there are events and input/output parameters that we want to capture and analyze. A list of components to consider: Vector Database metadata Vector dimension: The vector dimension used to in the vector database Distance function: The way two vectors are compared in the vector database Vector Indexing parameters Chunk configuration: How a chunk is configured, including the size of the chunk, the unit of chunks, etc. This affects information density in a chunk. Vector Query parameters Query: The query used to retrieve context from the Vector Database Top K: The maximum number of vectors to retrieve from the Vector Database Prompt templates System prompt: The prompt to be used throughout the application Prompt Template: The template used to construct a prompt. Prompts work differently in different models and LLM providers LLM request metadata Prompt: The input sent to the LLM model from each end-user, combined with the template Model name: The LLM model used for generation, which affects the capability of the application Tokens: The number of tokens limit for a single request Temperature: The parameter for setting the creativity and randomness of the model Top P: The range of selection of words, the smaller the value the narrower the word selection is sampled from. LLM response metadata Tokens: The number of tokens used in input and output generation, affects costs Request details: May include information such as guardrails, id of the request, etc. Execution Metrics Execution time: Time taken to process individual requests Pipeline examples Logging a Chat completions pipeline We're using MongoDB to store model parameters and LLM responses as JSON documents for easy processing. Logging a RAG pipeline In this case, we're storing parameters to the RAG system (Agent Retrieve in this case) and the model. We're using JSON Generator Snaps to parameterize all input parameters to the RAG system and the LLM models. We then concat the response from the Vector Database, LLM model, and the parameters we provided for the requests.1.3KViews3likes1CommentWhat is Retrieval-Augmented Generation (RAG)?
What is Retrieval-Augmented Generation (RAG)? Retrieval-Augmented Generation (RAG) is the process of enhancing the reference data used by language models (LLMs) through integrating them with traditional information retrieval systems. This hybrid approach allows LLMs to access and utilize external knowledge bases, databases, and other authoritative sources of information, thereby improving the accuracy, relevance, and currency of the generated responses without requiring extensive retraining. Without RAG, LLMs generate responses based on the information they were trained on. With RAG, the response generation process is enriched by integrating external information into the generation. How does Retrieval-Augmented Generation work? Retrieval-Augmented Generation works through bringing multiple systems or services to generate the prompt to the LLM. This means there will be required setup to support the different systems and services to feed the appropriate data for a RAG workflow. This involves several key steps: 1. External Data Source Creation: External data refers to information outside the original training data of the LLM. This data can come from a variety of sources such as APIs, databases, document repositories, and web pages. The data is pre-processed and converted into numerical representations (embeddings) using embedding models, and then stored in a searchable vector database along with reference to the data that was used to generate the embedding. This forms a knowledge library that can be used to augment a prompt when calling into the LLM for generation of a response to a given input. 2. Retrieval of Relevant Information: When a user inputs a query, it is embedded into a vector representation and matched against the entries in the vector database. The vector database retrieves the most relevant documents or data based on semantic similarity. For example, a query about company leave policies would retrieve both the general leave policy document and the specific role leave policies. 3. Augmentation of LLM Prompt: The retrieved information is then integrated into the prompt to send to the LLM using prompt engineering techniques. This fully formed prompt is sent to the LLM, providing additional context and relevant data that enables the model to generate more accurate and contextually appropriate responses. 4. Generation of Response: The LLM processes the augmented prompt and generates a response that is coherent, contextually appropriate, and enriched with accurate, up-to-date information. The following diagram illustrates the flow of data when using RAG with LLMs. Why use Retrieval-Augmented Generation? RAG addresses several inherent challenges of using LLMs by leveraging external data sources: 1. Enhanced Accuracy and Relevance: By accessing up-to-date and authoritative information, RAG ensures that the generated responses are accurate, specific, and relevant to the user's query. This is particularly important for applications requiring precise and current information, such as specific company details, release dates and release items, new features available for a product, individual product details, etc.. 2. Cost-Effective Implementation: RAG enables organizations to enhance the performance of LLMs without the need for expensive and time-consuming fine-tuning or custom model training. By incorporating external knowledge libraries, RAG provides a more efficient way to update and expand the model's basis of knowledge. 3. Improved User Trust: With RAG, responses can include citations or references to the original sources of information, increasing transparency and trust. Users can verify the source of the information, which enhances the credibility and trust of an AI system. 4. Greater Developer Control: Developers can easily update and manage the external knowledge sources used by the LLM, allowing for flexible adaptation to changing requirements or specific domain needs. This control includes the ability to restrict sensitive information retrieval and ensure the correctness of generated responses. Doing this in conjunction with an evaluation framework (link to evaluation pipeline article) can help to roll out newer content more rapidly to downstream consumers. Snaplogic GenAI App Builder: Building RAG with Ease Snaplogic GenAI App Builder empowers business users to create large language model (LLM) powered solutions without requiring any coding skills. This tool provides the fastest path to developing generative enterprise applications by leveraging services from industry leaders such as OpenAI, Azure OpenAI, Amazon Bedrock, Anthropic Claude on AWS, and Google Gemini. Users can effortlessly create LLM applications and workflows using this robust platform. With Snaplogic GenAI App Builder, you can construct both an indexing pipeline and a Retrieval-Augmented Generation (RAG) pipeline with minimal effort. Indexing Pipeline This pipeline is designed to store the contents of a PDF file into a knowledge library, making the content readily accessible for future use. Snaps used: File Reader, PDF Parser, Chunker, Amazon Titan Embedder, Mapper, OpenSearch Upsert. After running this pipeline, we would be able to view these vectors in OpenSearch. RAG Pipeline This pipeline enables the creation of a chatbot capable of answering questions based on the information stored in the knowledge library. Snap used: HTTP Router, Amazon Titan Embedder, Mapper, OpenSearch Query, Amazon Bedrock Prompt Generator, Anthropic Claude on AWS Messages. To implement these pipelines, the solution utilizes the Amazon Bedrock Snap Pack and the OpenSearch Snap Pack. However, users have the flexibility to employ other LLM and vector database Snaps to achieve similar functionality.1.3KViews4likes0Comments