Introduction to PipeLoop
We all love the Pipeline Execute Snap: it greatly simplifies a complex pipeline by extracting sections into a sub-pipeline. But sometimes we really want the ability to run a pipeline multiple times to perform operations such as polling an endpoint or performing LLM tool calls. In this article, we will introduce the PipeLoop Snap, which adds iteration to the SnapLogic programming model. With PipeLoop, we can create workflows that were previously hard to manage or even impossible to build.

What is PipeLoop
PipeLoop is a new Snap for iterative execution of a pipeline. For people who are familiar with iteration in programming languages, PipeLoop is essentially a do-while loop for pipelines. The user is required to provide an iteration limit as a hard cutoff to avoid resource depletion or an infinite loop, and an optional stop condition to control the execution. Just as we can pass input documents to PipeExec, we can also pass input documents to PipeLoop. The difference between the two is that the output document of the pipeline executed with PipeLoop is used as the next round of input, continuing the execution until the stop condition is met or the iteration limit is reached. Due to this unique mechanism, the pipeline run by PipeLoop must have one unlinked input and one unlinked output to work properly. To put it simply, PipeLoop can be thought of as a variable-length chain of PipeExec Snaps running the same pipeline, with a condition to exit early.

PipeLoop execution flow
1. Input documents to PipeLoop are passed to the child pipeline for execution.
2. The child pipeline executes.
3. The child pipeline's output is collected.
4. The stop condition is evaluated against the output document. If true, exit and pass the output document to PipeLoop; otherwise continue.
5. Check whether the iteration limit has been reached. If so, exit and pass the output document to PipeLoop; otherwise continue.
6. Use the output document as the next round of input and return to step 1.

PipeLoop execution walkthrough
Let's start with a very simple example. We'll create a workflow using PipeLoop that increments a number from 1 to 3. For simplicity, we will refer to the pipeline that contains PipeLoop as the "Parent pipeline", and the pipeline that is executed by PipeLoop as the "Child pipeline".

Parent pipeline setup
The parent pipeline consists of one JSON Generator Snap with one document as input, and one PipeLoop Snap running the pipeline "child" with the stop condition "$num >= 3". We'll also enable "Debug Iteration output" to see the output of each round in this walkthrough.

Child pipeline setup
The child pipeline consists of a single Mapper Snap that increments "$num" by 1, which satisfies the requirement of "a pipeline with one unlinked input and one unlinked output" for a pipeline to be run by PipeLoop.

Output
The output of PipeLoop consists of two major sections when Debug mode is enabled: the output fields and _iteration_documents. We can see the final output is "num": 3, which means PipeLoop has successfully carried out the task.

PipeLoop features
There are multiple features in PipeLoop that can be helpful when building iterating pipelines. We'll categorize them by where the features are located.

Properties
There are four main sections in the PipeLoop Snap's properties:
Pipeline
Pipeline Parameters
Loop options
Execution Options

Pipeline
The pipeline to be run.

Pipeline Parameters
We'll take a deeper dive into this in the Pipeline Parameters section.

Loop options
Loop options are property settings related to the iteration behavior of this Snap.
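Conceptually, the Stop condition and Iteration limit described below combine into a do-while loop around the child pipeline. The following minimal Python sketch illustrates the loop semantics only (it is not SnapLogic's implementation):

```python
def pipeloop(child_pipeline, input_doc, stop_condition, iteration_limit, debug=False):
    """Run child_pipeline repeatedly, feeding each output back in as the next input."""
    iteration_docs = []
    doc = input_doc
    for _ in range(iteration_limit):          # hard cutoff to avoid an infinite loop
        doc = child_pipeline(doc)             # steps 1-3: execute the child, collect its output
        if debug:
            iteration_docs.append(doc)        # Debug iteration outputs
        if stop_condition(doc):               # step 4: evaluate the stop condition on the output
            break                             # exit early and pass the output downstream
        # steps 5-6: limit not reached, so the output becomes the next round's input
    result = dict(doc)
    if debug:
        result["_iteration_documents"] = iteration_docs
    return result

# The walkthrough example: increment $num until it reaches 3
increment = lambda d: {"num": d["num"] + 1}
print(pipeloop(increment, {"num": 1}, lambda d: d["num"] >= 3, iteration_limit=10))
# -> {'num': 3}
```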
Stop condition
The Stop condition field allows the user to set an expression to be evaluated after the first execution has occurred. If the expression evaluates to true, the iteration stops. The stop condition can also be set to false if the user wishes to use PipeLoop as a traditional for loop. There are cases where the user might pass an unintended value into the Stop condition field. In this scenario, PipeLoop generates a warning when the user provides a non-boolean String as the Stop condition, and the stop condition is treated as false.
Non-boolean Stop condition warning

Iteration limit
The Iteration limit field allows the user to limit the maximum number of iterations that can occur. This field can also be used to limit the total number of executions when the Stop condition is set to false. Setting a large value for the Iteration limit with debug mode on can be dangerous: the accumulated documents can quickly deplete CPU and RAM resources. To prevent this, PipeLoop generates a warning in the Pipeline Validation Statistics tab when the Iteration limit is set to 1000 or more while Debug mode is enabled.
Large iteration limit with debug mode enabled warning

Debug iteration outputs
This toggle field adds the output from the child pipeline for each iteration, along with the stop condition evaluation, to the final output as a separate field.
Output example with Debug iteration outputs enabled

Execution options
Execute On: Specifies where the pipeline execution should take place. Currently, only local executions (local Snaplex, local node) are supported.
Execution Label: We'll take a deeper dive into this in the Monitoring section.

Pipeline Parameters
For users who are familiar with Pipeline Parameters in PipeExec, feel free to skip to the next section, as the instructions are identical.

Introduction to Pipeline Parameters
Before we look at the Pipeline Parameters support in the PipeLoop Snap, let's take a step back and see what pipeline parameters are and how they can be leveraged. Pipeline parameters are String constants that can be defined in the Edit Pipeline Configuration settings. Users can use a parameter as a constant anywhere in the pipeline. One major difference between Pipeline parameters and Pipeline variables is that Pipeline parameters are referenced using an underscore prefix, whereas Pipeline variables are referenced using a dollar sign prefix.
Pipeline Parameters in Edit Pipeline Configuration
Accessing Pipeline Parameters in an expression field

Example
Let's take a look at Pipeline Parameters in action with PipeLoop. Our target here is to print out "Hello PipeLoop!" n times, where n is the value of "num". We'll add two parameters in the child pipeline, param1 and param2. To demonstrate, we assign "value1" to param1 and keep param2 empty. We'll then add a message field with the value "Hello PipeLoop!" in the JSON Generator so that we can assign the String value to param2. Now we're able to use param2 as a constant in the child pipeline. PipeLoop also has field name suggestions built into the Parameter name fields for ease of use.
PipeLoop Pipeline Parameters in action
For our child pipeline, we'll add a new row in the Mapping table to print out "Hello PipeLoop!" repeatedly (followed by a newline character).
One thing to bear in mind is that the order of the Mapping table does not affect the output (the number of "Hello PipeLoop!" lines printed in this case), as the output fields are updated only after the execution of the current iteration is finished.
Child Pipeline configuration for our task
Here's the final result: we can see "Hello PipeLoop!" being printed twice. Mission complete.

Remarks
Pipeline Parameters are String constants that can be set in Edit Pipeline Configuration.
Users can pass a String to Pipeline Parameters defined in the Child pipeline in PipeLoop.
Pipeline Parameters in PipeLoop will override pipeline parameter values previously defined in the Child pipeline if the parameters share the same name.
Pipeline Parameters are constants, which means the values will not be modified during iterations even if the user attempts to change them.

Monitoring
When a Snap in a pipeline is executed, there is no output until the execution is finished. Because an iterating pipeline execution runs as a single Snap, it can therefore be difficult to know where the execution currently is, or which pipeline execution corresponds to which input document. To deal with this, there are two extra features that add more visibility to the PipeLoop execution.

Pipeline Statistics progress bar
During the execution of PipeLoop, a progress bar is available in the Pipeline Validation Statistics tab, so that the user can get an idea of which iteration PipeLoop is currently at. Note that the progress bar might not reflect the actual iteration index if the child pipeline executions are short, due to polling intervals.
PipeLoop iteration progress bar

Execution Label
When a PipeLoop with multiple input documents is executed, the user cannot tell which pipeline execution is linked to which input document in the SnapLogic Monitor. The Execution label is the answer to this problem. The user can pass a value into the Execution label field that differentiates input documents, so that each input document has its own label in the SnapLogic Monitor during execution. Here's an example of two input documents running on the child pipeline. We set the Execution label with the expression "child_label" + $num, so the execution for the first document will have the label "child_label0" and the second execution will have the label "child_label1".
Execution label settings
SnapLogic Monitor View

Summary
In this article, we introduced PipeLoop, a new Snap for iterative execution workflows. The pipeline run by PipeLoop must have one unlinked input and one unlinked output. PipeLoop has the following features:
Pipeline Parameters support
Stop condition to exit early, with warnings
Iteration limit to avoid infinite loops, with warnings
Debug mode
Execution label to differentiate runs in Monitor
Progress bar for status tracking
Happy Building!

Embeddings and Vector Databases
What are embeddings
Embeddings are numerical representations of real-world objects, like text, images, or audio. They are generated by machine learning models as vectors (arrays of numbers), where the distance between vectors can be seen as the degree of similarity between objects. While an embedding model may assign its own meaning to each dimension, there is no guarantee that different embedding models assign the same meaning to those dimensions.
For example, the words "cat", "dog" and "apple" might be embedded into the following vectors:
cat -> (1, -1, 2)
dog -> (1.5, -1.5, 1.8)
apple -> (-1, 2, 0)
These vectors are made up for a simpler example; real vectors are much larger (see the Dimension section for details). Visualizing these vectors as points in a 3D space, we can see that "cat" and "dog" are closer, while "apple" is positioned further away.
Figure 1. Vectors as points in a 3D space
By embedding words and contexts into vectors, we enable systems to assess how related two embedded items are to each other via vector comparison.

Dimension of embeddings
The dimension of embeddings refers to the length of the vector representing the object. In the previous example, we embedded each word into a 3-dimensional vector. However, a 3-dimensional embedding inevitably leads to a massive loss of information. In reality, word embeddings typically require hundreds or thousands of dimensions to capture the nuances of language. For example:
OpenAI's text-embedding-ada-002 model outputs a 1536-dimensional vector
Google Gemini's text-embedding-004 model outputs a 768-dimensional vector
Amazon Titan's amazon.titan-embed-text-v2:0 model outputs a default 1024-dimensional vector
Figure 2. Using text-embedding-ada-002 to embed the sentence "I have a calico cat."
In short, an embedding is a vector that represents a real-world object. The distance between these vectors indicates the similarity between the objects.

Limitation of embedding models
Embedding models are subject to a crucial limitation: the token limit, where a token can be a word, punctuation mark, or subword part. This constraint defines the maximum amount of text a model can process in a single input. For instance, the Amazon Titan Text Embeddings models can handle up to 8,192 tokens. When input text exceeds the limit, the model typically truncates it, discarding the remaining information. This can lead to a loss of context and diminished embedding quality, as crucial details might be omitted. To address this, several strategies can help mitigate the impact:
Text Summarization or Chunking: Long texts can be summarized or divided into smaller, manageable chunks before embedding.
Model Selection: Different embedding models have varying token limits. Choosing a model with a higher limit can accommodate longer inputs.

What is a Vector Database
Vector databases are optimized for storing embeddings, enabling fast retrieval and similarity search. By calculating the similarity between the query vector and the other vectors in the database, the system returns the vectors with the highest similarity, indicating the most relevant content. The following diagram illustrates a vector database search. A query vector 'favorite sport' is compared to a set of stored vectors, each representing a text phrase. The nearest neighbor, 'I like football', is returned as the top result.
Figure 3. Vector Query Example
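To make this kind of nearest-neighbor lookup concrete, here is a small Python sketch of a brute-force search using cosine similarity. The vectors are the made-up 3-dimensional examples from above, not output from a real embedding model:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; closer to 1 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Made-up embeddings from the example above
stored = {
    "cat":   (1, -1, 2),
    "dog":   (1.5, -1.5, 1.8),
    "apple": (-1, 2, 0),
}

query = (1.2, -0.9, 1.9)   # pretend this is the embedding of a query about pets

# Score every stored vector against the query and keep the best match
ranked = sorted(stored.items(), key=lambda kv: cosine_similarity(query, kv[1]), reverse=True)
print(ranked[0][0])   # -> "cat"; "apple" ranks last
```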
Figure 4. Store Vectors into Database
Figure 5. Retrieve Vectors from Database
When working with vector databases, two key parameters come into play: Top K and the similarity measure (or distance function).

Top K
When querying a vector database, the goal is often to retrieve the items most similar to a given query vector. This is where the Top K concept comes into play. Top K refers to retrieving the top K most similar items based on a similarity metric. For instance, if you're building a product recommendation system, you might want to find the top 10 products similar to the one a user is currently viewing. In this case, K would be 10, and the vector database would return the 10 product vectors closest to the query product's vector.

Similarity Measures
To determine the similarity between vectors, various distance metrics are employed, including:
Cosine Similarity: Measures the cosine of the angle between two vectors. It is often used for text-based applications as it captures semantic similarity well. A value closer to 1 indicates higher similarity.
Euclidean Distance: Calculates the straight-line distance between two points in Euclidean space. It is sensitive to magnitude differences between vectors.
Manhattan Distance: Also known as L1 distance, it calculates the sum of the absolute differences between corresponding elements of two vectors. It is less sensitive to outliers than Euclidean distance.
Figure 6. Similarity Measures
There are many other similarity measures not listed here. The choice of distance metric depends on the specific application and the nature of the data; it is recommended to experiment with various similarity metrics to see which one produces better results.

What embedders are supported in SnapLogic
As of October 2024, SnapLogic supports embedders for major models and continues to expand its support. Supported embedders include:
Amazon Titan Embedder
OpenAI Embedder
Azure OpenAI Embedder
Google Gemini Embedder

What vector databases are supported in SnapLogic
Pinecone
OpenSearch
MongoDB
Snowflake
Postgres
AlloyDB

Pipeline examples
Embed a text file
Read the file using the File Reader Snap.
Convert the binary input to a document format using the Binary to Document Snap, as all embedders require document input.
Embed the document using your chosen embedder Snap.
Figure 7. Embed a File
Figure 8. Output of the Embedder Snap

Store a Vector
Utilize the JSON Generator Snap to simulate a document as input, containing the original text to be stored in the vector database.
Vectorize the original text using the embedder Snap.
Employ a Mapper Snap to format the structure into the format required by Pinecone: the vector field is named "values", and the original text and other relevant data are placed in the "metadata" field.
Store the data in the vector database using the vector database's upsert/insert Snap.
Figure 9. Store a Vector into Database
Figure 10. A Vector in the Pinecone Database

Retrieve Vectors
Utilize the JSON Generator Snap to simulate the text to be queried.
Vectorize the original text using the embedder Snap.
Employ a Mapper Snap to format the structure into the format required by Pinecone, naming the query vector "vector".
Retrieve the top 1 vector, which is the nearest neighbor.
Figure 11. Retrieve Vectors from a Database
[ { "content" : "favorite sport" } ]
Figure 12. Query Text
Figure 13. All Vectors in the Database
{
  "matches": [
    {
      "id": "db873b4d-81d9-421c-9718-5a2c2bd9e720",
      "score": 0.547461033,
      "values": [],
      "metadata": {
        "content": "I like football."
      }
    }
  ]
}
Figure 14. Pipeline Output: the Closest Neighbor to the Query
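The store and retrieve pipelines above map naturally onto a vector database's upsert and query operations. The following self-contained Python sketch imitates that flow in memory, using the same "values", "metadata", top_k, and "matches" shapes as the Pinecone-based pipelines; the vectors are made up, and a real pipeline would obtain them from an embedder Snap:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class ToyVectorStore:
    """In-memory stand-in for a vector database with Pinecone-style upsert/query."""
    def __init__(self):
        self.records = {}   # id -> {"values": [...], "metadata": {...}}

    def upsert(self, vectors):
        for v in vectors:
            self.records[v["id"]] = {"values": v["values"], "metadata": v["metadata"]}

    def query(self, vector, top_k=1):
        scored = [
            {"id": rid,
             "score": cosine_similarity(vector, rec["values"]),
             "metadata": rec["metadata"]}
            for rid, rec in self.records.items()
        ]
        scored.sort(key=lambda m: m["score"], reverse=True)
        return {"matches": scored[:top_k]}   # same shape as the pipeline output above

store = ToyVectorStore()
# Made-up vectors; an embedder Snap would normally produce these.
store.upsert(vectors=[
    {"id": "a", "values": [0.9, 0.1, 0.3], "metadata": {"content": "I like football."}},
    {"id": "b", "values": [-0.2, 0.8, 0.5], "metadata": {"content": "Pasta is my favorite food."}},
])
print(store.query(vector=[0.8, 0.0, 0.4], top_k=1))   # top match: "I like football."
```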
Embedders and vector databases are widely used in applications such as Retrieval Augmented Generation (RAG) and building chat assistants.

Multimodal Embeddings
While the focus thus far has been on text embeddings, the concept extends beyond words and sentences. Multimodal embeddings represent a powerful advancement, enabling the representation of various data types, such as images, audio, and video, within a unified vector space. By projecting different modalities into a shared semantic space, complex relationships and interactions between these data types can be explored. For instance, an image of a cat and the word "cat" might be positioned closely together in a multimodal embedding space, reflecting their semantic similarity. This capability opens up a vast array of possibilities, including image search with text queries, video content understanding, and advanced recommendation systems that consider multiple data modalities.

A Comparison of Assistant and Non-Assistant Tool Calling Pipelines
Introduction At a high level, the logic behind assistant tool calling and non-assistant tool calling is fundamentally the same: the model instructs the user to call specific function(s) in order to answer the user's query. The user then executes the function and returns the result to the model, which uses it to generate an answer. This process is identical for both. However, since the assistant specifies the function definitions and access to tools as part of the Assistant configuration within the OpenAI or Azure OpenAI dashboard rather than within your pipelines, there will be major differences in the pipeline configuration. Additionally submitting tool responses to an Assistant comes with significant changes and challenges since the Assistant owns the conversational history rather than the pipeline. This article focuses on contrasting these differences. For a detailed understanding of assistant pipelines and non-assistant pipelines, please refer to the following article: Non-assistant pipelines: Introducing Tool Calling Snaps and LLM Agent Pipelines Assistant pipelines: Introducing Assistant Tool Calling Pipelines Part 1: Which System to Use: Non-Assistant or Assistant? When to Use Non-Assistant Tool Calling Pipelines: Non-Assistant Tool Calling Pipelines offer greater flexibility and control over the tool calling process, making them suitable for the following specific scenarios. When preferring a “run-time“ approach: Non-Assistant pipelines exhibit greater flexibility in function definition, offering a more "runtime" approach. You can dynamically adjust the available functions by simply adding or removing Function Generator snaps within the pipeline. In contrast, Assistant Tool Calling Pipelines necessitate a "design-time" approach. All available functions must be pre-defined within the Assistant configuration, requiring modifications to the Assistant definition in the OpenAI/Azure OpenAI dashboard. When wanting detailed chat history: Non-Assistant pipelines provide a comprehensive history of the interaction between the model and the tools in the output message list. The message list within the Non-Assistant pipeline preserves every model response and the results of each function execution. This detailed logging allows for thorough debugging, analysis, and auditing of the tool calling process. In contrast, Assistant pipelines maintain a more concise message history, focusing on key steps and omitting some intermediate details. While this can simplify the overall view of the message list, it can also make it more difficult to trace the exact sequence of events or diagnose issues that may arise during tool execution in child pipelines. When needing easier debugging and iterative development: Non-Assistant pipelines facilitate more granular debugging and iterative development. You can easily simulate individual steps of the agent by making calls to the model with specific function call histories. This allows for more precise control and experimentation during development, enabling you to isolate and address issues more effectively. For example, by providing three messages, we can "force" the model to call the second tool, allowing us to inspect the tool calling process and its result against our expectations. In contrast, debugging and iterating with Assistant pipelines can be more cumbersome. 
Since Assistants manage the conversation history internally, simulating a specific step often requires replaying the entire interaction from the beginning, potentially through multiple iterations, to reach the desired state. This internal management of history makes it less straightforward to isolate and debug specific parts of the interaction. To simulate calling the third tool, we need to start a new thread from scratch and then call tool1 and tool2, repeating the preceding process; the current thread cannot be reused.

When to Use Assistant Tool Calling Pipelines:
Assistant Tool Calling Pipelines also offer a streamlined approach to integrating LLMs with external tools, prioritizing ease of use and built-in functionalities. Consider using Assistant pipelines in the following situations:
For simplified pipeline design: Assistant pipelines reduce pipeline complexity by eliminating the need for Function Generator snaps. In Non-Assistant pipelines, these snaps are essential for dynamically generating tool definitions within the pipeline itself. With Assistant pipelines, tool definitions are configured beforehand within the Assistant settings in the OpenAI/Azure OpenAI dashboard. This pre-configuration results in shorter, more manageable pipelines, simplifying development and maintenance.
When leveraging built-in tools is required: If your use case requires functionalities like searching external files or executing code, Assistant pipelines offer these capabilities out of the box through their built-in File Search and Code Interpreter tools (see Part 5 for more details). These tools provide a convenient and efficient way to extend the LLM's capabilities without requiring custom implementation within the pipeline.

Part 2: A brief introduction to two pipelines
Non-assistant tool calling pipelines
Key points:
Functions are defined in the worker.
The worker pipeline's Tool Calling snap manages all model interactions.
Function results are collected and sent to the model in the next iteration via the Tool Calling snap.
Assistant tool calling pipelines
Key points:
There is no need to define functions in any pipeline; functions are pre-defined in the assistant.
Two snaps interact with the model: Create and Run Thread, and Submit Tool Outputs.
Function results are collected and sent to the model immediately during the current iteration.

Part 3: Comparison between two pipelines
Here are two primary reasons why the assistant and non-assistant pipelines differ, listed in decreasing order of importance:
Distinct methods of submitting tool results: For non-assistant pipelines, tool results are appended to the message history list and forwarded to the model during the next iteration. Non-assistant pipelines exhibit a "while-loop" behavior: the worker interacts with the model at the beginning of the iteration, and while any tools need to be called, the worker executes those tools. In contrast, for assistants, tool results are sent to a dedicated endpoint designed to handle tool call results within the current iteration. Assistant pipelines operate more like a "do-while loop": the driver initiates the interaction by sending the prompt to the model, and the worker executes the tool(s) first and then interacts with the model at the end of the iteration to deliver the tool results.
Predefined and stored tool definitions for assistants: Unlike non-assistant pipelines, assistants can predefine and store function definitions.
This eliminates the need for the three Function Generator snaps to repeatedly transmit tool definitions to the model with each request. Consequently, the worker pipeline for assistants appears shorter. Due to the aforementioned differences, non-assistant pipelines have only one interaction point with the model, located in the worker. In contrast, assistant pipelines involve two interaction points: the driver sends the initial prompt to the model, while the worker sends tool results back to the model.

Part 4: Differences in snap settings
Stop condition of PipeLoop
A key difference in snap settings lies in the stop condition of the PipeLoop Snap.
Assistant pipeline's stop condition: $run.required_action == null
Non-assistant pipeline's stop condition: $finish_reason != "tool_calls"
Assistant's output
Example when tool calls are required:
Example when tool calls are NOT required:
Non-assistant's output
Example when tool calls are required:
Example when tool calls are NOT required:

Part 5: Assistant's two built-in tools
The assistant not only supports all functions that can be defined in non-assistant pipelines but also provides two special built-in tools, File Search and Code Interpreter, for user convenience. If the model determines that either of these tools is required, it will automatically call and execute the tool within the assistant without requiring manual user intervention. You don't need a tool calling pipeline to experiment with File Search and Code Interpreter; a simple Create and Run Thread snap is sufficient.

File Search
File Search augments the Assistant with knowledge from outside its model, such as proprietary product information or documents provided by your users. OpenAI automatically parses and chunks your documents, creates and stores the embeddings, and uses both vector and keyword search to retrieve relevant content to answer user queries.
Example
Prompt: What is the number of federal fires between 2018 and 2022?
The assistant's response is as below:
The assistant's response is correct, as the answer to the prompt is in the first row of a table on the first page of wildfire_stats.pdf, a document accessible to the assistant via a vector store.
Answer to the prompt:
The file is stored in a vector store used by the assistant:

Code Interpreter
Code Interpreter allows Assistants to write and run Python code in a sandboxed execution environment. This tool can process files with diverse data and formatting, and generate files with data and images of graphs. Code Interpreter allows your Assistant to run code iteratively to solve challenging code and math problems. When your Assistant writes code that fails to run, it can iterate on this code by attempting to run different code until the code execution succeeds.
Example
Prompt: Find the number of federal fires between 2018 and 2022 and use Matplotlib to draw a line chart. (Matplotlib is a Python library for creating plots.)
The assistant's response is as below:
From the response, we can see that the assistant indicated it used File Search to find 5 years of data and then generated an image file. This file can be downloaded from the assistant's dashboard under storage-files. Simply add a file extension like .png to see the image.
Image file generated by assistant:
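Before the summary in Part 6, the two submission styles can be sketched in code. The snippet below is an illustrative Python outline using hypothetical call_model, execute, and submit_tool_outputs helpers (not real SnapLogic or OpenAI APIs); it contrasts the "while-loop" non-assistant flow with the "do-while" assistant flow and mirrors the two stop conditions above:

```python
def non_assistant_agent(messages, call_model, execute):
    """While-loop style: talk to the model first, append tool results to the history."""
    response = call_model(messages)                     # worker starts each iteration here
    while response["finish_reason"] == "tool_calls":    # loop until $finish_reason != "tool_calls"
        for call in response["tool_calls"]:
            result = execute(call)
            messages.append({"role": "tool", "tool_call_id": call["id"], "content": result})
        response = call_model(messages)                 # next iteration re-sends the full history
    return response

def assistant_agent(run, execute, submit_tool_outputs):
    """Do-while style: the driver already started the run; the worker answers tool requests."""
    while run["required_action"] is not None:           # loop until $run.required_action == null
        outputs = [{"tool_call_id": c["id"], "output": execute(c)}
                   for c in run["required_action"]["tool_calls"]]
        # results go to a dedicated endpoint within the current iteration
        run = submit_tool_outputs(run["thread_id"], run["id"], outputs)
    return run
```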
Part 6: Key Differences Summarized
Function Definition
  Non-Assistant Tool Calling Pipelines: Defined within the worker pipeline using Function Generator snaps.
  Assistant Tool Calling Pipelines: Pre-defined and stored within the Assistant configuration in the OpenAI/Azure OpenAI dashboard.
Tool Result Submission
  Non-Assistant Tool Calling Pipelines: Appended to the message history and sent to the model in the next iteration.
  Assistant Tool Calling Pipelines: Sent to a dedicated endpoint within the current iteration.
Model Interaction Points
  Non-Assistant Tool Calling Pipelines: One (in the worker pipeline).
  Assistant Tool Calling Pipelines: Two (the driver sends the initial prompt, the worker sends tool results).
Built-in Tools
  Non-Assistant Tool Calling Pipelines: None.
  Assistant Tool Calling Pipelines: File Search and Code Interpreter.
Pipeline Complexity
  Non-Assistant Tool Calling Pipelines: More complex pipeline structure due to function definition within the pipeline.
  Assistant Tool Calling Pipelines: Simpler pipeline structure as functions are defined externally.

Multimodal Processing in LLM
Multimodal processing in Generative AI represents a transformative leap in how AI systems extract and synthesize information from multiple data types—such as text, images, audio, and video—simultaneously. Unlike traditional single-modality AI models, which focus on one type of data, Multimodal systems integrate and process diverse data streams in parallel, creating a holistic understanding of complex scenarios. This integrated approach is critical for applications that require not just isolated insights from one modality, but a coherent synthesis across different data sources, leading to outputs that are contextually richer and more accurate. Generative AI, with multimodal processing, is redefining text extraction, surpassing traditional OCR by interpreting text within its visual and contextual environment. Unlike OCR, which only converts images to text, generative AI analyzes the surrounding image context, layout, and meaning, enhancing accuracy and depth. For instance, in complex documents, it can differentiate between headings, body text, and annotations, structuring information more intelligently. Additionally, it excels in low-quality or multilingual texts, making it invaluable in industries requiring precision and nuanced interpretation. In video analysis, a generative AI equipped with Multimodal processing can simultaneously interpret the visual elements of a scene, the audio (such as dialogue or background sounds), and any associated text (like subtitles or metadata). This allows the AI to produce a description or summary of the scene that is far more nuanced than what could be achieved by analyzing the video or audio alone. The interplay between these modalities ensures that the generated description reflects not only the visual and auditory content but also the deeper context and meaning derived from their combination. In tasks such as image captioning, Multimodal AI systems go beyond simply recognizing objects in a photo. They can interpret the semantic relationship between the image and accompanying text, enhancing the relevance and specificity of the generated captions. This capability is particularly useful in fields where the context provided by one modality significantly influences the interpretation of another, such as in journalism, where images and written reports must align meaningfully, or in education, where visual aids are integrated with instructional text. Multimodal processing enables AI to synthesize medical images (such as X-rays or MRIs) with patient history, clinical notes, and even live doctor-patient interactions in highly specialized applications like medical diagnostics. This comprehensive analysis allows the AI to provide more accurate diagnoses and treatment recommendations, addressing the complex interplay of symptoms, historical data, and visual diagnostics. Similarly, in customer service, Multimodal AI systems can improve communication quality by analyzing the textual content of a customer's inquiry and the tone and sentiment of their voice, leading to more empathetic and effective responses. Beyond individual use cases, Multimodal processing plays a crucial role in improving the learning and generalization capabilities of AI models. By training on a broader spectrum of data types, AI systems develop more robust, flexible models that can adapt to a wider variety of tasks and scenarios. This is especially important in real-world environments where data is often heterogeneous and requires cross-modal understanding to interpret fully. 
As Multimodal processing technologies continue to advance, they promise to unlock new capabilities across diverse sectors. In entertainment, Multimodal AI could enhance interactive media experiences by seamlessly integrating voice, visuals, and narrative elements. In education, it could revolutionize personalized learning by adapting content delivery to different sensory inputs. In healthcare, the fusion of Multimodal data could lead to breakthroughs in precision medicine. Ultimately, the ability to understand and generate contextually rich, Multimodal content positions Generative AI as a cornerstone technology in the next wave of AI-driven innovation. Multimodal Content Generator Snap The Multimodal Content Generator Snap encodes file or document inputs into the Snap's multimodal content format, preparing it for seamless integration. The output from this Snap must be connected to the Prompt Generator Snap to complete and format the message payload for further processing. This streamlined setup enables efficient multimodal content handling within the Snap ecosystem. The Snap Properties Type - Select the type of multimodal content. Content Type - Define the specific content type for data transmitted to the LLM. Content - Specify the content path to the multimodal content data for processing. Document Name - Name the document for reference and identification purposes. Aggregate Input - Enable this option to combine all inputs into a single content. Encode Base64 - Enable this option to convert the text input into Base64 encoding. Note: The Content property appears only if the input view is of the document type. The value assigned to Content must be in Base64 format for document inputs, while Snap will automatically use binary as content for binary input types. The Document Name can be set specifically for multimodal document types. The Encode Base64 property encodes text input into Base64 by default. If unchecked, the content will be passed through without encoding. Designing a Multimodal Prompt Workflow In this process, we will integrate multiple Snaps to create a seamless workflow for multimodal content generation and prompt delivery. By connecting the Multimodal Content Generator Snap to the Prompt Generator Snap, we configure it to handle multimodal content. The finalized message payload will then be sent to Claude by Anthropic Claude on AWS Messages. Steps: 1. Add the File Reader Snap: Drag and drop the File Reader Snap onto the designer canvas. Configure the File Reader Snap by accessing its settings panel, then select a file containing images (e.g., a PDF file). Download the sample image files at the bottom of this post if you have not already. Sample image file (Japan_flowers.jpg) 2. Add the Multimodal Content Generator Snap: Drag and drop the Multimodal Content Generator Snap onto the designer and connect it to the File Reader Snap. Open its settings panel, select the file type, and specify the appropriate content type. Here's a refined description of the output attributes from the Multimodal Content Generator: sl_content: Contains the actual content encoded in Base64 format. sl_contentType: Indicates the content type of the data. This is either selected from the configuration or, if the input is a binary, it extracts the contentType from the binary header. sl_type: Specifies the content type as defined in the Snap settings; in this case, it will display "image." 3. 
Add the Prompt Generator Snap: Add the Prompt Generator Snap to the designer and link it to the Multimodal Content Generator Snap. In the settings panel, enable the Advanced Prompt Output checkbox and configure the Content property to use the input from the Multimodal Content Generator Snap. Click “Edit Prompt” and input your instructions 4. Add and Configure the LLM Snap: Add the Anthropic Claude on AWS Message API Snap as the LLM. Connect this Snap to the Prompt Generator Snap. In the settings, select a model that supports multimodal content. Enable the Use Message Payload checkbox and input the message payload in the Message Payload field. 5. Verify the Result: Review the output from the LLM Snap to ensure the multimodal content has been processed correctly. Validate that the generated response aligns with the expected content and format requirements. If adjustments are needed, revisit the settings in previous Snaps to refine the configuration. Multimodal Models for Advanced Data Extraction Multimodal models are redefining data extraction by advancing beyond traditional OCR capabilities. Unlike OCR, which primarily converts images to text, these models directly analyze and interpret content within PDFs and images, capturing complex contextual information such as layout, formatting, and semantic relationships that OCR alone cannot achieve. By understanding both textual and visual structures, multimodal AI can manage intricate documents, including tables, forms, and embedded graphics, without requiring separate OCR processes. This approach not only enhances accuracy but also optimizes workflows by reducing dependency on traditional OCR tools. In today’s data-rich environment, information is often presented in varied formats, making the ability to analyze and derive insights from diverse data sources essential. Imagine managing a collection of invoices saved as PDFs or photos from scanners and smartphones, where a streamlined approach is needed to interpret their contents. Multimodal large language models (LLMs) excel in these scenarios, enabling seamless extraction of information across file types. These models support tasks such as automatically identifying key details, generating comprehensive summaries, and analyzing trends within invoices whether from scanned documents or images. Here’s a step-by-step guide to implementing this functionality within SnapLogic. Sample invoice files (download the files at the bottom of this post if you have not already) Invoice1.pdf Invoice2.pdf Invoice3.jpeg (Sometimes, the invoice image might be tilted) Upload the invoice files Open Manager page and go to your project that will be used to store the pipelines and related files Click the + (plus) sign and select File The Upload File dialog pops up. Click “Choose Files” to select all the invoice files both PDF and image formats (download the sample invoice files at the bottom of this post if you have not already) Click Upload button and the uploaded files will be shown. Building the pipeline Add the JSON Generator Snap: Drag and drop the JSON Generator onto the designer canvas. Click on the Snap to open settings, then click the "Edit JSON" button Highlight all the text from the template and delete it. Paste all invoice filenames in the format below. The editor should look like this. Click "OK" in the lower-right corner to save the prompt Save the settings and close the Snap Add the File Reader Snap: Drag and drop the File Reader Snap onto the designer canvas Click the Snap to open the configuration panel. 
Connect the Snap to the JSON Generator Snap by following these steps:
Select the Views tab.
Click the plus (+) button on the Input pane to add the input view (input0).
Save the configuration. The Snap on the canvas will now have the input view; connect it to the JSON Generator Snap.
In the configuration panel, select the Settings tab.
Set the File field by enabling expressions (click the equal sign in front of the text input) and set it to $filename to read all the files we specified in the JSON Generator Snap.
Validate the pipeline to see the File Reader output. Fields that will be used in the Multimodal Content Generator Snap:
content-type shows the file content type.
content-location shows the file path and will be used in the document name.
Add the Multimodal Content Generator Snap:
Drag and drop the Multimodal Content Generator Snap onto the designer canvas and connect it to the File Reader Snap.
Click the Snap to open the settings panel and configure the following fields:
Type: enable the expression and set the value to $['content-location'].endsWith('.pdf') ? 'document' : 'image'
Document name: enable the expression and set the value to $['content-location'].snakeCase()
Use the snake-case version of the file path as the document name to identify each file and make it compatible with the Amazon Bedrock Converse API. In snake case, words are lowercase and separated by underscores (_).
Aggregate input: check the checkbox. Use this option to combine all input files into a single document.
The settings should now look like the following.
Validate the pipeline to see the Multimodal Content Generator Snap output. The preview output should look like the image below. The sl_type will be "document" for the PDF files and "image" for the image file, and the name will be the simplified file path.
Add the Prompt Generator Snap:
Drag and drop the Prompt Generator Snap onto the designer canvas and connect it to the Multimodal Content Generator Snap.
Click the Snap to open the settings panel and configure the following fields:
Enable the Advanced Prompt Output checkbox.
Set the Content to $content to use the content input from the Multimodal Content Generator Snap.
Click "Edit Prompt" and input your instructions. For example: Based on the total quantity across all invoices, which product has the highest and lowest purchase quantities, and in which invoices are these details found?
Add and Configure the LLM Snap:
Add the Amazon Bedrock Converse API Snap as the LLM.
Connect this Snap to the Prompt Generator Snap.
Click the Snap to open the configuration panel.
Select the Account tab and select your account.
Select the Settings tab.
Select a model that supports multimodal content.
Enable the Use Message Payload checkbox.
Set the Message Payload to $messages to use the message from the Prompt Generator Snap.
Verify the result:
Validate the pipeline and open the preview of the Amazon Bedrock Converse API Snap. The result should look like the following.
In this example, the LLM successfully processes invoices in both PDF and image formats, demonstrating its ability to handle diverse inputs in a single workflow. By extracting and analyzing data across these formats, the LLM provides accurate responses and insights, showcasing the efficiency and flexibility of multimodal processing. You can adjust the queries in the Prompt Generator Snap to explore different results.
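For readers curious what the assembled multimodal payload conceptually contains, here is a hedged Python sketch of an Anthropic-style messages structure mixing base64-encoded document and image content with a text instruction. It mirrors the kind of structure the Multimodal Content Generator and Prompt Generator Snaps build, but it is an illustration rather than the Snaps' exact output, and the local file paths are assumed copies of the sample invoices:

```python
import base64

def to_base64(path):
    """Read a local file and return its contents as a base64 string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

messages = [{
    "role": "user",
    "content": [
        {"type": "document",
         "source": {"type": "base64", "media_type": "application/pdf",
                    "data": to_base64("Invoice1.pdf")}},
        {"type": "image",
         "source": {"type": "base64", "media_type": "image/jpeg",
                    "data": to_base64("Invoice3.jpeg")}},
        {"type": "text",
         "text": "Based on the total quantity across all invoices, which product has "
                 "the highest and lowest purchase quantities?"},
    ],
}]
# "messages" would then be sent to a multimodal model, for example via the
# Amazon Bedrock Converse API Snap or an Anthropic Messages call.
```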
What is Retrieval-Augmented Generation (RAG)?
Retrieval-Augmented Generation (RAG) is the process of enhancing the reference data used by language models (LLMs) by integrating them with traditional information retrieval systems. This hybrid approach allows LLMs to access and utilize external knowledge bases, databases, and other authoritative sources of information, thereby improving the accuracy, relevance, and currency of the generated responses without requiring extensive retraining. Without RAG, LLMs generate responses based only on the information they were trained on. With RAG, the response generation process is enriched by integrating external information into the generation.

How does Retrieval-Augmented Generation work?
Retrieval-Augmented Generation works by bringing together multiple systems or services to generate the prompt sent to the LLM. This means some setup is required to support the different systems and services that feed the appropriate data into a RAG workflow. This involves several key steps:
1. External Data Source Creation: External data refers to information outside the original training data of the LLM. This data can come from a variety of sources such as APIs, databases, document repositories, and web pages. The data is pre-processed and converted into numerical representations (embeddings) using embedding models, and then stored in a searchable vector database along with a reference to the data that was used to generate the embedding. This forms a knowledge library that can be used to augment a prompt when calling the LLM to generate a response to a given input.
2. Retrieval of Relevant Information: When a user inputs a query, it is embedded into a vector representation and matched against the entries in the vector database. The vector database retrieves the most relevant documents or data based on semantic similarity. For example, a query about company leave policies would retrieve both the general leave policy document and the specific role leave policies.
3. Augmentation of LLM Prompt: The retrieved information is then integrated into the prompt sent to the LLM using prompt engineering techniques. This fully formed prompt is sent to the LLM, providing additional context and relevant data that enables the model to generate more accurate and contextually appropriate responses.
4. Generation of Response: The LLM processes the augmented prompt and generates a response that is coherent, contextually appropriate, and enriched with accurate, up-to-date information.
The following diagram illustrates the flow of data when using RAG with LLMs.
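The retrieve-augment-generate flow can also be summarized in a few lines of Python. This sketch assumes generic embed, vector_search, and generate helpers as placeholders for whatever embedder, vector database, and LLM you use, so treat it as the shape of the flow rather than a specific implementation:

```python
def answer_with_rag(question, embed, vector_search, generate, top_k=3):
    """Minimal RAG flow: embed the query, retrieve context, augment the prompt, generate."""
    query_vector = embed(question)                       # step 2: vectorize the user query
    matches = vector_search(query_vector, top_k=top_k)   # step 2: nearest neighbors from the knowledge library
    context = "\n\n".join(m["metadata"]["content"] for m in matches)
    prompt = (                                           # step 3: augment the prompt
        "Answer the question using only the context below, and cite the context you used.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)                              # step 4: the LLM generates the final response
```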
Why use Retrieval-Augmented Generation?
RAG addresses several inherent challenges of using LLMs by leveraging external data sources:
1. Enhanced Accuracy and Relevance: By accessing up-to-date and authoritative information, RAG ensures that the generated responses are accurate, specific, and relevant to the user's query. This is particularly important for applications requiring precise and current information, such as specific company details, release dates and release items, new features available for a product, individual product details, and so on.
2. Cost-Effective Implementation: RAG enables organizations to enhance the performance of LLMs without the need for expensive and time-consuming fine-tuning or custom model training. By incorporating external knowledge libraries, RAG provides a more efficient way to update and expand the model's basis of knowledge.
3. Improved User Trust: With RAG, responses can include citations or references to the original sources of information, increasing transparency. Users can verify the source of the information, which enhances the credibility and trustworthiness of an AI system.
4. Greater Developer Control: Developers can easily update and manage the external knowledge sources used by the LLM, allowing for flexible adaptation to changing requirements or specific domain needs. This control includes the ability to restrict sensitive information retrieval and ensure the correctness of generated responses. Doing this in conjunction with an evaluation framework (see the evaluation pipeline article) can help roll out newer content more rapidly to downstream consumers.

SnapLogic GenAI App Builder: Building RAG with Ease
SnapLogic GenAI App Builder empowers business users to create large language model (LLM) powered solutions without requiring any coding skills. This tool provides the fastest path to developing generative enterprise applications by leveraging services from industry leaders such as OpenAI, Azure OpenAI, Amazon Bedrock, Anthropic Claude on AWS, and Google Gemini. Users can effortlessly create LLM applications and workflows using this robust platform. With SnapLogic GenAI App Builder, you can construct both an indexing pipeline and a Retrieval-Augmented Generation (RAG) pipeline with minimal effort.

Indexing Pipeline
This pipeline is designed to store the contents of a PDF file in a knowledge library, making the content readily accessible for future use.
Snaps used: File Reader, PDF Parser, Chunker, Amazon Titan Embedder, Mapper, OpenSearch Upsert.
After running this pipeline, we are able to view these vectors in OpenSearch.

RAG Pipeline
This pipeline enables the creation of a chatbot capable of answering questions based on the information stored in the knowledge library.
Snaps used: HTTP Router, Amazon Titan Embedder, Mapper, OpenSearch Query, Amazon Bedrock Prompt Generator, Anthropic Claude on AWS Messages.
To implement these pipelines, the solution utilizes the Amazon Bedrock Snap Pack and the OpenSearch Snap Pack. However, users have the flexibility to employ other LLM and vector database Snaps to achieve similar functionality.

Guide for Advanced GenAI App Patterns
In the rapidly evolving field of Generative AI (GenAI), foundational knowledge can take you far, but it's the mastery of advanced patterns that truly empowers you to build sophisticated, scalable, and efficient applications. As the complexity of AI-driven tasks grows, so does the need for robust strategies that can handle diverse scenarios—from maintaining context in multi-turn conversations to dynamically generating content based on user inputs. This guide delves into these advanced patterns, offering a deep dive into the strategies that can elevate your GenAI applications. Whether you're an admin seeking to optimize your AI systems or a developer aiming to push the boundaries of what's possible, understanding and implementing these patterns will enable you to manage and solve complex challenges with confidence. 1. Advanced Prompt Engineering 1.1 Comprehensive Control of Response Format In GenAI applications, controlling the output format is crucial for ensuring that responses align with specific user requirements. Advanced prompt engineering allows you to craft prompts that provide precise instructions on how the AI should structure its output. This approach not only improves the consistency of responses but also makes them more aligned with the desired objectives. For instance, you can design prompts with a detailed structure that includes multiple elements such as Context, Objective, Style, Audience, and desired Response Length. This method allows for granular control over the output. A sample prompt might look like this: Context: Provide background information on the topic to set the stage. Objective: Clearly define the purpose of the response. Style: Specify whether the response should be formal, informal, technical, or creative. Audience: Identify the target audience, which influences the language and depth of explanation. Response format: Instruct the AI to generate a response that takes approximately 3 minutes to read, ensuring depth and comprehensiveness, typically spanning 4-5 paragraphs. This level of detail in prompt engineering ensures that the AI-generated content meets specific needs, making it suitable for various use cases, such as generating educational material, detailed reports, or customer communications. 1.2 Few-Shot Learning Few-shot learning is an advanced technique where the AI model is provided with a small number of examples (often just a few) within the prompt to guide its output. This method is particularly powerful when dealing with tasks that require the model to understand and replicate complex patterns or formats with minimal input. By incorporating examples directly into the prompt, you can train the model to produce more accurate and contextually appropriate responses even in scenarios where large amounts of training data are not available. This approach is invaluable in customizing the AI's behavior for niche applications where generalized training data may not suffice. Example Prompt: "Generate a product description for the following items. The description should be concise, highlight key features, and appeal to a target audience interested in quality and innovation. Example 1: Product Name: SmartHome WiFi Thermostat Description: "The SmartHome WiFi Thermostat is your ultimate solution for energy-efficient temperature control. With seamless integration into your smart home ecosystem, it offers intuitive controls, customizable settings, and real-time energy usage reports. 
Designed for modern living, it ensures comfort while helping you save on energy bills." Example 2: Product Name: EcoBrew Stainless Steel Coffee Maker Description: "The EcoBrew Stainless Steel Coffee Maker combines sustainability with top-tier performance. Made from durable stainless steel, this eco-friendly coffee maker brews the perfect cup every time, with minimal environmental impact. Ideal for the conscious coffee lover, it offers precision brewing and a sleek design that complements any kitchen." New Item: Product Name: UltraLight Travel Backpack Description: AI-Generated Output: "UltraLight Travel Backpack: The UltraLight Travel Backpack is engineered for the modern adventurer. Weighing just under a pound, it offers unmatched portability without sacrificing storage capacity. Featuring water-resistant fabric, multiple compartments, and ergonomic shoulder straps, this backpack is designed for those who need durability and convenience on the go. Perfect for day hikes, weekend getaways, or everyday use, the UltraLight Travel Backpack is your reliable companion on any journey." Explanation: In this example, the AI was given two product descriptions that demonstrate the desired style, tone, and structure of the output. When asked to generate a new description for the "UltraLight Travel Backpack," the AI used the patterns from the provided examples to create a similar, contextually appropriate product description. Despite only being shown two examples, the model effectively captured the key elements needed to generate a new, high-quality output. This approach is highly beneficial when you need the AI to produce consistent results across similar tasks, especially in scenarios where creating extensive training data is impractical. By providing just a few examples, you guide the AI's understanding, allowing it to apply the learned pattern to new, but related, tasks. 1.3 Chain of Thought The chain of thought patterns encourages the AI to generate responses that follow a logical sequence, mirroring human reasoning. This technique is particularly useful in complex scenarios where the AI needs to make decisions, solve problems, or explain concepts step-by-step. By structuring prompts that lead the AI through a series of thought processes, you can guide it to produce more coherent and rational outputs. This is especially effective in applications requiring detailed explanations, such as scientific reasoning, technical problem-solving, or any situation where the AI needs to justify its conclusions.For instance, a prompt might instruct the AI to break down a complex problem into smaller, manageable parts and tackle each one sequentially. The AI would first identify the key components of the problem, then work through each one, explaining its reasoning at each step. This method not only enhances the clarity of the response but also improves the accuracy and relevance of the AI’s conclusions. 2. Multi-modal Processing Multi-modal processing in Generative AI is a cutting-edge approach that allows AI systems to integrate and process multiple types of data—such as text, images, audio, and video—simultaneously. This capability is crucial for applications that require a deep understanding of content across different modalities, leading to more accurate and contextually enriched outputs. 
For instance, in a scenario where an AI is tasked with generating a description of a scene from a video, multi-modal processing enables it to analyze both the visual elements and the accompanying audio to produce a description that reflects not just what is seen but also the context provided by sound. Similarly, when processing text and images together, such as in a captioning task, the AI can better understand the relationship between the words and the visual content, leading to more precise and relevant captions. This advanced pattern is particularly beneficial in complex environments where understanding the nuances across different data types is key to delivering high-quality outputs. For example, in medical diagnostics, AI systems using multi-modal processing can analyze medical images alongside patient records and spoken notes to offer more accurate diagnoses. In customer service, AI can interpret and respond to customer queries by simultaneously analyzing text and voice tone, improving the quality of interactions. Moreover, multi-modal processing enhances the AI's ability to learn from varied data sources, allowing it to build more robust models that generalize better across different tasks. This makes it an essential tool in the development of AI applications that need to operate in real-world scenarios where data is rarely homogeneous. By leveraging multi-modal processing, AI systems can generate responses that are not only more comprehensive but also tailored to the specific needs of the task at hand, making them highly effective in a wide range of applications. As this technology continues to evolve, it promises to unlock new possibilities in fields as diverse as entertainment, education, healthcare, and beyond. Example In many situations, data may include both images and text that need to be analyzed together to gain comprehensive insights. To effectively process and integrate these different data types, you can utilize a multi-modal processing pipeline in SnapLogic. This approach allows the Generative AI model to simultaneously analyze data from both sources, maintaining the integrity of each modality. This pipeline is composed of two distinct stages. The first stage focuses on extracting images from the source data and converting them into base64 format. The second stage involves generating a prompt using advanced prompt engineering techniques, which is then fed into the Large Language Model (LLM). The visual representation of this process is divided into two parts, as shown in the picture above. Extract the image from the source Add the File Reader Snap: Drag and drop the “File Reader” Snap onto the designer. Configure the File Reader Snap: Click on the “File Reader” Snap to access its settings panel. Then, select a file that contains images. In this case, we select a pdf file. Add the PDF Parser Snap: Drag and drop the “PDF Parser” Snap onto the designer and set the parser type to be “Pages to images converter” Configure views: Click on the “Views” tab and then select the output to be “Binary”. Convert to Base64: Add and connect “Binary to Document” snap to the PDF Parser snap. Then, configure the encoding to ENCODE_BASE64. Construct the prompt and send it to the GenAI Add a JSON Generator Snap: Drag the JSON Generator Snap and connect it to the preceding Mapper Snap. Then, click “Edit JSON” to modify the JSON string in the JSON editor mode. AWS Claude on Message allows you to send images via the prompt by configuring the source attribute within the content. 
You can construct the image prompt as demonstrated in the screenshot. Provide instructions with the Prompt Generator: Add the Prompt Generator Snap and connect it to the JSON Generator Snap. Next, select the “Advanced Prompt Output” checkbox to enable the advanced prompt payload. Finally, click “Edit Prompt” to enter your specific instructions. The advanced prompt output will be structured as an array of messages, as illustrated in the screenshot below. Send to GenAI: Add the Anthropic Claude on AWS Messages Snap and enter your credentials to access the AWS Bedrock service. Ensure that the “Use Message Payload” checkbox is selected, and then configure the message payload using $messages, which is the output from the previous Snap. After completing these steps, you can process the image using the LLM independently. This approach allows the LLM to focus on extracting detailed information from the image. Once the image has been processed, you can then combine this data with other sources, such as text or structured data, to generate a more comprehensive and accurate analysis. This multi-modal integration ensures that the insights derived from different data types are effectively synthesized, leading to richer and more precise results. 3. Semantic Caching To optimize both the cost and response time associated with using Large Language Models (LLMs), implementing a semantic caching mechanism is a highly effective strategy. Semantic caching involves storing responses generated by the model and reusing them when the system encounters queries with the same or similar meanings. This approach not only enhances the overall efficiency of the system but also significantly reduces the operational costs tied to model usage. The fundamental principle behind semantic caching is that many user queries are often semantically similar, even if they are phrased differently. By identifying and caching the responses to these semantically equivalent queries, the system can bypass the need to repeatedly invoke the LLM, which is resource-intensive. Instead, the system can quickly retrieve and return the cached response, leading to faster response times and a more seamless user experience. From a cost perspective, semantic caching directly translates into savings. Each time the system serves a response from the cache rather than querying the LLM, it avoids the computational expense associated with generating a new response. This reduction in the number of LLM invocations directly correlates with lower service costs, making the solution more economically viable, particularly in environments with high query volumes. Additionally, semantic caching contributes to system scalability. As the demand on the LLM grows, the caching mechanism helps manage the load more effectively, ensuring that response times remain consistent even as the system scales. This is crucial for maintaining the quality of service, especially in real-time applications where latency is a critical factor. Implementing semantic caching as part of the LLM deployment strategy offers a dual benefit: optimizing response times for end-users and minimizing the operational costs of model usage. This approach not only enhances the performance and scalability of AI-driven systems but also ensures that they remain cost-effective and responsive as user demand increases. Implementation Concept for Semantic Caching Semantic caching is a strategic approach designed to optimize both response time and computational efficiency in AI-driven systems.
The implementation of semantic caching involves the following key steps: Query Submission and Vectorization: When a user submits a query, the system first processes this input by converting it into an embedding—a vectorized representation of the query. This embedding captures the semantic meaning of the query, enabling efficient comparison with previously stored data. Cache Lookup and Matching: The system then performs a lookup in the vector cache, which contains embeddings of previous queries along with their corresponding responses. During this lookup, the system searches for an existing embedding that closely matches the new query's embedding. Matching Threshold: A critical component of this process is the match threshold, which can be adjusted to control the sensitivity of the matching algorithm. This threshold determines how closely the new query needs to align with a stored embedding for the cache to consider it a match. Cache Hit and Response Retrieval: If the system identifies a match within the defined threshold, it retrieves the corresponding response from the cache. This "cache hit" allows the system to deliver the response to the user rapidly, bypassing the need for further processing. By serving responses directly from the cache, the system conserves computational resources and reduces response times. Cache Miss and LLM Processing: In cases where no suitable match is found in the cache—a "cache miss"—the system forwards the query to the Large Language Model (LLM). The LLM processes the query and generates a new response, ensuring that the user receives a relevant and accurate answer even for novel queries. Response Storage and Cache Management: After the LLM generates a new response, the system not only delivers this response to the user but also stores the response along with its associated query embedding back into the vector cache. This step ensures that if a similar query is submitted in the future, the system can serve the response directly from the cache, further optimizing the system’s efficiency. Time-to-Live (TTL) Adjustment: To maintain the relevance and accuracy of cached responses, the system can adjust the Time-to-Live (TTL) for each entry in the cache. The TTL determines how long a response remains valid in the cache before it is considered outdated and automatically removed. By fine-tuning the TTL settings, the system ensures that only up-to-date and contextually appropriate responses are served, thereby preventing the use of stale or irrelevant data. Implement Semantic Caching in SnapLogic The concept of semantic caching can be effectively implemented within SnapLogic, leveraging its robust pipeline capabilities. Below is an outline of how this implementation can be achieved: Embedding the Query: The process begins with the embedding of the user’s query (prompt). Using SnapLogic's capabilities, an embedder, such as the Amazon Titan Embedder, is employed to convert the prompt into a vectorized representation. This embedding captures the semantic meaning of the prompt, making it suitable for comparison with previously stored embeddings. Vector Cache Lookup: Once the prompt has been embedded, the system proceeds to search for a matching entry in the vector cache. In this implementation, the Snowflake Vector Database serves as the vector cache, storing embeddings of past queries along with their corresponding responses. This lookup is crucial for determining whether a similar query has been processed before.
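The Router Snap described next branches on the result of this lookup. Purely as a rough, non-SnapLogic sketch of the same hit/miss logic under stated assumptions (cosine similarity over embeddings, an in-memory cache, placeholder embed_query and call_llm functions, and illustrative threshold and TTL values), the flow can be expressed as:

```python
import math
import time

SIMILARITY_THRESHOLD = 0.9   # assumption: tune per use case
TTL_SECONDS = 3600           # assumption: cached entries expire after one hour

cache = []  # each entry: {"embedding": [...], "response": str, "created": float}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def answer(query, embed_query, call_llm):
    """embed_query and call_llm are placeholders for the embedder and LLM calls."""
    query_vec = embed_query(query)
    now = time.time()
    # Enforce the TTL, then look for the closest cached embedding.
    cache[:] = [e for e in cache if now - e["created"] < TTL_SECONDS]
    best = max(cache, key=lambda e: cosine_similarity(query_vec, e["embedding"]), default=None)
    if best and cosine_similarity(query_vec, best["embedding"]) >= SIMILARITY_THRESHOLD:
        return best["response"]                      # cache hit
    response = call_llm(query)                       # cache miss: invoke the LLM
    cache.append({"embedding": query_vec, "response": response, "created": now})
    return response
```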
Flow Routing with Router Snap: After the lookup, the system uses a Router Snap to manage the flow based on whether a match (cache hit) is found or not (cache miss). The Router Snap directs the workflow as follows: Cache Hit: If a matching embedding is found in the vector cache, the Router Snap routes the process to immediately return the cached response to the user. This ensures rapid response times by avoiding unnecessary processing. Cache Miss: If no match is found, the Router Snap directs the workflow to request a new response from the Large Language Model (LLM). The LLM processes the prompt and generates a new, relevant response. Storing and Responding: In the event of a cache miss, after the LLM generates a new response, the system not only sends this response to the user but also stores the new embedding and response in the Snowflake Vector Database for future use. This step enhances the efficiency of subsequent queries, as similar prompts can be handled directly from the cache. 4. Multiplexing AI Agents Multiplexing AI agents refers to a strategy where multiple generative AI models, each specialized in a specific task, are utilized in parallel to address complex queries. This approach is akin to assembling a panel of experts, where each agent contributes its expertise to provide a comprehensive solution. Here are the key features of using multiplexing AI agents. Specialization: A central advantage of multiplexing AI agents is the specialization of each agent in handling specific tasks or domains. Multiplexing ensures that responses are more relevant and accurate by assigning each AI model to a particular area of expertise. For example, one agent might be optimized for natural language understanding, another for technical problem-solving, and a third for summarizing complex data. This allows the system to handle multi-dimensional queries effectively, as each agent focuses on what it does best. This specialization significantly reduces the likelihood of errors or irrelevant responses, as the AI agents are tailored to their specific tasks. In scenarios where a query spans multiple domains—such as asking a technical question with a business aspect—the system can route different parts of the query to the appropriate agent. This structured approach allows for extracting more relevant and accurate information, leading to a solution that addresses all facets of the problem. Parallel Processing: Multiplexing AI agents takes full advantage of parallel processing capabilities. By running multiple agents simultaneously, the system can tackle different aspects of a query at the same time, speeding up the overall response time. This parallel approach enhances both performance and scalability, as the workload is distributed among multiple agents rather than relying on a single model to process the entire task. For example, in a customer support application, one agent could handle the analysis of a customer’s previous interactions while another agent generates a response to a technical issue, and yet another creates a follow-up action plan. Each agent works on its respective task in parallel, and the system integrates their outputs into a cohesive response. This method not only accelerates problem-solving but also ensures that different dimensions of the problem are addressed simultaneously. Dynamic Task Allocation: In a multiplexing system, dynamic task allocation is crucial for efficiently distributing tasks among the specialized agents.
A larger, general-purpose model, such as Anthropic Claude 3 Sonnet on AWS, can act as an orchestrator, assessing the context of the query and determining which parts of the task should be delegated to smaller, more specialized agents. The orchestrator ensures that each task is assigned to the model best equipped to handle it. For instance, if a user submits a complex query about legal regulations and data security, the general model can break down the query, sending legal-related questions to an AI agent specialized in legal analysis and security-related queries to a security-focused agent like TinyLlama or a similar model. This dynamic delegation allows for the most relevant models to be used at the right time, improving both the efficiency and accuracy of the overall response. Integration of Outputs: Once the specialized agents have processed their respective tasks, the system must integrate their outputs to form a cohesive and comprehensive response. This integration is a critical feature of multiplexing, as it ensures that all aspects of a query are addressed without overlap or contradiction. The system combines the insights generated by each agent, creating a final output that reflects the full scope of the user’s request. In many cases, the integration process also includes filtering or refining the outputs to remove any inconsistencies or redundancies, ensuring that the response is logical and cohesive. This collaborative approach increases the reliability of the system, as it allows different agents to complement one another’s knowledge and expertise. Additionally, multiplexing reduces the likelihood of hallucinations—incorrect or nonsensical outputs that can sometimes occur with single, large-scale models. By dividing tasks among specialized agents, the system ensures that each part of the problem is handled by an AI that is specifically trained for that domain, minimizing the chance of erroneous or out-of-context responses. Improved Accuracy and Contextual Understanding: Multiplexing AI agents contributes to improved overall accuracy by distributing tasks to models that are more finely tuned to specific contexts or subjects. This approach ensures that the AI system can better understand and address the nuances of a query, particularly when the input involves complex or highly specialized information. Each agent’s deep focus on a specific task leads to a higher level of precision, resulting in a more accurate final output. Furthermore, multiplexing allows the system to build a more detailed contextual understanding. Since different agents are responsible for different elements of a task, the system can synthesize more detailed and context-aware responses. This holistic view is crucial for ensuring that the solution provided is not only accurate but also relevant to the specific situation presented by the user. In SnapLogic, we offer comprehensive support for building advanced workflows by integrating our GenAI Builder Snap. This feature allows users to incorporate generative AI capabilities into their workflow automation processes seamlessly. By leveraging the GenAI Builder Snap, users can harness the power of artificial intelligence to automate complex decision-making, data processing, and content generation tasks within their existing workflows. This integration provides a streamlined approach to embedding AI-driven functionalities, enhancing both efficiency and precision across various operational domains.
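Outside of SnapLogic, the core fan-out-and-summarize pattern behind multiplexing can be sketched in a few lines of Python. The agent and summarizer functions below are placeholders standing in for calls to specialized models, not real Snap or provider APIs:

```python
from concurrent.futures import ThreadPoolExecutor

def legal_agent(prompt):      # placeholder for a legal-analysis model call
    return f"[legal view] {prompt}"

def security_agent(prompt):   # placeholder for a security-focused model call
    return f"[security view] {prompt}"

def business_agent(prompt):   # placeholder for a business-analysis model call
    return f"[business view] {prompt}"

def summarizer(sections):     # placeholder for the summarization agent
    return "\n".join(sections)

def multiplex(prompt):
    agents = [legal_agent, security_agent, business_agent]
    # Fan the same prompt out to every specialized agent in parallel.
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        outputs = list(pool.map(lambda agent: agent(prompt), agents))
    # Join the partial answers and distill them into one cohesive response.
    return summarizer(outputs)

print(multiplex("Assess the risks of storing customer data in a new region."))
```

In the example pipelines described next, the fan-out, gather, and summarize steps are handled by Snaps rather than threads, but the shape of the pattern is the same.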
For instance, users can design workflows where the GenAI Builder Snap collaborates with other SnapLogic components, such as data pipelines and transformation processes, to deliver intelligent, context-aware automation tailored to their unique business needs. In the example pipelines, the system sends a prompt simultaneously to multiple AI agents, each with its specialized area of expertise. These agents independently process the specific aspects of the prompt related to their specialization. Once the agents generate their respective outputs, the results are then joined together to form a cohesive response. To further enhance the clarity and conciseness of the final output, a summarization agent is employed. This summarization agent aggregates and refines the detailed responses from each specialized agent, distilling the information into a concise, unified summary that captures the key points from all the agents, ensuring a coherent and well-structured final response. 5. Multi-agent conversation Multi-agent conversation refers to the interaction and communication between multiple autonomous agents, typically AI systems, working together to achieve a shared goal. This framework is widely used in areas like collaborative problem-solving, multi-user systems, and complex task coordination where multiple perspectives or expertise areas are required. Unlike a single-agent conversation, where one AI handles all inputs and outputs, a multi-agent system divides tasks among several specialized agents, allowing for greater efficiency, deeper contextual understanding, and enhanced problem-solving capabilities. Here are the key features of using multi-agent conversations. Specialization and Expertise: Each agent in a multi-agent system is designed with a specific role or domain of expertise. This allows the system to leverage agents with specialized capabilities to handle different aspects of a task. For example, one agent might focus on natural language processing (NLP) to understand input, while another might handle complex calculations or retrieve data from external sources. This division of labor ensures that tasks are processed by the most capable agents, leading to more accurate and efficient results. Specialization reduces the likelihood of errors and allows for a deeper, domain-specific understanding of the problem. Collaboration and Coordination: In a multi-agent conversation, agents don’t work in isolation—they collaborate to achieve a shared goal. Each agent contributes its output to the broader conversation, sharing information and coordinating actions to ensure that the overall task is completed successfully. This collaboration is crucial when handling complex problems that require input from multiple domains. Effective coordination ensures that agents do not duplicate work or cause conflicts. Through predefined protocols or negotiation mechanisms, agents are able to work together harmoniously, producing a coherent solution that integrates their various inputs. Scalability: Multi-agent systems are inherently scalable, making them ideal for handling increasingly complex tasks. As the system grows in complexity or encounters new challenges, additional agents with specific skills can be introduced without overloading the system. Each agent can work independently, and the system's modular design allows for smooth expansion. Scalability ensures that the system can handle larger datasets, more diverse inputs, or more complex tasks as the environment evolves. 
This adaptability is essential in dynamic environments where workloads or requirements change over time. Distributed Decision-Making: In a multi-agent system, decision-making is often decentralized, meaning each agent has the autonomy to make decisions based on its expertise and the information available to it. This distributed decision-making process allows agents to handle tasks in parallel, without needing constant oversight from a central controller. Since agents can operate independently, decisions are made more quickly, and bottlenecks are avoided. This decentralized approach also enhances the system's resilience, as it avoids over-reliance on a single decision point and enables more adaptive and localized problem-solving. Fault Tolerance and Redundancy: Multi-agent systems are naturally resilient to errors and failures. Since each agent operates independently, the failure of one agent does not disrupt the entire system. Other agents can continue their tasks or, if necessary, take over the work of a failed agent. This built-in redundancy ensures the system can continue functioning even when some agents encounter issues. Fault tolerance is particularly valuable in complex systems, as it enhances reliability and minimizes downtime, allowing the system to maintain performance even under adverse conditions. SnapLogic provides robust capabilities for integrating workflow automation with Generative AI (GenAI), allowing users to seamlessly build advanced multi-agent conversation systems by combining the GenAI Snap with other Snaps within their pipeline. This integration enables users to create sophisticated workflows where multiple AI agents, each with their specialization, collaborate to process complex queries and tasks. In this example, we demonstrate a simple implementation of a multi-agent conversation system, leveraging a manager agent to oversee and control the workflow. The process begins by submitting a prompt to a large foundational model, which, in this case, is Anthropic Claude 3 Sonnet on AWS. This model acts as the manager agent responsible for interpreting the prompt and determining the appropriate routing for different parts of the task. Based on the content and context of the prompt, the manager agent makes decisions on how to distribute the workload across specialized agents. After the initial prompt is processed, we utilize the Router Snap to dynamically route the output to the corresponding specialized agents. Each agent is tailored to handle a specific domain or task, such as data analysis, natural language processing, or knowledge retrieval, ensuring that the most relevant and specialized agent addresses each part of the query. Once the specialized agents have completed their respective tasks, their outputs are gathered and consolidated. The system then sends the final, aggregated result to the output destination. This approach ensures that all aspects of the query are addressed efficiently and accurately, with each agent contributing its expertise to the overall solution. The flexibility of SnapLogic’s platform, combined with the integration of GenAI models and Snaps, makes it easy for users to design, scale, and optimize complex multi-agent conversational workflows. By automating task routing and agent collaboration, SnapLogic enables more intelligent, scalable, and context-aware solutions for addressing a wide range of use cases, from customer service automation to advanced data processing. 6.
Retrieval-Augmented Generation (RAG) To enhance the specificity and relevance of responses generated by a Generative AI (GenAI) model, it is crucial to provide the model with sufficient context. Contextual information helps the model understand the nuances of the task at hand, enabling it to generate more accurate and meaningful outputs. However, in many cases, the amount of context needed to fully inform the model exceeds the token limit that the model can process in a single prompt. This is where a technique known as Retrieval-Augmented Generation (RAG) becomes particularly valuable. RAG is designed to optimize the way context is fed into the GenAI model. Rather than attempting to fit all the necessary information into the limited input space, RAG utilizes a retrieval mechanism that dynamically sources relevant information from an external knowledge base. This approach allows users to overcome the token limit challenge by fetching only the most pertinent information at the time of query generation, ensuring that the context provided to the model remains focused and concise. The RAG framework can be broken down into two primary phases: Embedding Knowledge into a Vector Database: In the initial phase, the relevant content is embedded into a vector space using a machine learning model that transforms textual data into a format conducive to similarity matching. This embedding process effectively converts text into vectors, making it easier to store and retrieve later based on its semantic meaning. Once embedded, the knowledge is stored in a vector database for future access. In SnapLogic, embedding knowledge into a vector database can be accomplished through a streamlined pipeline designed for efficiency and scalability. The process begins with reading a PDF file using the File Reader Snap, followed by extracting the content with the PDF Parser Snap, which converts the document into a structured text format. Once the text is available, the Chunker Snap is used to intelligently segment the content into smaller, manageable chunks. These chunks are specifically sized to align with the input constraints of the model, ensuring optimal performance during later stages of retrieval. After chunking the text, each segment is processed and embedded into a vector representation, which is then stored in the vector database. This enables efficient similarity-based retrieval, allowing the system to quickly access relevant pieces of information as needed. By utilizing this pipeline in SnapLogic, users can easily manage and store large volumes of knowledge in a way that supports high-performance, context-driven AI applications. Retrieving Context through Similarity Matching: When a query is received, the system performs similarity matching to retrieve the most relevant content from the vector database. By evaluating the similarity between the embedded query and the stored vectors, RAG identifies the most pertinent pieces of information, which are then used to augment the input prompt. This step ensures that the GenAI model receives focused and contextually enriched data, allowing it to generate more insightful and accurate responses. To retrieve relevant context from the vector database in SnapLogic, users can leverage an embedder Snap, such as the Amazon Titan Embedder, to transform the incoming prompt into a vector representation. This vector serves as the key for performing a similarity-based search within the vector database where the previously embedded knowledge is stored.
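Conceptually, the retrieve-then-augment step looks like the sketch below, independent of any particular vector database. Here embed and chunk_store are placeholders for the embedder and the chunks indexed in the first phase, and a simple dot product stands in for whatever similarity scoring the database actually uses:

```python
def retrieve_context(question, embed, chunk_store, top_k=3):
    """chunk_store: list of {"text": str, "embedding": list[float]} built at indexing time."""
    q_vec = embed(question)
    scored = sorted(
        chunk_store,
        key=lambda c: sum(a * b for a, b in zip(q_vec, c["embedding"])),  # similarity score
        reverse=True,
    )
    return [c["text"] for c in scored[:top_k]]

def build_rag_prompt(question, embed, chunk_store):
    # Keep the prompt focused: only the top-scoring chunks are included.
    context = "\n\n".join(retrieve_context(question, embed, chunk_store))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```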
The vector search mechanism efficiently identifies the most relevant pieces of information, ensuring that only the most contextually appropriate content is retrieved. Once the pertinent knowledge is retrieved, it can be seamlessly integrated into the overall prompt-generation process. This is typically achieved by feeding the retrieved context into a prompt generator snap, which structures the information in a format optimized for use by the Generative AI model. In this case, the final prompt, enriched with the relevant context, is sent to the GenAI Snap, such as Anthropic Claude within the AWS Messages Snap. This approach ensures that the model receives highly specific and relevant information, ultimately enhancing the accuracy and relevance of its generated responses. By implementing RAG, users can fully harness the potential of GenAI models, even when dealing with complex queries that demand a significant amount of context. This approach not only enhances the accuracy of the model's responses but also ensures that the model remains efficient and scalable, making it a powerful tool for a wide range of real-world applications. 7. Tool Calling and Contextual instruction Traditional GenAI models are limited by the data they were trained on. Once trained, these models cannot access new or updated information unless they are retrained. This limitation means that without external input, models can only generate responses based on the static content within their training corpus. However, in a world where data is constantly evolving, relying on static knowledge is often inadequate, especially for tasks that require current or real-time information. In many real-world applications, Generative AI (GenAI) models need access to real-time data to generate contextually accurate and relevant responses. For example, if a user asks for the current weather in a particular location, the model cannot rely solely on pre-trained knowledge, as this data is dynamic and constantly changing. In such scenarios, traditional prompt engineering techniques are insufficient, as they primarily rely on static information that was available at the time of the model's training. This is where the tool-calling technique becomes invaluable. Tool calling refers to the ability of a GenAI model to interact with external tools, APIs, or databases to retrieve specific information in real-time. Instead of relying on its internal knowledge, which may be outdated or incomplete, the model can request up-to-date data from external sources and use it to generate a response that is both accurate and contextually relevant. This process significantly expands the capabilities of GenAI, allowing it to move beyond static, pre-trained content and incorporate dynamic, real-world data into its responses. For instance, when a user asks for live weather updates, stock market prices, or traffic conditions, the GenAI model can trigger a tool call to an external API—such as a weather service, financial data provider, or mapping service—to fetch the necessary data. This fetched data is then integrated into the model’s response, enabling it to provide an accurate and timely answer that would not have been possible using static prompts alone. Contextual instruction plays a critical role in the tool calling process. Before calling an external tool, the GenAI model must understand the nature of the user’s request and identify when external data is needed. For example, if a user asks, "What is the weather like in Paris right now?" 
the model recognizes that the question requires real-time weather information and that this cannot be answered based on internal knowledge alone. The model is thus programmed to trigger a tool call to a relevant weather service API, retrieve the live weather data for Paris, and incorporate it into the final response. This ability to understand and differentiate between static knowledge (which can be answered with pre-trained data) and dynamic, real-time information (which requires external tool calling) is essential for GenAI models to operate effectively in complex, real-world environments. Use Cases for Tool Calling Real-Time Data Retrieval: GenAI models can call external APIs to retrieve real-time data such as weather conditions, stock prices, news updates, or live sports scores. These tool calls ensure that the AI provides up-to-date and accurate responses that reflect the latest information. Complex Calculations and Specialized Tasks: Tool calling allows AI models to handle tasks that require specific calculations or domain expertise. For instance, an AI model handling a financial query can call an external financial analysis tool to perform complex calculations or retrieve historical stock market data. Integration with Enterprise Systems: In business environments, GenAI models can interact with external systems such as CRM platforms, ERP systems, or databases to retrieve or update information in real time. For example, a GenAI-driven customer service bot can pull account information from a CRM system or check order statuses from an external order management tool. Access to Specialized Knowledge: Tool calling allows AI models to fetch specialized information from databases or knowledge repositories that fall outside their domain of training. For example, a medical AI assistant could call an external database of medical research papers to provide the most current treatment options for a particular condition. Implementation of Tool Calling in Generative AI Systems Tool calling has become an integral feature in many advanced Generative AI (GenAI) models, allowing them to extend their functionality by interacting with external systems and services. For instance, AWS Anthropic Claude supports tool calling via the Message API, providing developers with a structured way to integrate external data and functionality directly into the model's response workflow. This capability allows the model to enhance its responses by incorporating real-time information, performing specific functions, or utilizing external APIs that provide specialized data beyond the model's training. To implement tool calling with AWS Anthropic Claude, users can leverage the Message API, which allows for seamless integration with external systems. The tool calling mechanism is activated by sending a message with a specific "tools" parameter. This parameter defines how the external tool or API will be called, using a JSON schema to structure the function call. This approach enables the GenAI model to recognize when external input is required and initiate a tool call based on the instructions provided. Implementation process Defining the Tool Schema: To initiate a tool call, users need to send a request with the "tools" parameter. This parameter is defined in a structured JSON schema, which includes details about the external tool or API that the GenAI model will call. The JSON schema outlines how the tool should be used, including the function name, parameters, and any necessary inputs for making the call. 
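As an illustration of the request and response shapes involved, the sketch below assumes the Anthropic Messages tool-use format and a hypothetical get_weather tool; exact payloads vary by provider and SDK:

```python
# Hypothetical tool definition sent with the request via the "tools" parameter.
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a given location.",
    "input_schema": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}]

# Sketch of the model's reply when it decides a tool call is needed:
# stop_reason is "tool_use" and the content names the tool and its inputs.
assistant_turn = {
    "stop_reason": "tool_use",
    "content": [{"type": "tool_use", "id": "toolu_123",  # illustrative id
                 "name": "get_weather", "input": {"location": "New York"}}],
}

# After the external API runs, its output is sent back as a "tool_result"
# so the model can fold the live data into its final answer.
tool_result_turn = {
    "role": "user",
    "content": [{"type": "tool_result", "tool_use_id": "toolu_123",
                 "content": "72°F, clear skies"}],
}
```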
For example, if the tool is a weather API, the schema might define parameters such as location and time, allowing the model to query the API with these inputs to retrieve current weather data. Message Structure and Request Initiation: Once the tool schema is defined, the user can send a message to AWS Anthropic Claude containing the "tools" parameter alongside the prompt or query. The model will then interpret the request and, based on the context of the conversation or task, determine if it needs to call the external tool specified in the schema. If a tool call is required, the model will respond with a "stop_reason" value of "tool_use". This response indicates that the model is pausing its generation to call the external tool, rather than completing the response using only its internal knowledge. Tool Call Execution: When the model responds with "stop_reason": "tool_use", it signals that the external API or function should be called with the inputs provided. At this point, the external API (as specified in the JSON schema) is triggered to fetch the required data or perform the designated task. For example, if the user asks, "What is the weather in New York right now?", and the JSON schema defines a weather API tool, the model will pause and call the API with the location parameter set to "New York" and the time parameter set to "current." Handling the API Response: After the external tool processes the request and returns the result, the user (or system) sends a follow-up message containing the "tool_result". This message includes the output from the tool call, which can then be integrated into the ongoing conversation or task. In practice, this might look like a weather API returning a JSON object with temperature, humidity, and weather conditions. The response is passed back to the GenAI model via a user message, which contains the "tool_result" data. Final Response Generation: Once the model receives the "tool_result", it processes the data and completes the response. This allows the GenAI model to provide a final answer that incorporates real-time or specialized information retrieved from the external system. In our weather example, the final response might be, "The current weather in New York is 72°F with clear skies." Currently, SnapLogic does not yet provide native support for tool calling within the GenAI Snap Pack. However, we recognize the immense potential and value this feature can bring to users, enabling seamless integration with external systems and services for real-time data and advanced functionalities. We are actively working on incorporating tool calling capabilities into future updates of the platform. This enhancement will further empower users to build more dynamic and intelligent workflows, expanding the possibilities of automation and AI-driven solutions. We are excited about the potential it holds and look forward to sharing these innovations soon. 8. Memory Cognition for LLMs Most large language models (LLMs) operate within a context window limitation, meaning they can only process and analyze a finite number of tokens (words, phrases, or symbols) at any given time. This limitation poses significant challenges, particularly when dealing with complex tasks, extended dialogues, or interactions that require long-term contextual understanding.
For example, if a conversation or task extends beyond the token limit, the model loses awareness of earlier portions of the interaction, leading to responses that may become disconnected, repetitive, or contextually irrelevant. This limitation becomes especially problematic in applications where maintaining continuity and coherence across long interactions is crucial. In customer service scenarios, project management tools, or educational applications, it is often necessary to remember detailed information from earlier exchanges or to track progress over time. However, traditional models constrained by a fixed token window struggle to maintain relevance in such situations, as they are unable to "remember" or access earlier parts of the conversation once the context window is exceeded. To address these limitations and enable LLMs to handle longer and more complex interactions, we employ a technique known as memory cognition. This technique extends the capabilities of LLMs by introducing mechanisms that allow the model to retain, recall, and dynamically integrate past interactions or information, even when those interactions fall outside the immediate context window. Memory Cognition Components in Generative AI Applications To successfully implement memory cognition in Generative AI (GenAI) applications, a comprehensive and structured approach is required. This involves integrating various memory components that work together to enable the AI system to retain, retrieve, and utilize relevant information across different interactions. Memory cognition enables the AI model to go beyond stateless, short-term processing, creating a more context-aware, adaptive, and intelligent system capable of long-term interaction and decision-making. Here are the key components of memory cognition that must be considered when developing a GenAI application: Short-Term Memory (Session Memory) Short-term memory, commonly referred to as session memory, encompasses the model's capability to retain context and information during a single interaction or session. This component is vital for maintaining coherence in multi-turn conversations and short-term tasks. It enables the model to sustain continuity in its responses by referencing earlier parts of the conversation, thereby preventing the user from repeating previously provided information. Typically, short-term memory is restricted to the duration of the interaction. Once the session concludes or a new session begins, the memory is either reset or gradually decayed. This ensures the model can recall relevant details from earlier in the same session, creating a more seamless and fluid conversational experience. For example, in a customer service chatbot, short-term memory allows the AI to remember a customer’s issue throughout the conversation, ensuring that the problem is consistently addressed without needing the user to restate it multiple times. However, in large language models, short-term memory is often limited by the model's context window, which is constrained by the maximum number of tokens it can process in a single prompt. As new input is added during the conversation, older dialogue parts may be discarded or forgotten, depending on the token limit. This necessitates careful management of short-term memory to ensure that critical information is retained throughout the session. Long-Term Memory Long-term memory significantly enhances the model's capability by allowing it to retain information beyond the scope of a single session. 
Unlike short-term memory, which is confined to a single interaction, long-term memory persists across multiple interactions, enabling the AI to recall important information about users, their preferences, past conversations, or task-specific details, regardless of the time elapsed between sessions. This type of memory is typically stored in an external database or knowledge repository, ensuring it remains accessible over time and does not expire when a session ends. Long-term memory is especially valuable in applications that require the retention of critical or personalized information, such as user preferences, history, or recurring tasks. It allows for highly personalized interactions, as the AI can reference stored information to tailor its responses based on the user's previous interactions. For example, in virtual assistant applications, long-term memory enables the AI to remember a user's preferences—such as their favorite music or regular appointment times—and use this information to provide customized responses and recommendations. In enterprise environments, such as customer support systems, long-term memory enables the AI to reference previous issues or inquiries from the same user, allowing it to offer more informed and tailored assistance. This capability enhances the user experience by reducing the need for repetition and improving the overall efficiency and effectiveness of the interaction. Long-term memory, therefore, plays a crucial role in enabling AI systems to deliver consistent, contextually aware, and personalized responses across multiple sessions. Memory Management Dynamic memory management refers to the AI model’s ability to intelligently manage and prioritize stored information, continuously adjusting what is retained, discarded, or retrieved based on its relevance to the task at hand. This capability is crucial for optimizing both short-term and long-term memory usage, ensuring that the model remains responsive and efficient without being burdened by irrelevant or outdated information. Effective dynamic memory management allows the AI system to adapt its memory allocation in real-time, based on the immediate requirements of the conversation or task. In practical terms, dynamic memory management enables the AI to prioritize important information, such as key facts, user preferences, or contextually critical data, while discarding or de-prioritizing trivial or outdated details. For example, during an ongoing conversation, the system may focus on retaining essential pieces of information that are frequently referenced or highly relevant to the user’s current query, while allowing less pertinent information to decay or be removed. This process ensures that the AI can maintain a clear focus on what matters most, enhancing both accuracy and efficiency. To facilitate this, the system often employs relevance scoring mechanisms to evaluate and rank the importance of stored memories. Each piece of memory can be assigned a priority score based on factors such as how frequently it is referenced or its importance to the current task. Higher-priority memories are retained for longer periods, while lower-priority or outdated entries may be marked for removal. This scoring system helps prevent memory overload by ensuring that only the most pertinent information is retained over time. Dynamic memory management also includes memory decay mechanisms, wherein older or less relevant information gradually "fades" or is automatically removed from storage, preventing memory bloat. 
This ensures that the AI retains only the most critical data, avoiding inefficiencies and ensuring optimal performance, especially in large-scale applications that involve substantial amounts of data or memory-intensive operations. To further optimize resource usage, automated processes can be implemented to "forget" memory entries that have not been referenced for a significant amount of time or are no longer relevant to ongoing tasks. These processes ensure that memory resources, such as storage and processing power, are allocated efficiently, particularly in environments with large-scale memory requirements. By dynamically managing memory, the AI can continue to provide contextually accurate and timely responses while maintaining a balanced and efficient memory system. Implementation of Memory Cognition in SnapLogic SnapLogic provides robust capabilities for integrating with databases and storage systems, making it an ideal platform for creating workflows to manage memory cognition in AI applications. In the following example, we demonstrate a basic memory cognition pattern using SnapLogic to handle both short-term and long-term memory. Overview of the Workflow The workflow begins by embedding the prompt into a vector representation. This vector is then used to retrieve relevant memories from long-term memory storage. Long-term memory can be stored in a vector database, which is well-suited for similarity-based retrieval, or in a traditional database or key-value store, depending on the application requirements. Similarly, short-term memory can be stored in a regular database or a key-value store to keep track of recent interactions. Retrieving Memories Once the prompt is embedded, we retrieve relevant information from both short-term and long-term memory systems. The retrieval process is based on similarity scoring, where the similarity score indicates the relevance of the stored memory to the current prompt. For long-term memory, this typically involves querying a vector database, while short-term memory may be retrieved from a traditional relational database or key-value store. After retrieving the relevant memories from both systems, the data is fed into a memory management module. In this example, we implement a simple memory management mechanism using a script within SnapLogic. Memory Management The memory management module employs a sliding window technique, which is a straightforward yet effective way to manage memory. As new memory is added, older memories gradually fade out until they are removed from the memory stack. This ensures that the AI retains the most recent and relevant information while discarding outdated or less useful memories. The sliding window mechanism prioritizes newer or more relevant memories, placing them at the top of the memory stack, while older memories are pushed out over time. Generating the Final Prompt and Interacting with the LLM Once the memory management module has constructed the full context by combining short-term and long-term memory, the system generates the final prompt. This prompt is then sent to the language model for processing. In this case, we use AWS Claude through the Message API as the large language model (LLM) to generate a response based on the provided context.
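The sliding-window logic itself is small. The sketch below captures the idea under simple assumptions (the memory stack is an ordered list and the window size is a tunable constant); the script used inside the SnapLogic pipeline works on the same principle, though its details may differ:

```python
WINDOW_SIZE = 10  # assumption: keep only the ten most recent memory entries

def update_short_term_memory(memory_stack, new_entry, window_size=WINDOW_SIZE):
    """Append the newest exchange and let the oldest entries fall out of the window."""
    memory_stack.append(new_entry)
    return memory_stack[-window_size:]

def build_context(short_term, long_term_matches):
    """Combine recalled long-term memories with the recent conversation window."""
    recalled = "\n".join(long_term_matches)
    recent = "\n".join(short_term)
    return f"Relevant past knowledge:\n{recalled}\n\nRecent conversation:\n{recent}"
```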
Updating Memory Upon receiving a response from the LLM, the workflow proceeds to update both short-term and long-term memory systems to ensure continuity and relevance in future interactions: Long-Term Memory: The long-term memory is refreshed by associating the original prompt with the LLM's response. In this context, the query key corresponds to the initial prompt, while the value is the response generated by the model. This update enables the system to store pertinent knowledge that can be accessed during future interactions, allowing for more informed and contextually aware responses over time. Short-Term Memory: The short-term memory is updated by appending the LLM's response to the most recent memory stack. This process ensures that the immediate context of the current conversation is maintained, allowing for seamless transitions and consistency in subsequent interactions within the session. This example demonstrates how SnapLogic can be effectively used to manage memory cognition in AI applications. By integrating with databases and leveraging SnapLogic’s powerful workflow automation, we can create an intelligent memory management system that handles both short-term and long-term memory. The sliding window mechanism ensures that the AI remains contextually aware while avoiding memory overload, and AWS Claude provides the processing power to generate responses based on rich contextual understanding. This approach offers a scalable and flexible solution for managing memory cognition in AI-driven workflows.

Recipes for Success with SnapLogic’s GenAI App Builder: From Integration to Automation
For this episode of the Enterprise Alchemists podcast, Guy and Dominic invited Aaron Kesler and Roger Sramkoski to join them to discuss why SnapLogic's GenAI App Builder is the key to success with AI projects. Aaron is the Senior Product Manager for all things AI at SnapLogic, and Roger is a Senior Technical Product Marketing Manager focused on AI. We kept things concrete, discussing real-world results that early adopters have already been able to deliver by using SnapLogic's integration capabilities to power their new AI-driven experiences.

GenAI App Builder Getting Started Series: Part 2 - Purchase Order Processing
👋 Welcome! Hello everyone and welcome to our second guide in the GenAI App Builder Getting Started Series! First things first, GenAI App Builder is now generally available for all customers to purchase or test in SnapLabs. If you are a customer or partner who wants access to SnapLabs, please reach out to your Customer Success Manager and they can grant you access. If you are not yet a customer, you can check out our GenAI App Builder videos, then when you’re ready to take the next step, request a demo with our sales team! 🤔 What is GenAI App Builder? If you’re coming here from Part 1, you may notice that GenAI Builder is now GenAI App Builder. Thank you to our customers who shared feedback on how we could improve the name to better align with the purpose. The original name had led to some confusion that its purpose was to train LLMs. 📑 Purchase Order Processing Example In this example, we will demonstrate how to use GenAI in a SnapLogic Pipeline to act like a function written in natural language to extract information from a PDF. The slide below shows an example of how we use natural language to extract the required fields in JSON format that would allow us to make this small pattern part of a larger app or data integration workflow. ✅ Prerequisites To follow along with this guide, you will need the items below: Access to GenAI App Builder (in your company’s organization or in SnapLabs) Your own API account with access to Azure OpenAI, OpenAI, or Anthropic Claude on Amazon Bedrock. ⬆️ Import the pipeline At the bottom of this post you will find several files you can use if you want to see this pattern in action in your own environment and explore it further. If you are familiar with SnapLogic and want to build the Pipeline on your own, you can do that as well and just download the example PDF or try your own! PurchaseOrderExample.pdf InvoiceProcessing_CommunityArticlePipeline_2024_06_28.slp (zipped) Once you are signed in to SnapLogic or SnapLabs you can start with the steps below to import the Pipeline: In Designer, click the icon shown in the screenshot below to import the Pipeline: Select the file in the File Browser window that pops up In the Add New Pipeline panel that opens you can change the name and project location if desired Press the Save button in the lower-right corner 🚧 Parsing the file If you imported the pipeline using the steps above, then your pipeline should look like the one below. The steps below assume you imported the pipeline. If you are familiar enough with SnapLogic to build this on your own, you can drag the Snaps shown below to create the Pipeline then follow along with us. 🔈 NOTE: The instructions here are completed with the Amazon Bedrock Prompt Generator and Anthropic Claude on AWS Snaps for the last two Snaps in the Pipeline. You can swap these out for Azure OpenAI or OpenAI Snaps if you prefer to use those LLMs.
Click the File Reader Snap to open its settings Click the icon at the far right of the File field as shown in the screenshot below Click the Upload File button in the upper-right corner of the window that pops up Select the PDF file from your file browser (download the example PDF file at the bottom of this post if you have not already) Save and close the File Reader Snap once your file is selected No edits are needed for the PDF Parser Snap, so we'll skip over that one Click the Mapper Snap Add $text in the Expression field and $context in the Target path field as shown below Save and close the Mapper Snap Click on the fourth Snap, the Prompt Generator Snap (we will demonstrate here with the Amazon Bedrock Prompt Generator Snap - you do not have to use Amazon Bedrock though, you can use any of the other LLM Prompt Generator Snaps we have, like Azure OpenAI, OpenAI, etc.) Click the Edit Prompt button as shown in the screenshot below so we can modify the prompt used for the LLM You should see a pre-generated prompt like the one below: Copy the prompt below and replace the default prompt: Instruction: Your task is to pull out the company name, the date created, date shipped, invoice number, P.O. number, vendor from vendor details, recipient name from recipient details, subtotal, 'Shipping & handling', tax rate, sales tax, and total from the context below. Give the results back in JSON. Context: {{context}} The Prompt Generator text should now look like the screenshot below: Click the Ok button in the lower-right corner to save our prompt changes Click on the last Snap, the Chat Completions Snap (we will demonstrate here with the Anthropic Claude on AWS Chat Completions Snap - you do not have to use Anthropic Claude on AWS though, you can use any of the other LLM Chat Completions Snaps we have, like Azure OpenAI, OpenAI, etc.) Click the Account tab Click Add Account; if you have an existing LLM account to use you can select that here and skip to step 22 below Select the type of account you want, then press Continue - available options will depend on which LLM Chat Completions Snap you chose Enter in the required credentials for the LLM account you chose; here is an example of the Amazon Bedrock Account Press the Apply button when done entering the credentials Verify your account is now selected in the Account tab Click on the Settings tab Click on the Suggest icon to the right of the Model name field as shown in the screenshot below and select the model you want to use Type $prompt in the Prompt field as shown in the screenshot below: Expand the Model Parameters section by clicking on it (if you are using OpenAI or Azure OpenAI, you can leave Maximum Tokens blank; for Anthropic Claude on AWS you will need to increase Maximum Tokens from 200 to something higher; you can see where we set 50,000 below) Save and close the Chat Completions Snap 🎬 Testing our example At this point, we are ready to test our Pipeline and observe the results! The screenshot below shows you where you can click to Validate the Pipeline, which should have every Snap turn green with preview output as shown below. If you have any errors or questions, please reply to share them with us! Here is the JSON output after the Anthropic Claude on AWS Chat Completions Snap (note that other LLMs will have different API output structures): Extras! Want to play with this further? Try adding a Copy Snap after the Mapper and sending the file to multiple LLMs at once, then review the results.
Try changing {{context}} in the Prompt Generator to something else so you can drop the Mapper from the pipeline. 🏁 Wrapping up Congratulations, you have now completed at least one GenAI App Builder integration in SnapLogic! 😎 Stay tuned to the SnapLabs channel here in the Integration Nation for more content on GenAI App Builder in the future! Please share any thoughts, comments, concerns, or feedback in a reply or DM RogerSramkoski!

Building an AI Agent with SnapLogic AgentCreator using OpenAI and Microsoft Teams - Part 2
Integrating the AI Agent with Microsoft Teams via Azure Bot Service The first part covered the creation of our agent's architecture using SnapLogic pipelines and AgentCreator. Now, we focus on connecting that pipeline to Microsoft Teams so end users can chat with it. This involves creating and configuring the Azure Bot Service as a bridge between Teams and our SnapLogic pipelines. We will walk through the prerequisites and setup. Prerequisites for the Azure Bot Integration To integrate the SnapLogic agent with Teams, ensure you have the following prerequisites in place: SnapLogic AgentCreator and pipelines: A SnapLogic environment where AgentCreator is enabled. The Weather Agent pipelines can be used as a working example. You’ll also need to create the AgentDriver pipeline as a Triggered Task ( to obtain an endpoint URL accessible by the bot ). SnapLogic OAuth2 Account: An OAuth2 Account which will be used in an HTTP Client to send the assistant response back to the user. Also used for simulating "typing" indicator in Teams chat between tool usage. Microsoft 365 Tenant with Teams: Access to a Microsoft tenant where you have permission to register applications and upload custom Teams apps. You’ll need a Teams environment to test the bot ( this could be a corporate tenant or a developer tenant ). Azure Subscription: An Azure account with an active subscription to create resources ( specifically, an Azure Bot Service ). Also, ensure you have the Azure Bot Channels Registration or Azure Bot resource creation rights. Azure AD App Registration: Credentials for the bot. We will register an application in Azure Active Directory to represent our bot ( this provides a Client ID and Client Secret that will be used by the Bot Service to authenticate ). Azure Bot Service resource: We will create an Azure Bot which will tie together the app registration and our messaging endpoint, and allow adding Teams as a channel. Register an App in Azure AD for the Bot The first step is to register an Azure AD application that will identify our bot and provide authentication to Azure Bot Service and Teams. Create App registration: In the Azure Portal, navigate to Azure Active Directory > App Registrations and click "New registration". Give the app a name. For supported account types, you can choose "Accounts in this organizational directory only" ( Single tenant ) for simplicity, since this bot is intended for your organization’s Teams. You do not need to specify a Redirect URI for this scenario. Finalize registration: Click Register to create the app. Once created, you’ll see the Application ( Client ) ID – copy this ID, as we’ll need it later as the Bot ID and in the OAuth2 account. Create a client secret: In your new app’s overview, go to Certificates & secrets. Click "New client secret" to generate a secret key. Give it a description and a suitable expiration period. After saving, copy the Value of the client secret ( it will be a long string ). Save this secret somewhere secure now – you won’t be able to retrieve it again after you leave the page. We’ll provide this secret to the Bot Service so it can authenticate as this app and we will also use it in the OAuth2 account in SnapLogic. Gather Tenant ID: Since we chose a single-tenant app, we’ll also need the Azure AD tenant ID. You can find this in Overview of the app. Copy the tenant ID as well for later use. 
At this point, you should have: Client ID ( application ID ) for the bot and the OAuth2 account Client secret for the bot ( stored securely ) and the OAuth2 account Tenant ID of our Azure AD These will be used when setting up the Azure Bot Service so that it knows about this app registration. Create an OAuth2 account Now that we have the client ID, client secret, and tenant ID gathered from the app registration, we can create the OAuth2 account which will be used in an HTTP Client Snap that sends the "typing" indicator as well as the response from the agent. Navigate to the "Manager" tab and locate your project folder where the agent pipelines are stored On the right side, click on the "+" icon to create a new account Choose "API Suite > OAuth2 Account" Populate the client ID and client secret values from your app registration process Check the 'Send client data as Basic Auth header' and 'Header authenticated' settings Populate the authorization and token endpoints OAuth2 authorization endpoint: https://login.microsoftonline.com/<TENANT_ID>/oauth2/v2.0/authorize OAuth2 token endpoint: https://login.microsoftonline.com/<TENANT_ID>/oauth2/v2.0/token Change the "Grant type" to "client_credentials" Add the scope to both "Token endpoint config" and "Authorization endpoint config"; in our case the scope is: https://api.botframework.com/.default Check "Auto-refresh token" Click "Authorize". If everything was set correctly in the previous steps, you should be redirected back to SnapLogic with a valid access token. Example of an already configured OAuth2 account Create the Azure Bot Service and Connect to SnapLogic With the Azure AD app ready, we can create the actual bot resource that will connect to Teams and our SnapLogic endpoint: Add Azure Bot resource: In the Azure Portal, search for "Azure Bot" and select the Azure Bot service. Choose Create to make a new Bot resource. Configure Bot Settings: On the creation form, fill in: Bot handle: A unique name for your bot. Subscription and Resource Group: Select your Azure subscription and a resource group to contain the bot resource. Location: Pick a region. Pricing tier: Choose the Free tier ( F0 ) – it’s more than sufficient for development and basic usage. Microsoft App ID: Here, reuse the existing App Registration we created. There should be an option to choose an existing app – provide the Client ID of the app registration. This links the bot resource to our AD app. App type: Select Single Tenant since our app registration is single-tenant. You might also need to provide the App secret ( Client Secret ) for the bot here during creation. Create the Bot: Click Review + create and then Create to provision the bot service. Azure will deploy the bot resource. Once completed, go to the resource’s page. Configure messaging endpoint: This is a crucial step – we must point the bot to our SnapLogic pipeline. In the Azure Bot resource settings, find the Settings menu and navigate to Configuration. Populate the Messaging endpoint field ( the URL that the Bot Service will call when a message is received ). Here, paste the Trigger URL of your WeatherAgent_AgentDriver pipeline. To get this URL: In SnapLogic, you would have already created the AgentDriver pipeline as a Triggered Task.
That generates an endpoint URL: https://elastic.snaplogic.com/api/1/rest/slschedule/<org>/<proj>/WeatherAgent_AgentDriver Example endpoint with an appended authorization query param as bearer_token: https://elastic.snaplogic.com/api/1/rest/slschedule/myOrg/WeatherProject/WeatherAgent_AgentDriver?bearer_token=<bearer token> Enter the URL exactly as given by SnapLogic, including the bearer_token value. Save the configuration. Now, when a Teams user messages the bot, Azure will send an HTTPS POST to this SnapLogic URL. Add Microsoft Teams channel: Still in the Azure Bot resource, go to Channels. Add a new channel and select Microsoft Teams. This step registers the bot with Teams so that Teams clients can use it. Now our bot service is set up with the SnapLogic pipeline as its backend. The AgentDriver pipeline is effectively the bot’s webhook. The Azure Bot resource handles authentication with Teams and will forward user messages to SnapLogic and relay SnapLogic’s responses back to Teams. Packaging the bot for Teams ( App manifest ) At this stage, the bot exists in Azure, but to use it in Teams we need to package it as a Teams app, especially if we want to share it within the organization. This involves creating a Teams app manifest and icons, then uploading it to Teams. Prepare the Teams app manifest: The manifest is a JSON file describing your Teams app ( the bot ). Microsoft provides a schema for this, but you can download the manifest file from this example; make sure you replace the <APP ID> placeholders within it ( a minimal sketch of such a manifest is also shown at the end of this section ). The manifest file consists of: App ID: Use the Bot’s App ID ( Client ID of the registered app ) App name, description: The name and description of the Teams app, for example "SnapLogic Agent". Icons: Prepare two icon images for the bot – typically a color icon ( 192x192 PNG ) and an outline icon ( 32x32 PNG ). These will be used as the agent's avatar and in the Teams app catalog. The manifest may also include information like developer info, version number, etc. If using the Teams Developer Portal, it can guide you through filling these fields and will handle the JSON for you. Just ensure the Bot ID and scopes are correctly set. Combine manifest and icons: Once your manifest file and icons are ready, put all three into a .zip file. For example, a zip containing: manifest.json icon-color.png ( 192x192 ) icon-outline.png ( 32x32 ) Make sure the JSON inside the zip references the icon file names exactly as they are. Upload the app to Teams: In Microsoft Teams, go to Apps > Manage your apps > Upload a custom app. Upload the zip file. Teams should recognize it as a new app. When added, it essentially registers the bot ID with the Teams client. Test in Teams: Open a chat with your Weather Agent in Teams ( it should appear with the name and icon you provided ). Type a message, like "Hi" or a weather question: "What's the weather in New York?" The message will go out to Azure, which will call SnapLogic’s endpoint. The SnapLogic pipelines will run through the logic ( as described in the first part ) and Azure will return the bot’s reply to Teams. You should see the bot’s answer appear in the chat. If you see the bot typing indicator first and then the answer, everything is working as expected!
Initial message and a response from the agent
Typing indicator as showcased during agent execution
Agent response after using the available tools
Now the Weather Agent is fully functional within Teams. It’s essentially an AI-powered chat interface to a live weather API, all orchestrated by SnapLogic in the background.
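For orientation, a minimal Teams app manifest for a bot looks roughly like the sketch below. This is an illustrative sketch only: the developer details, package name, and descriptions are placeholder assumptions, and the exact set of required fields depends on the manifest schema version you target, so validate the file against the official schema or build it in the Teams Developer Portal. Replace <APP ID> with the Client ID of your app registration.

```json
{
  "$schema": "https://developer.microsoft.com/en-us/json-schemas/teams/v1.16/MicrosoftTeams.schema.json",
  "manifestVersion": "1.16",
  "version": "1.0.0",
  "id": "<APP ID>",
  "packageName": "com.example.weatheragent",
  "developer": {
    "name": "Example Developer",
    "websiteUrl": "https://www.example.com",
    "privacyUrl": "https://www.example.com/privacy",
    "termsOfUseUrl": "https://www.example.com/terms"
  },
  "name": { "short": "SnapLogic Agent" },
  "description": {
    "short": "Weather Agent built with SnapLogic AgentCreator",
    "full": "A conversational Weather Agent powered by SnapLogic pipelines and Azure Bot Service."
  },
  "icons": {
    "color": "icon-color.png",
    "outline": "icon-outline.png"
  },
  "accentColor": "#FFFFFF",
  "bots": [
    {
      "botId": "<APP ID>",
      "scopes": [ "personal" ],
      "supportsFiles": false,
      "isNotificationOnly": false
    }
  ]
}
```

Note how both the top-level id and the botId reuse the same App ID from the Azure AD app registration, which is what ties the Teams package back to the bot resource.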
Benefits of SnapLogic and Teams for Conversational Agentic Interfaces Integrating SnapLogic AgentCreator with Microsoft Teams via Azure Bot Service has several benefits: Fast prototyping: You can go from idea to a working bot in a very short time. There’s no need to write custom bot code or host a web service – SnapLogic pipelines become your bot logic. In our example, building a weather query bot is as simple as wiring up a few Snaps and APIs. This accelerates development and allows quick iteration. Business users or integration developers can prototype new AI agents rapidly, responding to evolving needs without a heavy software development cycle. No-code integration and simplicity: SnapLogic provides out-of-the-box connectors to hundreds of systems and services. By using SnapLogic as the engine, your bot can tap into any of these with minimal effort. Want a bot that not only gives weather but also looks up flight data or CRM info? It’s just another pipeline. The AgentCreator framework handles the AI part, while the SnapLogic platform handles the integration part ( connecting to external APIs and data sources ). This synergy makes it simple to create powerful bots that perform real actions – far beyond what an LLM alone could do. And it’s all done with low/no-code configuration. Enhanced user experience: Delivering automation through a conversational interface in Teams meets users where they already collaborate. There’s no new app to learn – users simply chat with a bot as if they’re chatting with a colleague. Reusability: The modular design of the weather agent pipelines can serve as a template for other agents by swapping out the tools and prompts. The integration pattern remains the same. This showcases the reusability of the AgentCreator approach across various use cases. Conclusion By combining SnapLogic’s generative AI integration capabilities with Microsoft’s bot framework and Teams, we created a powerful AI Agent without writing any code at all. We used SnapLogic AgentCreator Snaps to handle the AI reasoning and tool calling, and used Azure Bot Service to connect that logic to Microsoft Teams. The real win is how quickly and easily this was achieved. In a matter of days or even hours, an enterprise can prototype a conversational AI agent that ties into live data and services. The speed of development, combined with secure integration into everyday platforms like Teams, delivers real business value. In summary, SnapLogic and Teams enable a new class of enterprise applications: ones that talk to you, using AI to bridge human requests to automated actions. The Weather Agent is a simple example, but it highlights how fast prototyping, integration simplicity, and enhanced user experience come together. I encourage you to try building your own SnapLogic Agent – whether it’s for weather, workflows, or anything else – and unleash the power of conversational AI in your organization. Happy integrating, and don’t forget your umbrella if the Weather Agent says rain is on the way!
Building an AI Agent with SnapLogic AgentCreator using OpenAI and Microsoft Teams - Part 1
Building an AI Agent with SnapLogic AgentCreator using OpenAI and Microsoft Teams In this two-part blog series, I will cover how to create an AI Agent using SnapLogic's AgentCreator and integrate it with Microsoft Teams via Azure Bot services. The solution combines SnapLogic’s AgentCreator, OpenAI ( gpt 4.1 mini ) and Microsoft Teams ( to provide a familiar chat interface ). In the first part, we will cover building the agent with SnapLogic pipelines and the AgentCreator pattern. In the second part, I will explain the Azure setup and Teams integration, and highlight the business benefits of conversational automation. Designing the SnapLogic-Powered AI Agent The example I've decided to go with is a simple Weather Agent that provides a conversational interface for weather queries, accessible directly within Teams. This improves the user experience by integrating information into the tools people already use, and it showcases how SnapLogic’s AgentCreator can automate tasks through natural language. How does SnapLogic help? SnapLogic’s new AgentCreator framework allows us to build an AI-driven agent that uses LLM intelligence combined with SnapLogic pipelines to fetch real data. The Weather Agent understands a user’s question, decides if it needs to call a function ( tool ), performs that action via a SnapLogic pipeline, and then responds conversationally with the result. SnapLogic AgentCreator is purpose-built for such scenarios, enabling enterprises to create AI agents that can call pipelines and APIs autonomously. In our case, the agent will use a weather API through SnapLogic to get live data, meaning the agent's answers are not just based on static knowledge, but on real-time API calls. SnapLogic AgentCreator Architecture Overview We will focus on the AgentCreator pattern – a design that splits the agent’s logic into two cooperative pipelines: an Agent Driver and an Agent Worker. This pattern is orchestrated by SnapLogic’s Pipeline loop ( PipeLoop ) Snap, which allows iterative calls to a pipeline until a certain condition is met; in our case, until the conversation turn is complete or a set number of iterations has been reached. Here’s how it works: Agent Driver pipeline: This orchestrator pipeline receives the incoming chat message and manages the overall conversation loop. It sends the user’s query ( plus any chat history messages available ) and the system prompt to the Agent Worker pipeline using the PipeLoop Snap, and keeps iterating until the LLM signals that it’s done responding or the iteration limit is reached. Agent Worker pipeline: This pipeline handles one iteration of LLM interaction. It presents the LLM with the conversation context and available tools, gets the LLM’s response ( which could be an answer or a function call request ), executes any required tool, and returns the result back to the Driver. The Worker is essentially where the "brain" of the agent lives – it decides if a tool call is needed and formats the answer. This architecture allows the agent to have multi-turn reasoning. For example, if the user asks for weather, the LLM might first respond with a function call to get data, the Worker executes that call, and then the LLM produces a final answer in a second iteration. The PipeLoop Snap in the Driver pipeline detects whether another iteration is needed ( if the last LLM output was a partial result or tool request ) and loops again, or stops if the answer is complete.
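To make the Driver-to-Worker handoff concrete, the document the Driver sends into the Worker on the first iteration is essentially a chat-style messages array: the system prompt plus the user's question ( and, on later turns, prior history ). The sketch below is illustrative only; the exact field names expected by the AgentCreator Snaps may differ, and the system prompt text is an assumption:

```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are a Weather Agent. Use the available tools to look up locations and current weather, and ask the user to clarify if a location is ambiguous."
    },
    {
      "role": "user",
      "content": "What's the weather in San Francisco right now?"
    }
  ]
}
```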
Key components of the Weather Agent architecture: SnapLogic AgentCreator: The toolkit that makes this AI agent possible. It provides specialized Snaps for prompt handling, LLM integration ( OpenAI, Azure OpenAI, Amazon Bedrock, Google Gemini etc. ), and function-calling logic. SnapLogic AgentCreator enables designing AI agents with dynamic iteration and tool usage built in. LLM ( Generative AI model ): The LLM powering the agent's understanding and response generation. In our implementation, an LLM ( such as OpenAI GPT ) interprets the user’s request and decides when to call the available tools. SnapLogic’s Tool Calling Snap interfaces with the LLM’s API to get these decisions. Weather API: The external data source for live weather information. The agent uses a real API ( https://open-meteo.com/ ) to fetch current weather details for the requested location. Microsoft Teams & Azure Bot: This is the front-end interface where the user interacts with the bot, and the connector that sends messages between Teams and our SnapLogic pipelines. Setting up an OpenAI API account Because we are working with the gpt 4.1 mini API, we will need to configure an OpenAI account. This assumes you have already created an API key in your OpenAI dashboard. Navigate to the Manager tab under your project folder location and click on the "+" button to create a new Account. Navigate to OpenAI LLM -> OpenAI API Key Account You can name it based on your needs or naming convention Copy and paste your API key from the OpenAI dashboard On the Agent Worker pipeline, open the "OpenAI Tool Calling" Snap and apply the newly created account Save the pipeline. You have now successfully integrated the OpenAI API. Weather Agent pipelines in SnapLogic I've built a set of SnapLogic pipelines to implement the Weather Agent logic using AgentCreator. Each pipeline has a specific role in the overall chatbot workflow: WeatherAgent_AgentDriver: The orchestrator for the agent. It is triggered by incoming HTTP requests from the Azure Bot Service ( when a user sends a Teams message ). The AgentDriver parses the incoming message, sends a quick "typing…" indicator to the user ( to simulate the bot typing ), and then uses a PipeLoop Snap to invoke the AgentWorker pipeline. It supplies the user’s question, the system prompt and any prior context, and keeps iterating until the bot’s answer is complete. It also handles clearing the chat history when the user sends a specific message such as "CLEAR_CHAT" in the Teams conversation, to refresh the conversation. WeatherAgent_AgentWorker: The tool-calling pipeline ( Agent Worker ) that interacts with the LLM. On each iteration, it takes the conversation messages ( system prompt, user query, and any accumulated dialogue history ) from the Driver. The flow of the Agent Worker for a Weather Agent: it defines what tools ( functions ) the LLM is allowed to call – in this case, the location and weather lookup tools; invokes the LLM via a Tool Calling Snap, passing it the current conversation and available function definitions; processes the LLM’s response – if the LLM requests a function call ( "get weather for London" ), the pipeline routes that request to the appropriate tool pipeline; once the tool returns data, formats the result using a Function Result Generator Snap and appends it to the conversation via a Message Appender Snap; and returns the updated conversation with any LLM answer or tool results back to the Driver.
The AgentWorker essentially handles one round of "LLM thinking". WeatherAgent_GetLocation: A tool that the agent can use to convert a user’s input location ( city name, etc. ) into a standardized form or coordinates ( latitude and longitude ). It queries an open-meteo API to retrieve latitude and longitude data based on the given location. The system prompt instructs the agent that if the tool returns more than one match, it should ask the user which location they meant, keeping a human-in-the-loop for such scenarios. For example, if the user requests weather for "Springfield", the agent first calls the GetLocation tool and if the tool responds with multiple locations, the agent will list them ( for example, Springfield, MA; Springfield, IL; Springfield, MO ) and ask the user to specify which location they meant before proceeding. Once the location is confirmed, the agent passes the coordinates to the GetWeather tool. WeatherAgent_GetWeather: The tool pipeline that actually fetches current weather data from an external API. This pipeline is invoked when the LLM decides it needs the weather info. It takes an input latitude and longitude and calls a weather API. In our case, I've used the open-meteo service, which returns a JSON containing weather details for a given location. The pipeline consists of an HTTP Client Snap ( configured to call the weather API endpoint with the location ) and a Mapper Snap to shape the API’s JSON response into the format expected by the Agent Worker pipeline. Once the data is retrieved ( temperature, conditions, etc. ), this pipeline’s output is fed back into the Agent Worker ( via the Function Result Generator ) so the LLM can use it to compose a user-friendly answer. MessageEndpoint_ChatHistory: This pipeline handles conversation history ( simple memory ) for each user or conversation. Because our agent may be used by multiple users ( and we want each user’s chat to be independent ), we maintain a user-specific chat history. In this example the pipeline uses SLDB storage to store the history file, but in a production environment the ChatHistory pipeline could use a database Snap to store chat history, keyed by user or conversation ID. Each time a new message comes in, the AgentDriver will call this pipeline to fetch recent context ( so the bot can "remember" what was said before ). This ensures continuity in the conversation – for example, if the user follows up with "What about tomorrow?", the bot can refer to the previous question’s context stored in history. For simplicity, one could also maintain context in-memory during a single conversation session, but persisting it via this pipeline allows context across multiple sessions or a longer pause. SnapLogic introduced specialized Snaps for LLM function calling to coordinate this process. The Function Generator Snap defines the available tools that the LLM agent can use ( a rough sketch of such a tool definition is shown after this paragraph ). The Tool Calling Snap sends the user’s query and function definitions to the LLM model and gets back either an answer or a function call request ( and any intermediate messages ). If a function call is requested, SnapLogic uses a Pipeline Execute or similar mechanism to run the corresponding pipeline. The Function Result Generator then formats the pipeline’s output into a form the LLM can understand. At the end, the Message Appender Snap adds the function result into the conversation history, so the LLM can take that into account in the next response.
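To illustrate what the Function Generator Snap contributes, a tool definition in the OpenAI function-calling style looks roughly like the sketch below. The tool name, parameter names, and descriptions here are assumptions for illustration; the actual definitions configured in the Weather Agent pipelines may differ:

```json
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Fetch the current weather for a location, given its latitude and longitude.",
    "parameters": {
      "type": "object",
      "properties": {
        "latitude": {
          "type": "number",
          "description": "Latitude of the location"
        },
        "longitude": {
          "type": "number",
          "description": "Longitude of the location"
        }
      },
      "required": [ "latitude", "longitude" ]
    }
  }
}
```

The Tool Calling Snap passes a list of such definitions to the LLM together with the conversation, which is what allows the model to respond with either plain text or a structured request to call one of these tools.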
This chain of Snaps allows the agent to decide between answering directly or using a tool, all within a no-code pipeline. Sample interaction from user prompt to answer To make the above more concrete, let’s walk through the flow of a sample interaction step by step: User asks a question in Teams: "What's the weather in San Francisco right now?" This message is sent from Teams to the Azure Bot Service, which relays it as an HTTP POST to our SnapLogic AgentDriver pipeline’s endpoint ( the messaging endpoint URL we will configure in the second part ). AgentDriver pipeline receives the message: The WeatherAgent_AgentDriver captures the incoming JSON payload from Teams. This payload contains the user’s message text and metadata ( like user ID, conversation ID, etc. ). The pipeline will first respond immediately with a typing indicator to Teams. We configured a small branch in the pipeline to output a "typing" activity message back to the Bot service, so that Teams shows the bot is typing, implemented mainly to enhance UX while the user waits for an answer. Preparing the prompt and context: The AgentDriver then prepares the initial prompt for the LLM. Typically, we include a system prompt ( defining the bot’s role/behavior ) and the user prompt. If we have prior conversation history ( from MessageEndpoint_ChatHistory for this user ), we would also include recent messages to give context. All this is packaged into a messages array that will be sent to the LLM. AgentDriver invokes AgentWorker via PipeLoop: The Driver uses a PipeLoop Snap configured to call the WeatherAgent_AgentWorker pipeline. It passes the prepared message payload as input. The PipeLoop is set with a stop condition based on the LLM’s response status – it will loop until the LLM indicates the conversation turn is completed or the iteration limit has been reached ( for example, OpenAI returns a finish_reason of "stop" when it has a final answer, or "function_call" when it wants to call a function ). AgentWorker ( 1st iteration - tool decision ): In this first iteration, the Worker pipeline receives the messages ( system + user ). Inside the Worker: A Function Generator Snap provides the definition of the GetLocation and GetWeather tools, including their name, description, and parameters. This tells the LLM what each tool does and how to call it. The Tool Calling Snap now sends the conversation ( so far just the user question and system role ) along with the available tool definitions to the LLM. The LLM evaluates the user’s request in the context of being a weather assistant. In our scenario, we expect it will decide it needs to use the tool to get the answer. Instead of replying with text, the LLM responds with a function call request ( a rough sketch of the message flow, including such a tool call, is shown at the end of this walkthrough ). The Tool Calling Snap outputs this structured decision. ( Under the hood, the Snap outputs it on a Tool Calls view when a function call is requested ). The pipeline splits into two parallel paths at this point: One path captures the LLM’s partial response ( which indicates a tool is being called ) and routes it into a Message Appender. This ensures that the conversation history now includes an assistant turn that is essentially a tool call. The other path takes the function call details and invokes the corresponding tool. In SnapLogic, we use a Pipeline Execute Snap to call the WeatherAgent_GetWeather pipeline.
We pass the location ( "San Francisco" ) from the LLM’s request into that pipeline as input ( careful, it is not a pipeline parameter ). WeatherAgent_GetWeather executes: This pipeline calls the external Weather API with the given location. It gets back weather data ( say the API returns that it’s 18°C and sunny ). The SnapLogic pipeline returns this data to the AgentWorker pipeline. On the next iteration, the messages array includes the tool call and its result ( see the sketch at the end of this walkthrough ). AgentWorker ( function result return ): With the weather data now in hand, a Function Result Generator Snap in the Worker takes the result and packages it in the format the LLM expects for a function result. Essentially, it creates the content that will be injected into the conversation as the function’s return value. The Message Appender Snap then adds this result to the conversation history array as a new assistant message ( but marked in a way that the LLM knows it’s the function’s output ). Now the Worker’s first iteration ends, and it outputs the updated messages array ( which now contains: the user’s question, the assistant’s "thinking/confirmation" message, and the raw weather data from the tool ). AgentDriver ( loop decision ): The Driver pipeline receives the output of the Worker’s iteration. Since the last LLM action was a function call ( not a final answer ), the stop condition is not met. Thus, the PipeLoop triggers the next iteration, sending the updated conversation ( which now includes the weather info ) back into the AgentWorker for another round. AgentWorker ( 2nd iteration - final answer ): In this iteration, the Worker pipeline again calls the Tool Calling Snap, but now the messages array includes the results of the weather function. The LLM gets to see the weather data that was fetched. Typically, the LLM will now complete the task by formulating a human-friendly answer. For example, it might respond: "It’s currently 18°C and sunny in San Francisco." This time, the LLM’s answer is a normal completion with no further function calls needed. The Tool Calling Snap outputs the assistant’s answer text and a finish_reason indicating completion ( like "stop" ). The Worker appends this answer to the message history and outputs the final messages payload. AgentDriver ( completion ): The Driver receives the final output from the Worker’s second iteration. The PipeLoop Snap sees that the LLM signaled no more steps ( finish condition met ), so it stops looping. Now the AgentDriver takes the final assistant message ( the weather answer ) and sends it as the bot’s response back to Teams via the HTTP response. The pipeline will extract just the answer text to return to the user. User sees the answer in Teams: The user’s Teams chat now displays the Weather Agent's reply, for example: "It’s currently 18°C and sunny in San Francisco." The conversation context ( question and answer ) can be stored via the ChatHistory pipeline for future reference. From the user’s perspective, they asked a question and the bot answered naturally, with only a brief delay during which they saw the bot "typing" indicator. Throughout this interaction, the typing indicator helps reassure the user that the agent is working on the request.
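To tie the walkthrough together, the conversation state that accumulates across these iterations looks roughly like the sketch below, using the OpenAI tool-calling message format. This is an illustrative sketch only: the tool name, IDs, argument values, and the exact structure produced by the Tool Calling and Message Appender Snaps are assumptions, not the literal SnapLogic output.

```json
[
  { "role": "system", "content": "You are a Weather Agent. Use the available tools to answer weather questions." },
  { "role": "user", "content": "What's the weather in San Francisco right now?" },
  {
    "role": "assistant",
    "content": null,
    "tool_calls": [
      {
        "id": "call_001",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"latitude\": 37.77, \"longitude\": -122.42}"
        }
      }
    ]
  },
  {
    "role": "tool",
    "tool_call_id": "call_001",
    "content": "{\"temperature_c\": 18, \"conditions\": \"sunny\"}"
  },
  {
    "role": "assistant",
    "content": "It's currently 18°C and sunny in San Francisco."
  }
]
```

The first assistant entry is the tool call captured by the Message Appender in the first iteration, the tool entry is the formatted result from the Function Result Generator, and the final assistant entry is the answer produced in the second iteration before it is sent back to Teams.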
The user-specific chat history ensures that if the user asks a follow-up like "How about tomorrow?", the agent could understand that "tomorrow" refers to the weather in San Francisco, continuing the context ( this would involve the LLM and pipelines using the stored history to know the city from the prior turn ). This completes the first part, which covered how the SnapLogic pipelines and AgentCreator framework enable an AI-powered chatbot to use tools and deliver real-time info. We saw how the Agent Driver + Worker architecture ( with iterative PipeLoop execution ) allows interactions where the LLM can call SnapLogic pipelines as functions. The no-code SnapLogic approach made it possible to integrate an LLM without writing custom code – we simply configured Snaps and pipelines. We now have a working AI Agent that we can use in SnapLogic; however, we are still missing the chatbot experience. In the second part, we’ll shift to the integration with Microsoft Teams and Azure, to see how this pipeline is exposed as a bot endpoint and what steps are needed to deploy it in a real chat environment.