Embeddings and Vector Databases
What are embeddings
Embeddings are numerical representations of real-world objects, like text, images, or audio. They are generated by machine learning models as vectors, arrays of numbers, where the distance between vectors can be seen as the degree of similarity between the objects. An embedding model may assign its own meaning to each of its dimensions, but there is no guarantee that different embedding models use their dimensions in the same way.
For example, the words "cat", "dog", and "apple" might be embedded into the following vectors:
cat -> (1, -1, 2)
dog -> (1.5, -1.5, 1.8)
apple -> (-1, 2, 0)
These vectors are made up to keep the example simple. Real vectors are much larger; see the Dimension section for details.
Visualizing these vectors as points in a 3D space, we can see that "cat" and "dog" are closer, while "apple" is positioned further away.
Figure 1. Vectors as points in a 3D space
By embedding words and contexts into vectors, we enable systems to assess how related two embedded items are to each other via vector comparison.

Dimension of embeddings
The dimension of embeddings refers to the length of the vector representing the object. In the previous example, we embedded each word into a 3-dimensional vector. However, a 3-dimensional embedding inevitably leads to a massive loss of information. In reality, word embeddings typically require hundreds or thousands of dimensions to capture the nuances of language. For example:
- OpenAI's text-embedding-ada-002 model outputs a 1536-dimensional vector
- Google Gemini's text-embedding-004 model outputs a 768-dimensional vector
- Amazon Titan's amazon.titan-embed-text-v2:0 model outputs a default 1024-dimensional vector
Figure 2. Using text-embedding-ada-002 to embed the sentence "I have a calico cat."
In short, an embedding is a vector that represents a real-world object, and the distance between these vectors indicates the similarity between the objects.

Limitation of embedding models
Embedding models are subject to a crucial limitation: the token limit, where a token can be a word, punctuation mark, or subword part. This constraint defines the maximum amount of text a model can process in a single input. For instance, the Amazon Titan Text Embeddings models can handle up to 8,192 tokens. When input text exceeds the limit, the model typically truncates it, discarding the remaining information. This can lead to a loss of context and diminished embedding quality, as crucial details might be omitted. Several strategies can help mitigate this:
- Text Summarization or Chunking: Long texts can be summarized or divided into smaller, manageable chunks before embedding.
- Model Selection: Different embedding models have varying token limits. Choosing a model with a higher limit can accommodate longer inputs.

What is a Vector Database
Vector databases are optimized for storing embeddings, enabling fast retrieval and similarity search. By calculating the similarity between the query vector and the other vectors in the database, the system returns the vectors with the highest similarity, indicating the most relevant content. The following diagram illustrates a vector database search: a query vector for "favorite sport" is compared to a set of stored vectors, each representing a text phrase, and the nearest neighbor, "I like football", is returned as the top result.
Figure 3. Vector Query Example
Figure 4. Store Vectors into Database
Figure 5. Retrieve Vectors from Database
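To make vector distance concrete, here is a minimal Python sketch (independent of any SnapLogic pipeline) that compares the made-up cat/dog/apple vectors from the example above using cosine similarity. Real embeddings from the models listed above behave the same way, just with far more dimensions.

```python
import math

# Toy three-dimensional "embeddings" from the example above.
vectors = {
    "cat": (1, -1, 2),
    "dog": (1.5, -1.5, 1.8),
    "apple": (-1, 2, 0),
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(vectors["cat"], vectors["dog"]))    # ~0.97, close to 1 -> similar
print(cosine_similarity(vectors["cat"], vectors["apple"]))  # ~-0.55, much lower -> dissimilar
```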
When working with vector databases, two key parameters come into play: Top K and the similarity measure (or distance function).

Top K
When querying a vector database, the goal is often to retrieve the most similar items to a given query vector. This is where the Top K concept comes into play. Top K refers to retrieving the top K most similar items based on a similarity metric. For instance, if you're building a product recommendation system, you might want to find the top 10 products similar to the one a user is currently viewing. In this case, K would be 10, and the vector database would return the 10 product vectors closest to the query product's vector.

Similarity Measures
To determine the similarity between vectors, various distance metrics are employed, including:
- Cosine Similarity: Measures the cosine of the angle between two vectors. It is often used for text-based applications as it captures semantic similarity well. A value closer to 1 indicates higher similarity.
- Euclidean Distance: Calculates the straight-line distance between two points in Euclidean space. It is sensitive to magnitude differences between vectors.
- Manhattan Distance: Also known as L1 distance, it calculates the sum of the absolute differences between corresponding elements of two vectors. It is less sensitive to outliers compared to Euclidean distance.
Figure 6. Similarity Measures
There are many other similarity measures not listed here. The choice of distance metric depends on the specific application and the nature of the data. It is recommended to experiment with various similarity metrics to see which one produces better results.

What embedders are supported in SnapLogic
As of October 2024, SnapLogic supports embedders for major models and continues to expand its support. Supported embedders include:
- Amazon Titan Embedder
- OpenAI Embedder
- Azure OpenAI Embedder
- Google Gemini Embedder

What vector databases are supported in SnapLogic
- Pinecone
- OpenSearch
- MongoDB
- Snowflake
- Postgres
- AlloyDB

Pipeline examples
Embed a text file
1. Read the file using the File Reader snap.
2. Convert the binary input to a document format using the Binary to Document snap, as all embedders require document input.
3. Embed the document using your chosen embedder snap.
Figure 7. Embed a File
Figure 8. Output of the Embedder Snap

Store a Vector
1. Utilize the JSON Generator snap to simulate a document as input, containing the original text to be stored in the vector database.
2. Vectorize the original text using the embedder snap.
3. Employ a Mapper snap to format the structure into the format required by Pinecone: the vector field is named "values", and the original text and other relevant data are placed in the "metadata" field.
4. Store the data in the vector database using the vector database's upsert/insert snap.
Figure 9. Store a Vector into Database
Figure 10. A Vector in the Pinecone Database

Retrieve Vectors
1. Utilize the JSON Generator snap to simulate the text to be queried.
2. Vectorize the original text using the embedder snap.
3. Employ a Mapper snap to format the structure into the format required by Pinecone, naming the query vector "vector".
4. Retrieve the top 1 vector, which is the nearest neighbor.
Figure 11. Retrieve Vectors from a Database
[ { "content" : "favorite sport" } ]
Figure 12. Query Text
Figure 13. All Vectors in the Database
{
  "matches": [
    {
      "id": "db873b4d-81d9-421c-9718-5a2c2bd9e720",
      "score": 0.547461033,
      "values": [],
      "metadata": { "content": "I like football." }
    }
  ]
}
Figure 14. Pipeline Output: the Closest Neighbor to the Query
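To illustrate what a Top K query does conceptually, here is a naive in-memory Python sketch. The stored vectors and their values are made up for illustration, a real system would produce them with one of the embedders above, and the result is shaped loosely like the Pinecone match output shown in the example.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# A tiny in-memory "vector database": id -> (embedding, metadata).
store = {
    "vec-1": ((0.9, 0.1, 0.3), {"content": "I like football."}),
    "vec-2": ((0.1, 0.8, 0.5), {"content": "My cat sleeps all day."}),
    "vec-3": ((0.7, 0.2, 0.4), {"content": "The match starts at noon."}),
}

def query(query_vector, top_k=1):
    scored = [
        {"id": vid, "score": cosine_similarity(query_vector, emb), "metadata": meta}
        for vid, (emb, meta) in store.items()
    ]
    # Top K = keep only the K nearest neighbors, highest similarity first.
    return sorted(scored, key=lambda m: m["score"], reverse=True)[:top_k]

# Pretend this is the embedding of the query text "favorite sport".
print(query((0.8, 0.15, 0.35), top_k=1))
```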
Embedders and vector databases are widely used in applications such as Retrieval-Augmented Generation (RAG) and building chat assistants.

Multimodal Embeddings
While the focus thus far has been on text embeddings, the concept extends beyond words and sentences. Multimodal embeddings represent a powerful advancement, enabling the representation of various data types, such as images, audio, and video, within a unified vector space. By projecting different modalities into a shared semantic space, complex relationships and interactions between these data types can be explored. For instance, an image of a cat and the word "cat" might be positioned closely together in a multimodal embedding space, reflecting their semantic similarity. This capability opens up a vast array of possibilities, including image search with text queries, video content understanding, and advanced recommendation systems that consider multiple data modalities.
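As a rough illustration of a shared text-image space, the sketch below uses the open-source sentence-transformers library with a CLIP checkpoint. The model name, the local image file, and the library choice are assumptions for illustration only, not something the article above prescribes.

```python
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# CLIP-style model that embeds images and text into the same vector space.
model = SentenceTransformer("clip-ViT-B-32")

image_embedding = model.encode(Image.open("cat.jpg"))  # assumed local image file
text_embeddings = model.encode(["a photo of a cat", "a bowl of apples"])

# The caption that actually describes the image should score highest.
print(util.cos_sim(image_embedding, text_embeddings))
```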
A Comparison of Assistant and Non-Assistant Tool Calling Pipelines

Introduction
At a high level, the logic behind assistant tool calling and non-assistant tool calling is fundamentally the same: the model instructs the user to call specific function(s) in order to answer the user's query. The user then executes the function and returns the result to the model, which uses it to generate an answer. This process is identical for both. However, since the assistant specifies the function definitions and access to tools as part of the Assistant configuration within the OpenAI or Azure OpenAI dashboard rather than within your pipelines, there are major differences in pipeline configuration. Additionally, submitting tool responses to an Assistant comes with significant changes and challenges, since the Assistant owns the conversational history rather than the pipeline. This article focuses on contrasting these differences. For a detailed understanding of assistant pipelines and non-assistant pipelines, please refer to the following articles:
- Non-assistant pipelines: Introducing Tool Calling Snaps and LLM Agent Pipelines
- Assistant pipelines: Introducing Assistant Tool Calling Pipelines

Part 1: Which System to Use: Non-Assistant or Assistant?
When to Use Non-Assistant Tool Calling Pipelines:
Non-Assistant Tool Calling Pipelines offer greater flexibility and control over the tool calling process, making them suitable for the following specific scenarios.
- When preferring a "run-time" approach: Non-Assistant pipelines exhibit greater flexibility in function definition, offering a more "runtime" approach. You can dynamically adjust the available functions by simply adding or removing Function Generator snaps within the pipeline. In contrast, Assistant Tool Calling Pipelines necessitate a "design-time" approach: all available functions must be pre-defined within the Assistant configuration, requiring modifications to the Assistant definition in the OpenAI/Azure OpenAI dashboard.
- When wanting detailed chat history: Non-Assistant pipelines provide a comprehensive history of the interaction between the model and the tools in the output message list. The message list within the Non-Assistant pipeline preserves every model response and the results of each function execution. This detailed logging allows for thorough debugging, analysis, and auditing of the tool calling process. In contrast, Assistant pipelines maintain a more concise message history, focusing on key steps and omitting some intermediate details. While this can simplify the overall view of the message list, it can also make it more difficult to trace the exact sequence of events or diagnose issues that may arise during tool execution in child pipelines.
- When needing easier debugging and iterative development: Non-Assistant pipelines facilitate more granular debugging and iterative development. You can easily simulate individual steps of the agent by making calls to the model with specific function call histories. This allows for more precise control and experimentation during development, enabling you to isolate and address issues more effectively. For example, by providing three messages, we can "force" the model to call the second tool, allowing us to inspect the tool calling process and its result against our expectations. In contrast, debugging and iterating with Assistant pipelines can be more cumbersome.
Since Assistants manage the conversation history internally, to simulate a specific step you often need to replay the entire interaction from the beginning, potentially requiring multiple iterations to reach the desired state. This internal management of history makes it less straightforward to isolate and debug specific parts of the interaction. To simulate calling the third tool, we need to start a new thread from scratch and then call tool1 and tool2, repeating the preceding process. The current thread cannot be reused.

When to Use Assistant Tool Calling Pipelines:
Assistant Tool Calling Pipelines offer a streamlined approach to integrating LLMs with external tools, prioritizing ease of use and built-in functionalities. Consider using Assistant pipelines in the following situations:
- For simplified pipeline design: Assistant pipelines reduce pipeline complexity by eliminating the need for Tool Generator snaps. In Non-Assistant pipelines, these snaps are essential for dynamically generating tool definitions within the pipeline itself. With Assistant pipelines, tool definitions are configured beforehand within the Assistant settings in the OpenAI/Azure OpenAI dashboard. This pre-configuration results in shorter, more manageable pipelines, simplifying development and maintenance.
- When leveraging built-in tools is required: If your use case requires functionalities like searching external files or executing code, Assistant pipelines offer these capabilities out of the box through their built-in File Search and Code Interpreter tools (see Part 5 for more details). These tools provide a convenient and efficient way to extend the LLM's capabilities without requiring custom implementation within the pipeline.

Part 2: A brief introduction to two pipelines
Non-assistant tool calling pipelines
Key points:
- Functions are defined in the worker.
- The worker pipeline's Tool Calling snap manages all model interactions.
- Function results are collected and sent to the model in the next iteration via the Tool Calling snap.
Assistant tool calling pipelines
Key points:
- No need to define functions in any pipeline. Functions are pre-defined in the assistant.
- Two snaps interact with the model: Create and Run Thread, and Submit Tool Outputs.
- Function results are collected and sent to the model immediately during the current iteration.

Part 3: Comparison between two pipelines
Here are two primary reasons why the assistant and non-assistant pipelines differ, listed in decreasing order of importance:
Distinct methods of submitting tool results: For non-assistant pipelines, tool results are appended to the message history list and subsequently forwarded to the model during the next iteration. Non-assistant pipelines exhibit a "while-loop" behavior: the worker interacts with the model at the beginning of the iteration, and while any tools need to be called, the worker executes those tools. In contrast, for assistants, tool results are sent to a dedicated endpoint designed to handle tool call results within the current iteration. Assistant pipelines operate more like a "do-while-loop": the driver initiates the interaction by sending the prompt to the model, then the worker executes the tool(s) first and interacts with the model at the end of the iteration to deliver the tool results.
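Outside SnapLogic, the "while-loop" pattern described above looks roughly like the following Python sketch against the OpenAI chat completions API. The model name and the get_weather stub are placeholders; in the pipelines above this loop is what the Tool Calling, Function Result Generator, and Message Appender snaps (driven by the PipeLoop) implement for you.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city):
    return f"18 C and sunny in {city}"  # stand-in for a real weather lookup

messages = [{"role": "user", "content": "What's the weather in San Francisco?"}]

while True:
    response = client.chat.completions.create(
        model="gpt-4.1-mini", messages=messages, tools=tools
    )
    choice = response.choices[0]
    messages.append(choice.message)  # keep the assistant turn in the history

    # Mirrors the non-assistant stop condition: $finish_reason != "tool_calls"
    if choice.finish_reason != "tool_calls":
        break

    # Execute each requested tool and append its result; the model only sees
    # the results on the next iteration of the loop.
    for call in choice.message.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_weather(**args),
        })

print(choice.message.content)
```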
Predefined and stored tool definitions for assistants: Unlike non-assistant pipelines, assistants have the capability to predefine and store function definitions. This eliminates the need for the three Function Generator snaps to repeatedly transmit tool definitions to the model with each request. Consequently, the worker pipeline for assistants appears shorter.

Due to the aforementioned differences, non-assistant pipelines have only one interaction point with the model, located in the worker. In contrast, assistant pipelines involve two interaction points: the driver sends the initial prompt to the model, while the worker sends tool results back to the model.

Part 4: Differences in snap settings
Stop condition of PipeLoop
A key difference in snap settings lies in the stop condition of the PipeLoop snap.
- Assistant pipeline's stop condition: $run.required_action == null
- Non-assistant pipeline's stop condition: $finish_reason != "tool_calls"
Assistant's output
Example when tool calls are required:
Example when tool calls are NOT required:
Non-assistant's output
Example when tool calls are required:
Example when tool calls are NOT required:
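For contrast with the earlier sketch, here is a hedged sketch of the assistant-side "do-while" loop using the OpenAI Assistants API (beta). It assumes an Assistant whose tools are already defined in the dashboard, which is exactly what the Create and Run Thread and Submit Tool Outputs snaps wrap, and the check on run.required_action is the Python counterpart of the $run.required_action == null stop condition above. The assistant ID and the get_weather stub are placeholders.

```python
import json
import time
from openai import OpenAI

client = OpenAI()
ASSISTANT_ID = "asst_..."  # placeholder: an Assistant configured in the dashboard

def get_weather(city):
    return f"18 C and sunny in {city}"  # stand-in for a real tool

run = client.beta.threads.create_and_run(
    assistant_id=ASSISTANT_ID,
    thread={"messages": [{"role": "user", "content": "What's the weather in San Francisco?"}]},
)

while True:
    run = client.beta.threads.runs.retrieve(thread_id=run.thread_id, run_id=run.id)
    if run.status == "requires_action":
        # Tools run first, and their outputs go back to the model within the same iteration.
        outputs = []
        for call in run.required_action.submit_tool_outputs.tool_calls:
            args = json.loads(call.function.arguments)
            outputs.append({"tool_call_id": call.id, "output": get_weather(**args)})
        run = client.beta.threads.runs.submit_tool_outputs(
            thread_id=run.thread_id, run_id=run.id, tool_outputs=outputs
        )
    elif run.status in ("completed", "failed", "cancelled", "expired"):
        break
    else:
        time.sleep(1)  # still queued or in progress

messages = client.beta.threads.messages.list(thread_id=run.thread_id)
print(messages.data[0].content[0].text.value)  # most recent message first
```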
Part 5: Assistant's two built-in tools
The assistant not only supports all functions that can be defined in non-assistant pipelines but also provides two special built-in tools, File Search and Code Interpreter, for user convenience. If the model determines that either of these tools is required, it will automatically call and execute the tool within the assistant without requiring manual user intervention. You don't need a tool calling pipeline to experiment with File Search and Code Interpreter; a simple Create and Run Thread snap is sufficient.

File Search
File Search augments the Assistant with knowledge from outside its model, such as proprietary product information or documents provided by your users. OpenAI automatically parses and chunks your documents, creates and stores the embeddings, and uses both vector and keyword search to retrieve relevant content to answer user queries.
Example
Prompt: What is the number of federal fires between 2018 and 2022?
The assistant's response is as below:
The assistant's response is correct, as the answer to the prompt is in the first row of a table on the first page of wildfire_stats.pdf, a document accessible to the assistant via a vector store.
Answer to the prompt:
The file is stored in a vector store used by the assistant:

Code Interpreter
Code Interpreter allows Assistants to write and run Python code in a sandboxed execution environment. This tool can process files with diverse data and formatting, and generate files with data and images of graphs. Code Interpreter allows your Assistant to run code iteratively to solve challenging code and math problems. When your Assistant writes code that fails to run, it can iterate on this code by attempting to run different code until the code execution succeeds.
Example
Prompt: Find the number of federal fires between 2018 and 2022 and use Matplotlib to draw a line chart. (Matplotlib is a Python library for creating plots.)
The assistant's response is as below:
From the response, we can see that the assistant indicated it used File Search to find the five years of data and then generated an image file. This file can be downloaded from the assistant's dashboard under storage-files. Simply add a file extension like .png to see the image.
Image file generated by assistant:

Part 6: Key Differences Summarized
- Function Definition
  Non-Assistant: Defined within the worker pipeline using Function Generator snaps.
  Assistant: Pre-defined and stored within the Assistant configuration in the OpenAI/Azure OpenAI dashboard.
- Tool Result Submission
  Non-Assistant: Appended to the message history and sent to the model in the next iteration.
  Assistant: Sent to a dedicated endpoint within the current iteration.
- Model Interaction Points
  Non-Assistant: One (in the worker pipeline).
  Assistant: Two (driver sends the initial prompt, worker sends tool results).
- Built-in Tools
  Non-Assistant: None.
  Assistant: File Search and Code Interpreter.
- Pipeline Complexity
  Non-Assistant: More complex pipeline structure due to function definition within the pipeline.
  Assistant: Simpler pipeline structure as functions are defined externally.
What is Retrieval-Augmented Generation (RAG)?
Retrieval-Augmented Generation (RAG) is the process of enhancing the reference data used by large language models (LLMs) by integrating them with traditional information retrieval systems. This hybrid approach allows LLMs to access and utilize external knowledge bases, databases, and other authoritative sources of information, thereby improving the accuracy, relevance, and currency of the generated responses without requiring extensive retraining. Without RAG, LLMs generate responses based only on the information they were trained on. With RAG, the response generation process is enriched by integrating external information into the generation.

How does Retrieval-Augmented Generation work?
Retrieval-Augmented Generation works by bringing together multiple systems or services to generate the prompt sent to the LLM. This means some setup is required so that the different systems and services can feed the appropriate data into a RAG workflow. It involves several key steps:
1. External Data Source Creation: External data refers to information outside the original training data of the LLM. This data can come from a variety of sources such as APIs, databases, document repositories, and web pages. The data is pre-processed and converted into numerical representations (embeddings) using embedding models, and then stored in a searchable vector database along with a reference to the data that was used to generate the embedding. This forms a knowledge library that can be used to augment a prompt when calling the LLM to generate a response to a given input.
2. Retrieval of Relevant Information: When a user inputs a query, it is embedded into a vector representation and matched against the entries in the vector database. The vector database retrieves the most relevant documents or data based on semantic similarity. For example, a query about company leave policies would retrieve both the general leave policy document and the specific role leave policies.
3. Augmentation of LLM Prompt: The retrieved information is then integrated into the prompt sent to the LLM using prompt engineering techniques. This fully formed prompt provides additional context and relevant data that enables the model to generate more accurate and contextually appropriate responses.
4. Generation of Response: The LLM processes the augmented prompt and generates a response that is coherent, contextually appropriate, and enriched with accurate, up-to-date information.
The following diagram illustrates the flow of data when using RAG with LLMs.

Why use Retrieval-Augmented Generation?
RAG addresses several inherent challenges of using LLMs by leveraging external data sources:
1. Enhanced Accuracy and Relevance: By accessing up-to-date and authoritative information, RAG ensures that the generated responses are accurate, specific, and relevant to the user's query. This is particularly important for applications requiring precise and current information, such as specific company details, release dates and release items, new features available for a product, individual product details, etc.
2. Cost-Effective Implementation: RAG enables organizations to enhance the performance of LLMs without the need for expensive and time-consuming fine-tuning or custom model training. By incorporating external knowledge libraries, RAG provides a more efficient way to update and expand the model's basis of knowledge.
3. Improved User Trust: With RAG, responses can include citations or references to the original sources of information, increasing transparency and trust. Users can verify the source of the information, which enhances the credibility and trustworthiness of an AI system.
4. Greater Developer Control: Developers can easily update and manage the external knowledge sources used by the LLM, allowing for flexible adaptation to changing requirements or specific domain needs. This control includes the ability to restrict sensitive information retrieval and ensure the correctness of generated responses. Doing this in conjunction with an evaluation framework can help roll out newer content more rapidly to downstream consumers.

SnapLogic GenAI App Builder: Building RAG with Ease
SnapLogic GenAI App Builder empowers business users to create large language model (LLM) powered solutions without requiring any coding skills. This tool provides the fastest path to developing generative enterprise applications by leveraging services from industry leaders such as OpenAI, Azure OpenAI, Amazon Bedrock, Anthropic Claude on AWS, and Google Gemini. Users can effortlessly create LLM applications and workflows using this robust platform. With SnapLogic GenAI App Builder, you can construct both an indexing pipeline and a Retrieval-Augmented Generation (RAG) pipeline with minimal effort.

Indexing Pipeline
This pipeline is designed to store the contents of a PDF file into a knowledge library, making the content readily accessible for future use.
Snaps used: File Reader, PDF Parser, Chunker, Amazon Titan Embedder, Mapper, OpenSearch Upsert.
After running this pipeline, we would be able to view these vectors in OpenSearch.

RAG Pipeline
This pipeline enables the creation of a chatbot capable of answering questions based on the information stored in the knowledge library.
Snaps used: HTTP Router, Amazon Titan Embedder, Mapper, OpenSearch Query, Amazon Bedrock Prompt Generator, Anthropic Claude on AWS Messages.
To implement these pipelines, the solution utilizes the Amazon Bedrock Snap Pack and the OpenSearch Snap Pack. However, users have the flexibility to employ other LLM and vector database Snaps to achieve similar functionality.
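Abstracting away the specific Snaps, the four RAG steps described earlier can be sketched in a few lines of Python. The toy vectors and the knowledge library entries are invented for illustration; in the pipelines above, the embedder, vector database, prompt generator, and chat Snaps play these roles, and a real system would embed the question with the same embedder used for indexing.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# 1. Indexing (done ahead of time): document chunks plus their embeddings,
#    stored in a knowledge library / vector database. Vectors here are made up.
knowledge_library = [
    {"text": "Employees receive 25 days of annual leave.", "vector": (0.9, 0.1, 0.2)},
    {"text": "Engineers may carry over 5 unused leave days.", "vector": (0.8, 0.3, 0.1)},
    {"text": "The cafeteria opens at 8 am.", "vector": (0.1, 0.9, 0.4)},
]

def build_rag_prompt(question, question_vector, top_k=2):
    # 2. Retrieval: find the chunks most similar to the query embedding.
    hits = sorted(knowledge_library,
                  key=lambda d: cosine(question_vector, d["vector"]),
                  reverse=True)[:top_k]
    # 3. Augmentation: build a prompt that contains the retrieved context.
    context = "\n".join(h["text"] for h in hits)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"

# 4. Generation: this augmented prompt is what gets sent to the LLM of your choice.
print(build_rag_prompt("How many leave days do I get?", (0.85, 0.2, 0.15)))
```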
Recipes for Success with SnapLogic's GenAI App Builder: From Integration to Automation

For this episode of the Enterprise Alchemists podcast, Guy and Dominic invited Aaron Kesler and Roger Sramkoski to join them to discuss why SnapLogic's GenAI App Builder is the key to success with AI projects. Aaron is the Senior Product Manager for all things AI at SnapLogic, and Roger is a Senior Technical Product Marketing Manager focused on AI. We kept things concrete, discussing real-world results that early adopters have already been able to deliver by using SnapLogic's integration capabilities to power their new AI-driven experiences.

GenAI App Builder Getting Started Series: Part 2 - Purchase Order Processing
👋 Welcome!
Hello everyone and welcome to our second guide in the GenAI App Builder Getting Started Series! First things first, GenAI App Builder is now generally available for all customers to purchase or test in SnapLabs. If you are a customer or partner who wants access to SnapLabs, please reach out to your Customer Success Manager and they can grant you access. If you are not yet a customer, you can check out our GenAI App Builder videos, then when you're ready to take the next step, request a demo with our sales team!

🤔 What is GenAI App Builder?
If you're coming here from Part 1, you may notice that GenAI Builder is now GenAI App Builder. Thank you to our customers who shared feedback on how we could improve the name to better align with the purpose. The original name had led to some confusion that its purpose was to train LLMs.

📑 Purchase Order Processing Example
In this example we will demonstrate how to use GenAI in a SnapLogic Pipeline to act like a function written in natural language that extracts information from a PDF. The slide below shows an example of how we use natural language to extract the required fields in JSON format, which would allow us to make this small pattern part of a larger app or data integration workflow.

✅ Prerequisites
In order to follow along with this guide, you will need the items below:
- Access to GenAI App Builder (in your company's organization or in SnapLabs)
- Your own API account with access to Azure OpenAI, OpenAI, or Amazon Bedrock Claude

⬆️ Import the pipeline
At the bottom of this post you will find several files if you want to just use a pattern to see this in action in your own environment and explore it further. If you are familiar with SnapLogic and want to build the Pipeline on your own, you can do that as well and just download the example PDF or try your own!
- PurchaseOrderExample.pdf
- InvoiceProcessing_CommunityArticlePipeline_2024_06_28.slp (zipped)
Once you are signed in to SnapLogic or SnapLabs you can start with the steps below to import the Pipeline:
1. In Designer, click the icon shown in the screenshot below to import the Pipeline
2. Select the file in the File Browser window that pops up
3. In the Add New Pipeline panel that opens, you can change the name and project location if desired
4. Press the Save button in the lower-right corner

🚧 Parsing the file
If you imported the pipeline using the steps above, then your pipeline should look like the one below. The steps below assume you imported the pipeline. If you are familiar enough with SnapLogic to build this on your own, you can drag the Snaps shown below to create the Pipeline and then follow along with us.
🔈 NOTE: The instructions here are completed with the Amazon Bedrock Prompt Generator and the Anthropic Claude on AWS Snaps for the last two Snaps in the Pipeline. You can swap these out for Azure OpenAI or OpenAI Snaps if you prefer to use those LLMs.
1. Click the File Reader Snap to open its settings
2. Click the icon at the far right of the File field as shown in the screenshot below
3. Click the Upload File button in the upper-right corner of the window that pops up
4. Select the PDF file from your file browser (download the file "PurchaseOrderExample.pdf" at the bottom of this post if you have not already)
5. Save and close the File Reader Snap once your file is selected
6. No edits are needed for the PDF Parser Snap, so we'll skip over that one
7. Click the Mapper Snap
8. Add $text in the Expression field and $context in the Target path field as shown below
9. Save and close the Mapper Snap
10. Click on the fourth Snap, the Prompt Generator Snap (we will demonstrate here with the Amazon Bedrock Prompt Generator Snap - you do not have to use Amazon Bedrock though; you can use any of the other LLM Prompt Generators we have, like Azure OpenAI, OpenAI, etc.)
11. Click the Edit Prompt button as shown in the screenshot below so we can modify the prompt used for the LLM
12. You should see a pre-generated prompt like the one below:
13. Copy the prompt below and replace the default prompt:
    Instruction: Your task is to pull out the company name, the date created, date shipped, invoice number, P.O. number, vendor from vendor details, recipient name from recipient details, subtotal, 'Shipping & handling', tax rate, sales tax, and total from the context below. Give the results back in JSON.
    Context: {{context}}
14. The Prompt Generator text should now look like the screenshot below:
15. Click the Ok button in the lower-right corner to save our prompt changes
16. Click on the last Snap, the Chat Completions Snap (we will demonstrate here with the Anthropic Claude on AWS Chat Completions Snap - you do not have to use Anthropic Claude on AWS though; you can use any of the other LLM Chat Completions Snaps we have, like Azure OpenAI, OpenAI, etc.)
17. Click the Account tab
18. Click Add Account; if you have an existing LLM account to use, you can select that here and skip to step 22 below
19. Select the type of account you want then press Continue - available options will depend on which LLM Chat Completions Snap you chose
20. Enter the required credentials for the LLM account you chose; here is an example of the Amazon Bedrock Account
21. Press the Apply button when done entering the credentials
22. Verify your account is now selected in the Account tab
23. Click on the Settings tab
24. Click the Suggest icon to the right of the Model name field as shown in the screenshot below and select the model you want to use
25. Type $prompt in the Prompt field as shown in the screenshot below
26. Expand the Model Parameters section by clicking on it (if you are using OpenAI or Azure OpenAI, you can leave Maximum Tokens blank; for Anthropic Claude on AWS you will need to increase Maximum Tokens from 200 to something higher; you can see where we set 50,000 below)
27. Save and close the Chat Completions Snap

🎬 Testing our example
At this point we are ready to test our Pipeline and observe the results! The screenshot below shows you where you can click to Validate the Pipeline, which should have every Snap turn green with preview output as shown below. If you have any errors or questions, please reply to share them with us!
Here is the JSON output after the Anthropic Claude on AWS Chat Completions Snap (note that other LLMs will have different API output structures):
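As a recap of what the Prompt Generator and Chat Completions Snaps are doing here, the sketch below shows the same idea in plain Python: substitute the parsed PDF text into the {{context}} placeholder, send the prompt to an LLM, and parse the JSON that comes back. The call_llm function and the returned field names are hypothetical stand-ins, since the actual call and output structure depend on the provider you configured.

```python
import json

PROMPT_TEMPLATE = """Instruction: Your task is to pull out the company name, the date created,
date shipped, invoice number, P.O. number, vendor from vendor details, recipient name from
recipient details, subtotal, 'Shipping & handling', tax rate, sales tax, and total from the
context below. Give the results back in JSON.

Context:
{context}
"""

def call_llm(prompt):
    """Hypothetical stand-in for the Chat Completions Snap / provider SDK call."""
    # For illustration only: pretend the model answered with valid JSON.
    return '{"company_name": "Acme Corp", "invoice_number": "INV-001", "total": "1234.56"}'

def extract_purchase_order(pdf_text):
    prompt = PROMPT_TEMPLATE.format(context=pdf_text)  # same role as the {{context}} variable
    raw_answer = call_llm(prompt)
    return json.loads(raw_answer)  # fails loudly if the model strays from JSON

print(extract_purchase_order("...text produced by the PDF Parser Snap..."))
```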
Extras!
Want to play with this further?
- Try adding a Copy Snap after the Mapper and sending the file to multiple LLMs at once, then review the results.
- Try changing {{context}} in the Prompt Generator to something else so you can drop the Mapper from the pipeline.

🏁 Wrapping up
Congratulations, you have now completed at least one GenAI App Builder integration in SnapLogic! 😎 Stay tuned to the SnapLabs channel here in the Integration Nation for more content on GenAI App Builder in the future! Please share any thoughts, comments, concerns, or feedback in a reply or DM RogerSramkoski!

Building an AI Agent with SnapLogic AgentCreator using OpenAI and Microsoft Teams - Part 2
Integrating the AI Agent with Microsoft Teams via Azure Bot Service
The first part covered the creation of our agent's architecture using SnapLogic pipelines and AgentCreator. Now we focus on connecting that pipeline to Microsoft Teams so end users can chat with it. This involves creating and configuring the Azure Bot Service as a bridge between Teams and our SnapLogic pipelines. We will walk through the prerequisites and setup.

Prerequisites for the Azure Bot Integration
To integrate the SnapLogic agent with Teams, ensure you have the following prerequisites in place:
- SnapLogic AgentCreator and pipelines: A SnapLogic environment where AgentCreator is enabled. The Weather Agent pipelines can be used as a working example. You'll also need to create the AgentDriver pipeline as a Triggered Task ( to obtain an endpoint URL accessible by the bot ).
- SnapLogic OAuth2 Account: An OAuth2 account which will be used in an HTTP Client to send the assistant response back to the user. It is also used to simulate the "typing" indicator in the Teams chat between tool calls.
- Microsoft 365 Tenant with Teams: Access to a Microsoft tenant where you have permission to register applications and upload custom Teams apps. You'll need a Teams environment to test the bot ( this could be a corporate tenant or a developer tenant ).
- Azure Subscription: An Azure account with an active subscription to create resources ( specifically, an Azure Bot Service ). Also, ensure you have the Azure Bot Channels Registration or Azure Bot resource creation rights.
- Azure AD App Registration: Credentials for the bot. We will register an application in Azure Active Directory to represent our bot ( this provides a Client ID and Client Secret that will be used by the Bot Service to authenticate ).
- Azure Bot Service resource: We will create an Azure Bot which will tie together the app registration and our messaging endpoint, and allow adding Teams as a channel.

Register an App in Azure AD for the Bot
The first step is to register an Azure AD application that will identify our bot and provide authentication to Azure Bot Service and Teams.
1. Create App registration: In the Azure Portal, navigate to Azure Active Directory > App Registrations and click "New registration". Give the app a name. For supported account types, you can choose "Accounts in this organizational directory only" ( single tenant ) for simplicity, since this bot is intended for your organization's Teams. You do not need to specify a Redirect URI for this scenario.
2. Finalize registration: Click Register to create the app. Once created, you'll see the Application ( Client ) ID – copy this ID, as we'll need it later as the Bot ID and in the OAuth2 account.
3. Create a client secret: In your new app's overview, go to Certificates & secrets. Click "New client secret" to generate a secret key. Give it a description and a suitable expiration period. After saving, copy the Value of the client secret ( it will be a long string ). Save this secret somewhere secure now – you won't be able to retrieve it again after you leave the page. We'll provide this secret to the Bot Service so it can authenticate as this app, and we will also use it in the OAuth2 account in SnapLogic.
4. Gather Tenant ID: Since we chose a single-tenant app, we'll also need the Azure AD tenant ID. You can find this in the Overview of the app. Copy the tenant ID as well for later use.
At this point, you should have:
- Client ID ( application ID ) for the bot and the OAuth2 account
- Client secret for the bot ( stored securely ) and the OAuth2 account
- Tenant ID of our Azure AD
These will be used when setting up the Azure Bot Service so that it knows about this app registration.

Create an OAuth2 account
Now that we have the client ID, client secret, and tenant ID gathered from the app registration, we can create the OAuth2 account which will be used in an HTTP Client snap that sends the "typing" indicator as well as the response from the agent.
1. Navigate to the "Manager" tab and locate your project folder where the agent pipelines are stored
2. On the right side, click on the "+" icon to create a new account
3. Choose "API Suite > OAuth2 Account"
4. Populate the client ID and client secret values from your app registration process
5. Check the 'Send client data as Basic Auth header' and 'Header authenticated' settings
6. Populate the authorization and token endpoints:
   - OAuth2 authorization endpoint: https://login.microsoftonline.com/<TENANT_ID>/oauth2/v2.0/authorize
   - OAuth2 token endpoint: https://login.microsoftonline.com/<TENANT_ID>/oauth2/v2.0/token
7. Change the "Grant type" to "client_credentials"
8. Add the scope to both "Token endpoint config" and "Authorization endpoint config"; the scope in our case is the following: https://api.botframework.com/.default
9. Check "Auto-refresh token"
10. Click "Authorize"; if everything was set correctly in the previous steps you should get redirected back to SnapLogic with a valid access token
Example of an already configured OAuth2 account
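Behind the scenes, this OAuth2 account performs a standard client-credentials token request against the endpoints above. The sketch below shows the equivalent call with Python's requests library, using placeholder tenant, client ID, and secret values; the 'Send client data as Basic Auth header' setting maps to HTTP Basic auth here.

```python
import requests

TENANT_ID = "<TENANT_ID>"          # placeholders: values from the app registration
CLIENT_ID = "<CLIENT_ID>"
CLIENT_SECRET = "<CLIENT_SECRET>"

token_url = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token"

response = requests.post(
    token_url,
    data={
        "grant_type": "client_credentials",
        "scope": "https://api.botframework.com/.default",
    },
    auth=(CLIENT_ID, CLIENT_SECRET),  # client credentials sent as a Basic Auth header
    timeout=30,
)
response.raise_for_status()
access_token = response.json()["access_token"]
print(access_token[:20], "...")  # the token SnapLogic auto-refreshes for you
```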
Create the Azure Bot Service and Connect to SnapLogic
With the Azure AD app ready, we can create the actual bot resource that will connect to Teams and our SnapLogic endpoint:
1. Add Azure Bot resource: In the Azure Portal, search for "Azure Bot" and select Azure Bot. Choose Create to make a new Bot resource.
2. Configure Bot Settings: On the creation form, fill in:
   - Bot handle: A unique name for your bot.
   - Subscription and Resource Group: Select your Azure subscription and a resource group to contain the bot resource.
   - Location: Pick a region.
   - Pricing tier: Choose the Free tier ( F0 ) – it's more than sufficient for development and basic usage.
   - Microsoft App ID: Here, reuse the existing App Registration we created. There should be an option to choose an existing app – provide the Client ID of the app registration. This links the bot resource to our AD app.
   - App type: Select Single Tenant since our app registration is single-tenant. You might also need to provide the App secret ( Client Secret ) for the bot here during creation.
3. Create the Bot: Click Review + create and then Create to provision the bot service. Azure will deploy the bot resource. Once completed, go to the resource's page.
4. Configure messaging endpoint: This is a crucial step – we must point the bot to our SnapLogic pipeline. In the Azure Bot resource settings, find the Settings menu and navigate to Configuration. Populate the field for Messaging endpoint ( the URL that the bot will call when a message is received ). Here, paste the Trigger URL of your WeatherAgent_AgentDriver pipeline. To get this URL: in SnapLogic, you would have already created the AgentDriver pipeline as a Triggered Task. That generates an endpoint URL:
   https://elastic.snaplogic.com/api/1/rest/slschedule/<org>/<proj>/WeatherAgent_AgentDriver
   Example endpoint with an appended authorization query param as bearer_token:
   https://elastic.snaplogic.com/api/1/rest/slschedule/myOrg/WeatherProject/WeatherAgent_AgentDriver?bearer_token=<bearer token>
   Enter the URL exactly as given by SnapLogic plus the Bearer token value, and save the configuration. Now, when a Teams user messages the bot, Azure will send an HTTPS POST to this SnapLogic URL.
5. Add Microsoft Teams channel: Still in the Azure Bot resource, go to Channels. Add a new channel and select Microsoft Teams. This step registers the bot with Teams so that Teams clients can use it.
Now our bot service is set up with the SnapLogic pipeline as its backend. The AgentDriver pipeline is effectively the bot's webhook. The Azure Bot resource handles authentication with Teams and will forward user messages to SnapLogic and relay SnapLogic's responses back to Teams.

Packaging the bot for Teams ( App manifest )
At this stage, the bot exists in Azure, but to use it in Teams we need to package it as a Teams app, especially if we want to share it within the organization. This involves creating a Teams app manifest and icons, then uploading it to Teams.
1. Prepare the Teams app manifest: The manifest is a JSON file describing your Teams app ( the bot ). Microsoft provides a schema for this, but you can download the manifest file from this example; make sure you replace the <APP ID> placeholders within it. The manifest file consists of:
   - App ID: Use the Bot's App ID ( Client ID of the registered app ).
   - App name, description: The name of the Teams app, for example "SnapLogic Agent".
   - Icons: Prepare two icon images for the bot – typically a color icon ( 192x192 PNG ) and an outline icon ( 32x32 PNG ). These will be used as the agent's avatar and in the Teams app catalog.
   The manifest may also include information like developer info, version number, etc. If using the Teams Developer Portal, it can guide you through filling these fields and will handle the JSON for you. Just ensure the Bot ID and scopes are correctly set.
2. Combine manifest and icons: Once your manifest file and icons are ready, put all three into a .zip file. For example, a zip containing:
   - manifest.json
   - icon-color.png ( 192x192 )
   - icon-outline.png ( 32x32 )
   Make sure the JSON inside the zip references the icon file names exactly as they are.
3. Upload the app to Teams: In Microsoft Teams, go to Apps > Manage your apps > Upload a custom app. Upload the zip file. Teams should recognize it as a new app. When added, it essentially registers the bot ID with the Teams client.
4. Test in Teams: Open a chat with your Weather Agent in Teams ( it should appear with the name and icon you provided ). Type a message, like "Hi" or a weather question: "What's the weather in New York?" The message will go out to Azure, which will call SnapLogic's endpoint. The SnapLogic pipelines will run through the logic ( as described in the first part ) and Azure will return the bot's reply to Teams. You should see the bot's answer appear in the chat. If you see the bot typing indicator first and then the answer, everything is working as expected!
Initial message and a response from the agent
Typing indicator as showcased during agent execution
Agent response after using the available tools
Now the Weather Agent is fully functional within Teams. It's essentially an AI-powered chat interface to a live weather API, all orchestrated by SnapLogic in the background.
Benefits of SnapLogic and Teams for Conversational Agentic Interfaces
Integrating SnapLogic AgentCreator with Microsoft Teams via Azure Bot Service has several benefits:
- Fast prototyping: You can go from idea to a working bot in a very short time. There's no need to write custom bot code or host a web service – SnapLogic pipelines become your bot logic. In our example, building a weather query bot is as simple as wiring up a few Snaps and APIs. This accelerates development and allows quick iteration. Business users or integration developers can prototype new AI agents rapidly, responding to evolving needs without a heavy software development cycle.
- No-code integration and simplicity: SnapLogic provides out-of-the-box connectors to hundreds of systems and services. By using SnapLogic as the engine, your bot can tap into any of these with minimal effort. Want a bot that not only gives weather but also looks up flight data or CRM info? It's just another pipeline. The AgentCreator framework handles the AI part, while the SnapLogic platform handles the integration part ( connecting to external APIs and data sources ). This synergy makes it simple to create powerful bots that perform real actions – far beyond what an LLM alone could do. And it's all done with low/no-code configuration.
- Enhanced user experience: Delivering automation through a conversational interface in Teams meets users where they already collaborate. There's no new app to learn – users simply chat with a bot as if they're chatting with a colleague.
- Reusability: The modular design of the pipelines in the Weather Agent can serve as a template for other agents by swapping out the tools and prompts; the integration pattern remains the same. This showcases the reusability of the AgentCreator approach across various use cases.

Conclusion
By combining SnapLogic's generative AI integration capabilities with Microsoft's bot framework and Teams, we created a powerful AI Agent without writing any code at all. We used SnapLogic AgentCreator snaps to handle the AI reasoning and tool calling, and used Azure Bot Service to connect that logic to Microsoft Teams. The real win is how quickly and easily this was achieved. In a matter of days or even hours, an enterprise can prototype a conversational AI agent that ties into live data and services. The speed of development, combined with secure integration into everyday platforms like Teams, delivers real business value. In summary, SnapLogic and Teams enable a new class of enterprise applications: ones that talk to you, using AI to bridge human requests to automated actions. The Weather Agent is a simple example, but it highlights how fast prototyping, integration simplicity, and enhanced user experience come together. I encourage you to try building your own SnapLogic Agent – whether it's for weather, workflows, or anything else – and unleash the power of conversational AI in your organization. Happy integrating, and don't forget your umbrella if the Weather Agent says rain is on the way!
Building an AI Agent with SnapLogic AgentCreator using OpenAI and Microsoft Teams
In this two-part blog series, I will cover how to create an AI Agent using SnapLogic's AgentCreator and integrate it with Microsoft Teams via Azure Bot Service. The solution combines SnapLogic's AgentCreator, OpenAI ( gpt 4.1 mini ), and Microsoft Teams ( to provide a familiar chat interface ). In the first part, we will cover building the agent with SnapLogic pipelines and the AgentCreator pattern. In the second part, I will explain the Azure setup and Teams integration, and highlight the business benefits of conversational automation.

Designing the SnapLogic-Powered AI Agent
The example I've decided to go with is a simple Weather Agent that provides a conversational interface for weather queries, accessible directly within Teams. This improves the user experience by integrating information into the tools people already use and showcases how SnapLogic's AgentCreator can automate tasks through natural language.

How does SnapLogic help?
SnapLogic's new AgentCreator framework allows us to build an AI-driven agent that uses LLM intelligence combined with SnapLogic pipelines to fetch real data. The Weather Agent understands a user's question, decides if it needs to call a function ( tool ), performs that action via a SnapLogic pipeline, and then responds conversationally with the result. SnapLogic AgentCreator is purpose-built for such scenarios, enabling enterprises to create AI agents that can call pipelines and APIs autonomously. In our case, the agent will use a weather API through SnapLogic to get live data, meaning the agent's answers are not just based on static knowledge but on real-time API calls.

SnapLogic AgentCreator Architecture Overview
We will focus on the AgentCreator pattern – a design that splits the agent's logic into two cooperative pipelines: an Agent Driver and an Agent Worker. This pattern is orchestrated by SnapLogic's Pipeline loop ( PipeLoop ) Snap, which allows iterative calls to a pipeline until a certain condition is met – in our case, until the conversation turn is complete or a set number of iterations has been reached. Here's how it works:
- Agent Driver pipeline: This orchestrator pipeline receives the incoming chat message and manages the overall conversation loop. It sends the user's query ( plus any chat history messages available ) and the system prompt to the Agent Worker pipeline using the PipeLoop Snap, and keeps iterating until the LLM signals that it's done responding or the iteration limit is reached.
- Agent Worker pipeline: This pipeline handles one iteration of LLM interaction. It presents the LLM with the conversation context and available tools, gets the LLM's response ( which could be an answer or a function call request ), executes any required tool, and returns the result back to the Driver. The Worker is essentially where the "brain" of the agent lives – it decides if a tool call is needed and formats the answer.
This architecture allows the agent to perform multi-turn reasoning. For example, if the user asks for weather, the LLM might first respond with a function call to get data, the Worker executes that call, and then the LLM produces a final answer in a second iteration. The PipeLoop Snap in the Driver pipeline detects whether another iteration is needed ( if the last LLM output was a partial result or tool request ) and loops again, or stops if the answer is complete.
Key components of the Weather Agent architecture:
- SnapLogic AgentCreator: The toolkit that makes this AI agent possible. It provides specialized Snaps for prompt handling, LLM integration ( OpenAI, Azure OpenAI, Amazon Bedrock, Google Gemini, etc. ), and function-calling logic. SnapLogic AgentCreator enables designing AI agents with dynamic iteration and tool usage built in.
- LLM ( Generative AI model ): The LLM powering the agent's understanding and response generation. In our implementation, an LLM ( such as OpenAI GPT ) interprets the user's request and decides when to call the available tools. SnapLogic's Tool Calling Snap interfaces with the LLM's API to get these decisions.
- Weather API: The external data source for live weather information. The agent uses a real API ( https://open-meteo.com/ ) to fetch current weather details for the requested location.
- Microsoft Teams & Azure Bot: This is the front-end interface where the user interacts with the bot, and the connector that sends messages between Teams and our SnapLogic pipelines.

Setting up an OpenAI API account
Because we are working with the gpt 4.1 mini API, we will need to configure an OpenAI account. This assumes you have already created an API key in your OpenAI dashboard.
1. Navigate to the Manager tab under your project folder location and click on the "+" button to create a new account
2. Navigate to OpenAI LLM -> OpenAI API Key Account
3. You can name it based on your needs or naming convention
4. Copy and paste your API key from the OpenAI dashboard
5. On the Agent Worker pipeline, open the "OpenAI Tool Calling" snap and apply the newly created account
6. Save the pipeline. You have now successfully integrated the OpenAI API.

Weather Agent pipelines in SnapLogic
I've built a set of SnapLogic pipelines to implement the Weather Agent logic using AgentCreator. Each pipeline has a specific role in the overall chatbot workflow:
- WeatherAgent_AgentDriver: The orchestrator for the agent. It is triggered by incoming HTTP requests from the Azure Bot Service ( when a user sends a Teams message ). The AgentDriver parses the incoming message, sends a quick "typing…" indicator to the user ( to simulate the bot typing ), and then uses a PipeLoop Snap to invoke the AgentWorker pipeline. It supplies the user's question, the system prompt, and any prior context, and keeps iterating until the bot's answer is complete. It also handles "deleting" the chat history if the user writes a specific message like "CLEAR_CHAT" in the Teams agent conversation to refresh the conversation.
- WeatherAgent_AgentWorker: The tool-calling pipeline ( Agent Worker ) that interacts with the LLM. On each iteration, it takes the conversation messages ( system prompt, user query, and any accumulated dialogue history ) from the Driver. The flow of the Agent Worker for a Weather Agent:
  - defines what tools ( functions ) the LLM is allowed to call – in this case, the location and weather lookup tools
  - invokes the LLM via a Tool Calling Snap, passing it the current conversation and available function definitions
  - processes the LLM's response – if the LLM requests a function call ( "get weather for London" ), the pipeline routes that request to the appropriate tool pipeline
  - once the tool returns data, the Worker formats the result using a Function Result Generator Snap and appends it to the conversation via a Message Appender Snap
  - returns the updated conversation with any LLM answer or tool results back to the Driver.
The AgentWorker essentially handles one round of "LLM thinking".
- WeatherAgent_GetLocation: A tool that the agent can use to convert a user's input location ( city name, etc. ) into a standardized form or coordinates ( latitude and longitude ). It queries an open-meteo API to retrieve latitude and longitude data based on the given location. The system prompt instructs the agent that if the tool returns more than one match, it should ask the user which location they meant, keeping a human in the loop for such scenarios. For example, if the user requests weather for "Springfield", the agent first calls the GetLocation tool, and if the tool responds with multiple locations, the agent will list them ( for example, Springfield, MA; Springfield, IL; Springfield, MO ) and ask the user to specify which location they meant before proceeding. Once the location is confirmed, the agent passes the coordinates to the GetWeather tool.
- WeatherAgent_GetWeather: The tool pipeline that actually fetches current weather data from an external API. This pipeline is invoked when the LLM decides it needs the weather info. It takes an input latitude and longitude and calls a weather API. In our case, I've used the open-meteo service, which returns a JSON containing weather details for a given location. The pipeline consists of an HTTP Client Snap ( configured to call the weather API endpoint with the location ) and a Mapper Snap to shape the API's JSON response into the format expected by the Agent Worker pipeline ( a sketch of the underlying open-meteo calls is shown after this section ). Once the data is retrieved ( temperature, conditions, etc. ), this pipeline's output is fed back into the Agent Worker ( via the Function Result Generator ) so the LLM can use it to compose a user-friendly answer.
- MessageEndpoint_ChatHistory: This pipeline handles conversation history ( simple memory ) for each user or conversation. Because our agent may be used by multiple users ( and we want each user's chat to be independent ), we maintain a user-specific chat history. In this example the pipeline uses SLDB storage to store the file, but in a production environment the ChatHistory pipeline could use a database Snap to store chat history, keyed by user or conversation ID. Each time a new message comes in, the AgentDriver will call this pipeline to fetch recent context ( so the bot can "remember" what was said before ). This ensures continuity in the conversation – for example, if the user follows up with "What about tomorrow?", the bot can refer to the previous question's context stored in history. For simplicity, one could also maintain context in-memory during a single conversation session, but persisting it via this pipeline allows context across multiple sessions or a longer pause.
SnapLogic introduced specialized Snaps for LLM function calling to coordinate this process. The Function Generator Snap defines the available tools that the LLM agent can use. The Tool Calling Snap sends the user's query and function definitions to the LLM model and gets back either an answer or a function call request ( and any intermediate messages ). If a function call is requested, SnapLogic uses a Pipeline Execute or similar mechanism to run the corresponding pipeline. The Function Result Generator then formats the pipeline's output into a form the LLM can understand. At the end, the Message Appender Snap adds the function result into the conversation history, so the LLM can take that into account in the next response. This chain of Snaps allows the agent to decide between answering directly or using a tool, all within a no-code pipeline.
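To show what the two tool pipelines do against open-meteo, here is a hedged Python sketch of the same two HTTP calls the GetLocation and GetWeather pipelines make with their HTTP Client Snaps. The exact query parameters and response fields used in the pipelines may differ, so treat this as an approximation of the idea rather than the pipelines themselves.

```python
import requests

def get_location(name, count=5):
    """Counterpart of WeatherAgent_GetLocation: resolve a place name to coordinates."""
    resp = requests.get(
        "https://geocoding-api.open-meteo.com/v1/search",
        params={"name": name, "count": count},
        timeout=30,
    )
    resp.raise_for_status()
    # May contain several matches (e.g. multiple Springfields); the agent asks
    # the user to pick one when that happens.
    return resp.json().get("results", [])

def get_weather(latitude, longitude):
    """Counterpart of WeatherAgent_GetWeather: fetch current conditions."""
    resp = requests.get(
        "https://api.open-meteo.com/v1/forecast",
        params={"latitude": latitude, "longitude": longitude, "current_weather": True},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["current_weather"]

matches = get_location("San Francisco")
if matches:
    place = matches[0]
    print(get_weather(place["latitude"], place["longitude"]))
```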
Sample interaction from user prompt to answer
To make the above more concrete, let's walk through the flow of a sample interaction step by step:
1. User asks a question in Teams: "What's the weather in San Francisco right now?" This message is sent from Teams to the Azure Bot Service, which relays it as an HTTP POST to our SnapLogic AgentDriver pipeline's endpoint ( the messaging endpoint URL we will configure in the second part ).
2. AgentDriver pipeline receives the message: The WeatherAgent_AgentDriver captures the incoming JSON payload from Teams. This payload contains the user's message text and metadata ( like user ID, conversation ID, etc. ). The pipeline will first respond immediately with a typing indicator to Teams. We configured a small branch in the pipeline to output a "typing" activity message back to the Bot Service, so that Teams shows the bot is typing – implemented mainly to enhance UX while the user waits for an answer.
3. Preparing the prompt and context: The AgentDriver then prepares the initial prompt for the LLM. Typically, we include a system prompt ( defining the bot's role/behavior ) and the user prompt. If we have prior conversation history ( from MessageEndpoint_ChatHistory for this user ), we would also include recent messages to give context. All this is packaged into a messages array that will be sent to the LLM.
4. AgentDriver invokes AgentWorker via PipeLoop: The Driver uses a PipeLoop Snap configured to call the WeatherAgent_AgentWorker pipeline. It passes the prepared message payload as input. The PipeLoop is set with a stop condition based on the LLM's response status – it will loop until the LLM indicates the conversation turn is completed or the iteration limit has been reached ( for example, OpenAI returns a finish_reason of "stop" when it has a final answer, or "function_call" when it wants to call a function ).
5. AgentWorker ( 1st iteration - tool decision ): In this first iteration, the Worker pipeline receives the messages ( system + user ). Inside the Worker:
   - A Function Generator Snap provides the definitions of the GetLocation and GetWeather tools, including their name, description, and parameters. This tells the LLM what each tool does and how to call it.
   - The Tool Calling Snap now sends the conversation ( so far just the user question and system role ) along with the available tool definitions to the LLM. The LLM evaluates the user's request in the context of being a weather assistant. In our scenario, we expect it will decide it needs to use a tool to get the answer. Instead of replying with text, the LLM responds with a function call request ( an illustrative version of such a message is shown in the sketch after this walkthrough ). The Tool Calling Snap outputs this structured decision. ( Under the hood, the Snap outputs it on a Tool Calls view when a function call is requested. ) The pipeline splits into two parallel paths at this point:
   - One path captures the LLM's partial response ( which indicates a tool is being called ) and routes it into a Message Appender. This ensures that the conversation history now includes an assistant turn that is essentially a tool call.
   - The other path takes the function call details and invokes the corresponding tool. In SnapLogic, we use a Pipeline Execute Snap to call the WeatherAgent_GetWeather pipeline.
We pass the location ( "San Francisco" ) from the LLM's request into that pipeline as input parameter ( careful, it is not a pipeline parameter ).

WeatherAgent_GetWeather executes: This pipeline calls the external Weather API with the given location. It gets back weather data ( say the API returns that it's 18°C and sunny ). The SnapLogic pipeline returns this data to the AgentWorker pipeline. On the next iteration, the messages array will also contain the tool call and its result ( an illustrative sketch of the full conversation history appears at the end of this walkthrough ).

AgentWorker ( function result return ): With the weather data now in hand, a Function Result Generator Snap in the Worker takes the result and packages it in the format the LLM expects for a function result. Essentially, it creates the content that will be injected into the conversation as the function's return value. The Message Appender Snap then adds this result to the conversation history array as a new assistant message ( but marked in a way that the LLM knows it's the function's output ). Now the Worker's first iteration ends, and it outputs the updated messages array ( which now contains: user's question, assistant's "thinking/confirmation" message, and the raw weather data from the tool ).

AgentDriver ( loop decision ): The Driver pipeline receives the output of the Worker's iteration. Since the last LLM action was a function call ( not a final answer ), the stop condition is not met. Thus, the PipeLoop triggers the next iteration, sending the updated conversation ( which now includes the weather info ) back into the AgentWorker for another round.

AgentWorker ( 2nd iteration - final answer ): In this iteration, the Worker pipeline again calls the Tool Calling Snap, but now the messages array includes the results of the weather function. The LLM gets to see the weather data that was fetched. Typically, the LLM will now complete the task by formulating a human-friendly answer. For example, it might respond: "It's currently 18°C and sunny in San Francisco." This time, the LLM's answer is a normal completion with no further function calls needed. The Tool Calling Snap outputs the assistant's answer text and a finish_reason indicating completion ( like "stop" ). The Worker appends this answer to the message history and outputs the final messages payload.

AgentDriver ( completion ): The Driver receives the final output from the Worker's second iteration. The PipeLoop Snap sees that the LLM signaled no more steps ( finish condition met ), so it stops looping. Now the AgentDriver takes the final assistant message ( the weather answer ) and sends it as the bot's response back to Teams via the HTTP response. The pipeline will extract just the answer text to return to the user.

User sees the answer in Teams: The user's Teams chat now displays the Weather Agent's reply, for example: "It's currently 18°C and sunny in San Francisco." The conversation context ( question and answer ) can be stored via the ChatHistory pipeline for future reference. From the user's perspective, they asked a question and the bot answered naturally, with only a brief delay during which they saw the bot "typing" indicator. Throughout this interaction, the typing indicator helps reassure the user that the agent is working on the request.
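To tie the iterations together, the conversation history assembled by the Message Appender over this exchange would end up looking roughly like the following. This is an illustrative sketch written as a Python literal; the exact roles and field names depend on the provider and Snap versions.

# Illustrative sketch of the assembled conversation history for this exchange.
conversation = [
    {"role": "system",    "content": "You are a helpful weather assistant."},
    {"role": "user",      "content": "What's the weather in San Francisco right now?"},
    {"role": "assistant", "content": None,               # 1st iteration: tool decision
     "tool_calls": [{"id": "call_1",
                     "name": "WeatherAgent_GetWeather",
                     "arguments": {"location": "San Francisco"}}]},
    {"role": "tool",      "tool_call_id": "call_1",      # result from the GetWeather pipeline
     "content": '{"temperature_c": 18, "conditions": "sunny"}'},
    {"role": "assistant",                                # 2nd iteration: final answer
     "content": "It's currently 18°C and sunny in San Francisco."},
]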
The user-specific chat history ensures that if the user asks a follow-up like "How about tomorrow?", the agent can understand that "tomorrow" refers to the weather in San Francisco, continuing the context ( this would involve the LLM and pipelines using the stored history to know the city from the prior turn ). This completes the first part, which covered how the SnapLogic pipelines and AgentCreator framework enable an AI-powered chatbot to use tools and deliver real-time info. We saw how the Agent Driver + Worker architecture ( with iterative PipeLoop execution ) allows interactions where the LLM can call SnapLogic pipelines as functions. The no-code SnapLogic approach made it possible to integrate an LLM without writing custom code – we simply configured Snaps and pipelines. We now have a working AI Agent that we can use in SnapLogic; however, we are still missing the chatbot experience. In the second part, we'll shift to the integration with Microsoft Teams and Azure, to see how this pipeline is exposed as a bot endpoint and what steps are needed to deploy it in a real chat environment.
Best Practices for Adopting AI Solutions in the Enterprise with SnapLogic AgentCreator

Version: 1.2 Authors: Dominic Wellington, Guy Murphy, Pat Traynor, Bash Badawi, Ram Bysani, David Dellsperger, Aaron Kesler

Introduction: AI in the Modern Enterprise

AI is fast becoming a cornerstone of modern enterprises, transforming how businesses operate, make decisions, and interact with customers. Its capabilities, such as automation, predictive analytics, and natural language processing, allow companies to streamline processes, gain deeper insights from data, and enhance customer experiences. From optimizing supply chains to personalizing marketing strategies, AI is enabling enterprises to innovate, drive efficiency, and be competitive in an increasingly data-driven world. As AI continues to evolve, its role in shaping business strategy and operations will only grow. Precisely because of its novelty and importance, leaders will need to think carefully about various aspects of how these powerful new capabilities can be deployed in a manner that is compliant with existing legislation and regulation, and how best to integrate them with existing systems and processes. There are no generally-accepted best practices in this field yet due to its novelty, but there are lessons that we can learn from past waves of technological change and adoption. In this document we set out some suggestions for how to think about these topics in order to ensure a positive outcome.

Data

Data is the lifeblood of IT — arguably the reason for the field's entire existence — but its importance is only magnified when it comes to AI. Securing access to data is a requirement for an AI project to get off the ground in the first place, but managing that access over time is just as important, especially as both the data and the policies that apply to it change and evolve.

Data Security and Management

Data security has always been a complex issue for IT organizations. When considered from the perspective of AI adoption, two main areas need to be considered: external and internal usage. Externally-hosted Large Language Models (LLMs) offer powerful and rapidly-evolving capabilities, but also have inherent risks as they are operated by third parties. The second area of focus is how and what internal data should be used with AI models, whether self-managed or externally operated.

External Security

Organizations have reasonable concerns about their proprietary, regulated, or otherwise sensitive information "leaking" beyond the organizational boundaries. For this reason, simply sending internal information to a public LLM without any controls in place is not considered a viable solution. An approach to this problem that was previously considered promising was to use a technique called Retrieval Augmented Generation, or RAG. In this approach, rather than passing user queries directly to an LLM for answers, a specialized data store is deployed, called a vector database. When a user query is received, the vector data store is consulted first to identify relevant chunks of information with which to answer the query, and only after this step is the LLM used to provide the conversational response back to the user. However, while RAG does limit the potential for information leakage, it does not reduce it to zero.
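A minimal sketch of that flow, assuming placeholder stubs for the embedding model, the vector database, and the LLM ( none of these are real APIs ), makes the boundary explicit: only the retrieved chunks and the question are sent to the externally hosted model, not the whole knowledge base.

# Minimal RAG sketch. The three stubs stand in for the embedding model,
# the vector database, and the LLM; a real deployment would call those services.
def embed(text):
    return [float(len(text))]                 # stub embedding

def vector_search(vector, top_k):
    return [{"text": "Payday is the 15th and the last day of the month."}][:top_k]

def call_llm(prompt):
    return "Employees are paid on the 15th and the last day of the month."

def answer_with_rag(question, top_k=3):
    query_vector = embed(question)                      # embed the user query
    chunks = vector_search(query_vector, top_k=top_k)   # consult the vector store first
    context = "\n".join(c["text"] for c in chunks)
    # Only the retrieved chunks and the question leave the boundary here,
    # not the full knowledge base (though repeated queries can still expose
    # a substantial part of it over time, as noted above).
    prompt = ("Answer the question using only the context below.\n"
              f"Context:\n{context}\n\nQuestion: {question}")
    return call_llm(prompt)

print(answer_with_rag("When do I get paid?"))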
The vector database can be operated according to the organization's own risk profile: fully in-house, as a private cloud instance, or leveraging a shared platform, depending on the information it contains and the policies or regulations that apply to that information. However, a chunk of information will be sent from the vector store to the LLM to answer each query, and over time, this process can be expected to expose a substantial part of the knowledge base to the LLM. It is also important to be aware that the chunking process itself still uses an LLM. More security-sensitive organizations or those operating in regulated industries may choose to leverage a more restricted deployment model for the LLM as well, much as discussed for the vector database itself, in order to avoid this leakage. However, it is worth noting that while an "open-source" language model can be prevented from contributing training data back to its developers, its own pre-existing training data may still leak out into the answers. The ultimate risk here is of "model poisoning" from open-source models. That is, injection of data from outside the user's domain which may lead to inconsistent or undesirable responses. One example of this phenomenon is "context collapse", which may occur in the case of overloaded acronyms, where the same acronym can represent vastly different concepts in different domains. A generalist model may misunderstand or misrepresent the acronym — or worse, may do so inconsistently. The only way to be entirely certain of data security and hygiene is to train the model from scratch — an undertaking that, due to its cost in both time and resources, is practical only for the largest organisations, and is anyway required only for the most sensitive data sets. A halfway house that is suitable for organisations that have concerns in this domain, but not to the point of being willing to engineer everything themselves from the ground up, is fine-tuning. In this approach, a pre-trained model is further trained on a specific data set. This is a form of transfer learning where a pre-trained model trained on a large dataset is adapted to work for a specific task. The dataset required for this sort of fine-tuning is very small compared to the dataset required for full model training, bringing this approach within reach of far more organisations.

Internal Data Access Controls

The data that is consumed by the AI model also needs to be secured inside the organization, ensuring that access controls on that data follow the data through the system at all levels. It is all too easy to focus on ingesting the data and forget about the metadata, such as role-based access controls. Instead, these controls should be maintained throughout the AI-enabled system. Any role-based access controls (RBAC) that are placed on the input data should also be reflected in the output data. Agentic approaches are useful here, as they give the opportunity to enforce such controls at various points. The baseline should be that, if a user ought not be able to access certain information through traditional means such as database queries or direct filesystem access, they also must not be able to access it by querying an AI overlay over those systems — and vice-versa, of course.

Prompt Logging and Observability

An emerging area of concern is the security of prompts used with AI models. Especially when using public or unmodified open-source models, the primary input to the models is the prompt that is passed to them.
Even minor changes to that prompt can cause major differences in what is returned by the model. For this reason, baseline best practice is to ensure that prompts are backed up and versioned, just as would be done for more traditional program code. In addition, both prompts and their corresponding responses should be logged in order to be able to identify and troubleshoot issues such as performance changes or impact to pricing models of public LLMs. Some more detailed suggestions are available here.

Prompts should also be secured against unauthorized modification, or "prompt injection". Similar to the analogous "SQL injection" attack, attackers may attempt to modify or replace the prompt before it is passed to the AI model, in order to produce outputs that are different from those expected and desired by users and operators of the system. The potential for damage increases further in the case of agentic systems that may chain multiple model prompts together, and potentially even take actions in response to those prompts. Again, logging for both in-the-moment observability and later audit is important here, including the actual final prompt that was sent to the model, especially when that has been assembled across multiple steps. These logs are useful for troubleshooting, but may also be formally required for demonstrating compliance with regulation or legislation.

Example Prompt Injection Scenarios

Direct Injection: An attacker injects a prompt into a customer support chatbot, instructing it to ignore previous guidelines, query private data stores, and send emails, leading to unauthorized access and privilege escalation.
Indirect Injection: A user employs an LLM to summarize a webpage containing hidden instructions that cause the LLM to insert an image linking to a URL, leading to exfiltration of the private conversation.
Unintentional Injection: A company includes an instruction in a job description to identify AI-generated applications. An applicant, unaware of this instruction, uses an LLM to optimize their resume, inadvertently triggering the AI detection.
Intentional Model Influence: An attacker modifies a document in a repository used by a Retrieval-Augmented Generation (RAG) application. When a user's query returns the modified content, the malicious instructions alter the LLM's output, generating misleading results.
Code Injection: An attacker exploits a vulnerability in an LLM-powered email assistant to inject malicious commands, allowing access to sensitive information and manipulation of email content.
Payload Splitting: An attacker uploads a resume with split malicious prompts. When an LLM is used to evaluate the candidate, the combined prompts manipulate the model's response, resulting in a positive recommendation regardless of the resume's actual contents.
Multimodal Injection: An attacker embeds a malicious prompt within an image that accompanies benign text. When a multimodal AI processes the image and text concurrently, the hidden prompt alters the model's behavior, potentially leading to unauthorized actions or disclosure of sensitive information.
Adversarial Suffix: An attacker appends a seemingly meaningless string of characters to a prompt, which influences the LLM's output in a malicious way, bypassing safety measures.
Multilingual/Obfuscated Attack: An attacker uses multiple languages or encodes malicious instructions (e.g., using Base64 or emojis) to evade filters and manipulate the LLM's behavior.

Reference: https://genai.owasp.org/llmrisk/llm01-prompt-injection/
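As a rough illustration of the baseline practice described above, the following sketch logs every prompt/response pair together with a prompt-template version and a hash of the final assembled prompt. The storage format and field names are assumptions for illustration, not a SnapLogic or OWASP specification.

import hashlib, json, time

PROMPT_TEMPLATE_VERSION = "2024-10-01"   # assumed convention: templates versioned like code

def log_llm_call(log_file, model, final_prompt, response, user_id):
    """Append one prompt/response record for later audit and troubleshooting."""
    record = {
        "timestamp": time.time(),
        "user_id": user_id,
        "model": model,
        "prompt_template_version": PROMPT_TEMPLATE_VERSION,
        # The hash makes unexpected prompt changes (drift or injection) easy to spot
        # without necessarily copying sensitive text into every downstream system.
        "prompt_sha256": hashlib.sha256(final_prompt.encode()).hexdigest(),
        "final_prompt": final_prompt,        # the assembled prompt actually sent
        "response": response,
    }
    with open(log_file, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log_llm_call("llm_audit.jsonl", "gpt-4o",
             "System: ... User: When do I get paid?",
             "Employees are paid on the 15th ...", user_id="u-123")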
As these examples show, there are many patterns and return sets from LLMs that will need to be managed and observed, comparing prompts, responses, and data sets with certified sets and expected structures. Hopefully, over time and as commercial LLMs mature, many of these issues will be managed by the LLMs themselves, but today these concerns will have to be part of the enterprise's own governance framework for AI adoption.

Data Ownership and Observability

Much of the value of most Generative AI (GenAI) applications is based on the quantity, freshness and reliability of the source data that is provided. An otherwise fully-functional GenAI tool that provides responses based on incomplete or out-of-date data will not be useful or valuable to its users. The first question is simply how to gain access to useful source data, and to maintain that access in the future. This work spans both technical and policy aspects. SnapLogic of course makes technical connectivity easy, but there may still be questions of ownership and compliance, not to mention identifying where necessary data even resides. Beyond the initial setup of the AI-enabled system, it will be important to maintain ongoing access to up-to-date data. For instance, if a RAG approach is used, the vector data store will need to be refreshed periodically from the transactional data platform. The frequency of such updates will vary between use cases, depending on the nature of the data and its natural rate of change. For instance, a list of frequently asked questions, or FAQs, can be updated whenever a new entry is added to the list. Meanwhile, a data set that is updated in real time, such as airline operations, will need much more frequent synchronization if it is to remain useful.

Recommendations

Data is key to the success of AI-enabled systems – and not just one-time access to a dataset that is a point-in-time snapshot, but ongoing access to real-time data. Fortunately, these are not new concerns, and existing tools and techniques can be applied readily to securing and managing that flow of data. In fact, the prominence and urgency of AI projects can even facilitate the broad deployment of such tools and techniques, where they had previously been relegated to specialised domains of data and analytics. It is important to note that as the SnapLogic platform facilitates connectivity and movement of data, none of the patterns of such movement are used to enrich the learnings of an LLM. In other words, pipelines act to transport encrypted data from source to destination without any discernment of the actual payload. No payload data and no business logic governing the movement of data are ever gleaned from such movement or used to train any models. In fact, the SnapLogic platform can be used to enhance data security at source and destination as highlighted above, adding guardrails to an AI system to enforce policies against publication of sensitive or otherwise restricted data. In general, it is recommended for domain experts, technical practitioners, and other stakeholders to work together and analyze each proposed use case for AI, avoiding both reflexive refusals and blind enthusiasm, focusing instead on business benefit and how to achieve that in a specific regulatory or policy context.

Auditability and Data Lineage

The ability to audit the output of AI models is a critical requirement, whether for routine debugging, or in response to incoming regulation (e.g.
the EU AI Act) that may require auditability for the deployment of AI technology in certain sectors or for particular use cases. For instance, use of AI models for decision support in legal cases or regulated industries, especially concerning health and welfare, may be subject to legal challenges, leading to requests to audit particular responses that were generated by the model. Commercial and legal concerns may also apply when it comes to use cases that may impinge on IP protection law. Complete forensic auditability of the sort that is provided by traditional software is not possible for LLMs, due to their non-deterministic nature. For this reason, deterministic systems may still be preferable in certain highly-regulated spaces, purely to satisfy this demand. However, a weaker definition of auditability is becoming accepted when it comes to LLMs, where both inputs and outputs are preserved, and the model is required to provide the source information used to generate that output. The source data is considered important both to evaluate the factual correctness of the answer, and also to identify any bias which may make its way into the model from its source data. These factors make auditability and data lineage a critical part of the overall AI Strategy, which will have to be applied at various different stages of the solution lifecycle:

Model creation and training - This aspect relates to how the model was created, and whether the data sets used to train the model have a risk of either skewing over time, or exposing proprietary information used during model development.

Model selection - The precise version of AI model that was used to generate a response will need to be tracked, as even different versions of the same model may produce different responses to the same prompt. For this reason it is important to document the moment of any change in order to be able to track and debug any drift in response or behaviour. For external third-party AI services, these models may need to be tested and profiled as part of both an initial selection process and ongoing validation. The reality is that there is no single AI model that is best for all use cases. Experience from real-world deployment of actual AI projects shows that some models are noticeably better for some functions than others, as well as having sometimes radically different cost profiles. These factors mean that most probably several different AI models will be used across an enterprise, and even (as agents) to satisfy a single use case.

Prompt engineering - Unlike in traditional software development, by its very nature, prompt engineering includes the data sets and structures in the development cycle in a way that traditional functional coding practices do not. The models' responses are less predictable, so understanding how the data will be processed is an integral part of the prompt engineering lifecycle. How and why a set of prompts is put into production will be driven by both the desired functionality and the data that will be provided, and these inputs need to be documented so that they can be reviewed if issues arise in the production environment.

Prompt evaluation in production - All critical systems today should have robust logging and auditing processes. In reality, however, many enterprises rarely achieve universal deployment, and coverage is often inconsistent. Due to the nature of AI systems, and notably LLMs, there will be a critical need to audit the data inputs and outputs.
The model is effectively a black box that is not available for operators to reconstruct exactly why a given response was provided. This issue is especially critical when multiple AI models, agents, and systems are chained together or networked within a wider process. Logging the precise inputs that are sent to the model will be key for all these purposes. For custom-trained models, these requirements may also extend to the training data — although this case is presumed to remain relatively rare for the foreseeable future, given the prohibitive costs of performing such training. Where more common approaches (RAG, fine-tuning) are used that do not require an entire model to be trained from scratch, the audit would naturally focus on the inputs to the model and how those are managed. In both these cases, good information hygiene should be maintained, including preservation of historical data for point-in-time auditability. Backing up the data inputs is necessary but not sufficient: after all, a different LLM (or a subsequent version of the same LLM) may provide different responses based on the same prompt and data set. Therefore, if a self-trained LLM is employed, that model should also be backed up in the same way as the data that feeds it. If a public LLM is used, rigorous documentation should be maintained identifying any changes or version upgrades to the external model. All of this work is in addition to the tracking of the prompts and data inputs themselves, as described previously. All of these backups will in turn need to be preserved according to whatever evidentiary concerns are expected to apply. In the case of simple technical audits to ensure continuous improvements and avoid downward pressure on the quality of responses provided, organizations can make their own determination on the level of detail, the width of the time window to be preserved, and the granularity of the data. In more highly regulated scenarios, some or all of these elements may be mandated by outside parties. In those situations, the recommendation would also generally be to specify the backup policy defensively, to avoid any negative impacts in the case of future challenges.

Development and Architecture Best Practices

While AI systems have notable differences from earlier systems, they are still founded in large part on pre-existing components and techniques, and many existing best practices will still apply, if suitably modified and updated.

CI/CD

Continuous Integration and Continuous Deployment (CI/CD) is of course not specific to Generative AI. However, as GenAI projects move from demo to production, and then evolve over subsequent releases, it becomes necessary to consider them as part of that process. Many components of a GenAI application are stateful, and the relationship between them can also be complex. A roll-back of a vector data store used to support a RAG application may have unforeseen effects if the LLM powering that RAG application remains at a different point of the configuration timeline. Therefore the different components of an AI-enabled system should be considered as tightly coupled for development purposes, as otherwise the GenAI component risks never becoming a fully-fledged part of the wider application environment.
In particular, all of the traditional CI/CD concepts should apply also to the GenAI component:

Continuous Development
Continuous Testing
Continuous Integration
Continuous Deployment
Continuous Monitoring

Ensuring the inclusion of development teams in the process is unlikely to be a problem, as the field of AI is still evolving at breakneck pace. However, some of the later stages of an application's lifecycle are often not part of the worldview of the developers of early demo AI applications, and so may be overlooked in the initial phases of productization of GenAI functionality. All of these phases also have specific aspects that should be considered when it comes to their application to GenAI, so they cannot simply be integrated into existing processes, systems, or modes of thought. New aspects of development and DevOps are needed to support this: notably, prompt engineering will have to be treated as a code artifact, and prompts will also have to be associated with metadata such as model version, test data sets, and samples of return data, so that consistent functional management of the combined set of capabilities can be understood and tracked over time.

QA and Testing

Quality Assurance (QA) and testing strategies for AI projects, particularly GenAI, must address challenges that differ significantly from traditional IT projects. Unlike traditional systems where output is deterministic and follows predefined rules, GenAI systems are probabilistic and rely on complex models trained on vast datasets. A robust QA strategy for GenAI must incorporate dynamic testing of outputs for quality, coherence, and appropriateness across a variety of scenarios. This involves employing both automated testing frameworks and human evaluators to assess the AI's ability to understand prompts and generate contextually accurate responses, while also mitigating risks such as bias, misinformation, or harmful outputs. A GenAI testing framework should include unique approaches like model evaluation using synthetic and real-world data, stress testing for edge cases, and adversarial testing to uncover vulnerabilities such as the attack scenarios listed above. Frameworks such as CI/CD are essential but need to be adapted to accommodate iterative model training and retraining processes. Tools like Explainable AI (XAI) help provide transparency into model decisions, aiding in debugging and improving user trust. Additionally, feedback loops from production environments become vital in fine-tuning the model, enabling ongoing improvement based on real-world performance metrics rather than static, pre-defined test cases. However, depending on the use case and the data provided, such fine-tuning based on user behaviour may itself be sensitive and need to be managed with care. The QA process for GenAI also emphasizes ethical considerations and regulatory compliance more prominently than traditional IT projects. Testing needs to go beyond technical correctness to assess social impact, ensuring that the system avoids perpetuating harmful bias or misinformation. Continuous monitoring after deployment is crucial, as model performance can degrade over time due to shifting data distributions. This contrasts with traditional IT projects, where testing is often a finite phase before deployment. In GenAI, QA is an evolving, lifecycle-long endeavor requiring multidisciplinary collaboration among data scientists, ethicists, domain experts, and software engineers to address the complex, dynamic nature of generative models.
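To make the automated portion of such a strategy concrete, here is a small sketch of a regression check over a set of "golden" prompts using pytest. The ask_genai_app function is a hypothetical stand-in for the application under test, and the assertions are deliberately simple; a real suite would add human review, adversarial cases, and model-based scoring.

import pytest

def ask_genai_app(question):
    # Hypothetical stand-in for the GenAI application under test.
    return ("Employees are paid on the 15th and the last day of the month. "
            "[Employee Handbook.pdf]")

# Golden prompts with facts the answer must contain and sources it must cite.
GOLDEN_CASES = [
    {"question": "When do I get paid?",
     "must_contain": ["15th", "last day of the month"],
     "expected_source": "Employee Handbook.pdf"},
]

@pytest.mark.parametrize("case", GOLDEN_CASES)
def test_golden_answers(case):
    answer = ask_genai_app(case["question"])
    for fact in case["must_contain"]:          # factual grounding check
        assert fact in answer
    assert case["expected_source"] in answer   # source attribution check
    assert len(answer) < 2000                  # guard against runaway output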
Grounding, as an example, is a technique that can be used to help produce model responses that are more trustworthy, helpful, and factual. Grounding generative AI model responses means connecting them to verifiable sources of information. Implementing grounding usually means retrieving relevant source data; the recommended best practice is to use the retrieval-augmented generation (RAG) technique. Other test concepts include:

Human-in-the-Loop Testing: Involves human evaluators judging the quality, relevance, and appropriateness of the model's outputs.
Source data accuracy: Checking responses against the underlying source details and data.
Adversarial Testing: Actively trying to "break" the model by feeding it carefully crafted inputs designed to expose weaknesses and vulnerabilities.

Deployment

This aspect might superficially be considered among the easiest to cover, but that may well not be the case. Most CI/CD pipelines are heavily automated; can the GenAI aspects be integrated easily into that flow? Some of the processes involved have long durations, e.g. chunking a new batch of information; can they be executed as part of a deployment, or do they need to be pre-staged so that the result can simply be copied into the production environment during a wider deployment action?

Monitoring

Ongoing monitoring of the performance of the system will also need to be considered. For some metrics, such as query performance or resource utilization, it is simply a question of ensuring that coverage also spans to the new GenAI experience. Other new metrics may also be required that are specific to GenAI, such as users' satisfaction with the results they receive. Any sudden change in that metric, especially if correlated with a previous deployment of a change to the GenAI components, is grounds for investigation. While extensive best practices exist for the identification of technical metrics to monitor, these new metrics are still very much emergent, and each organization should consider carefully what information is required — or is likely to be required in the event of a future investigation or incident response scenario.

Integration of AI with other systems and applications

AI strategies are already moving beyond pure analytic or chatbot use cases, as the agentic trend continues to develop. These services, whether home grown or hosted by third parties, will need to interface with other IT systems, most notably business processes, and this integration will need to be well considered to be successful. Today LLMs produce return sets in seconds, and though the models are getting quicker, there is a trend to trade response time for greater quality and resilience. How this trade-off is integrated into high performance business systems that operate many orders of magnitude faster will need to be considered and managed with care. Finally, as stated throughout this paper, AI's non-deterministic nature will mandate a focus on compensating patterns across the blend of AI and process systems.

Recommendation

While it is true that the specifics of AI-enabled systems differ from previous application architectures, general themes should still be carried over, whether by analogy, applying the spirit of the techniques to a new domain, or by ensuring that the more traditional infrastructural components of the AI application are managed with the same rigour as they would be in other contexts.
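Returning to the monitoring point above, a GenAI-specific signal such as user satisfaction can be watched with very simple logic; the sketch below is illustrative only, and the thresholds and feedback source are assumptions.

from statistics import mean

def satisfaction_alert(feedback_window, baseline=0.85, tolerance=0.10):
    """feedback_window: recent list of 1 (thumbs up) / 0 (thumbs down) values."""
    if not feedback_window:
        return False
    current = mean(feedback_window)
    # A sudden drop, especially right after a prompt or model change,
    # is grounds for investigation as discussed above.
    return current < baseline - tolerance

recent_feedback = [1, 1, 0, 1, 0, 0, 1, 0]
if satisfaction_alert(recent_feedback):
    print("User satisfaction dropped - correlate with recent GenAI changes.")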
Service / Tool Catalog

The shift from content and chatbot experiences to agentic approaches implies a new fundamental architectural consideration: which functions and services should be accessible for use by the model. In the public domain there are simple patterns, and models will mainly be operating with other public services and sources — but in the enterprise context, the environment will be more complex. Some examples of questions that a mature enterprise will need to address to maximize the potential of an agentic capability:

What are the "right" services? Large enterprises have hundreds, if not thousands, of services today, all of which are (or should be) managed according to what their business context is. A service management catalog will be key to manage many of these issues, as this will give a consistent point of entry to the service plane. Here again, pre-existing API management capabilities can ensure that the right access and control policies can be applied to support the adoption of composable AI-enabled applications and agents.

When it comes to security profiling of a consuming LLM, the requester of the LLM service will have a certain level of access based on a combination of user role and security policy that is enforced. The model will have to pass this on to core systems at run time so that there are no internal data breaches.

When it comes to agentic systems, new questions arise, beyond the simpler ones that apply to generative or conversational applications. For instance, should an agent be able to change a record? How much change should be allowed and how will this be tracked?

Regulatory Compliance

While the field of GenAI-enabled applications is still extremely new, best practices are beginning to emerge, such as those provided by the Open Web Application Security Project (OWASP). These cybersecurity recommendations are of course not guaranteed to cover any particular emerging regulation, but should be considered a good baseline which is almost certain to give a solid foundation from which to work to achieve compliance with national or sector-specific regulation and legislation as it is formalised. In general, it is recommended to ensure that any existing controls on systems and data sets, including RBAC and audit logs, are extended to new GenAI systems as well. Any changes to the new components — model version upgrades, changes to prompts, updates to training data sets, and more — will need to be documented and tracked with the same rigour as established approaches would mandate for traditional infrastructure changes. The points made previously about observability and auditability all contribute to achieving that foundational level of best-practice compliance. It is worth reiterating here that full coverage is expected to be an important difference between GenAI and previous domains. Compliance is likely to go far beyond the technical systems and their configurations, which were previously sufficient, and to require tracking of final prompts as supplied to models, including user input and runtime data.

Conclusion

Planning and managing the deployment and adoption of novel AI-enabled applications will require new policies and expertise to be developed. New regulation is already being created in various jurisdictions to apply to this new domain, and more is sure to be added in coming months and years.
However, much as AI systems require access to existing data and integration with existing systems to deliver value at scale, existing policies, experience, and best practices can be leveraged to ensure success. For this reason, it is important to treat AI as an integral part of strategy, and not as its own isolated domain or, worse, something delegated to individual groups or departments without central IT oversight or support. By engaging proactively with users' needs and business cases, IT leaders will have a much better chance of achieving measurable success and true competitive advantage with these new technologies — and avoiding the potential downsides: legal consequences of non-compliance, embarrassing public failures of the system, or simply incorrect responses being generated and acted upon by employees or customers.

LLM response logging for analytics
Why do we need LLM Observability?

GenAI applications are great; they answer much like a human does. But how do you know if GPT isn't being "too creative" when a result from the LLM shows "Company finances are facing issues due to insufficient sun coverage"? As the scope of GenAI apps broadens, the vulnerability expands, and since LLM outputs are non-deterministic, a setup that once worked isn't guaranteed to always work. Here's a comparison of the reasons why an LLM prompt fails versus why a RAG application fails.

What could go wrong in the configuration?

LLM prompts
Suboptimal model parameters: temperature too high / tokens too small
Uninformative system prompts

RAG
Indexing: The data wasn't chunked with the right size; information is sparse yet the window is small. The wrong distance was used, e.g. Euclidean distance instead of cosine. The dimension was too small / too large.
Retrieval: Top K too big, so too much irrelevant context is fetched. Top K too small, so there is not enough relevant context to generate a result. Filter misused. And everything in LLM prompts.

Although observability does not magically solve all problems, it gives us a good chance to figure out what might have gone wrong. LLM Observability provides methodologies to help developers better understand LLM applications, model performance, and biases, and can help resolve issues before they reach the end users.

What are common issues and how observability helps?

Observability helps understanding in many ways, from performance bottlenecks to error detection, security, and debugging. Here's a list of common questions we might ask ourselves and how observability may come in handy.

How long does it take to generate an answer? Monitoring LLM response times and database query times helps identify potential bottlenecks of the application.
Is the context retrieved from the Vector Database relevant? Logging database queries and the results retrieved helps identify better performing queries, and can assist with chunk size configuration based on retrieved results.
How many tokens are used in a call? Monitoring token usage can help determine the cost of each LLM call.
How much better/worse is my new configuration setup doing? Parameter monitoring and response logging help compare the performance of different models and model configurations.
How is the GenAI application performing overall? Tracing the stages of the application and evaluation helps identify the performance of the application.
What are users asking? Logging and analyzing user prompts helps understand user needs and can help evaluate whether optimizations can be introduced to reduce costs. It also helps identify security vulnerabilities by monitoring malicious attempts, so threats can be mitigated proactively.

What should be tracked?

GenAI applications involve components chained together. Depending on the use case, there are events and input/output parameters that we want to capture and analyze. A list of components to consider:

Vector Database metadata
Vector dimension: The vector dimension used in the vector database
Distance function: The way two vectors are compared in the vector database

Vector Indexing parameters
Chunk configuration: How a chunk is configured, including the size of the chunk, the unit of chunks, etc. This affects information density in a chunk.
Vector Query parameters
Query: The query used to retrieve context from the Vector Database
Top K: The maximum number of vectors to retrieve from the Vector Database

Prompt templates
System prompt: The prompt to be used throughout the application
Prompt Template: The template used to construct a prompt. Prompts work differently in different models and LLM providers

LLM request metadata
Prompt: The input sent to the LLM model from each end-user, combined with the template
Model name: The LLM model used for generation, which affects the capability of the application
Tokens: The token limit for a single request
Temperature: The parameter for setting the creativity and randomness of the model
Top P: The range of words sampled from; the smaller the value, the narrower the word selection

LLM response metadata
Tokens: The number of tokens used in input and output generation, which affects costs
Request details: May include information such as guardrails, the ID of the request, etc.

Execution Metrics
Execution time: Time taken to process individual requests

Pipeline examples

Logging a Chat completions pipeline
We're using MongoDB to store model parameters and LLM responses as JSON documents for easy processing.

Logging a RAG pipeline
In this case, we're storing the parameters to the RAG system (Agent Retrieve in this case) and the model. We're using JSON Generator Snaps to parameterize all input parameters to the RAG system and the LLM models. We then concatenate the response from the Vector Database, the LLM model, and the parameters we provided for the requests.
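In code terms, the records these logging pipelines write are roughly equivalent to the following sketch, assuming the pymongo client and a reachable MongoDB instance; the database, collection, and field names are illustrative rather than a prescribed schema.

from datetime import datetime, timezone
from pymongo import MongoClient   # assumes a reachable MongoDB instance

client = MongoClient("mongodb://localhost:27017")        # illustrative connection string
collection = client["genai_observability"]["llm_calls"]  # illustrative db/collection names

def log_llm_call(prompt, response, model, temperature, top_p, usage, latency_ms, rag=None):
    """Store one request/response record as a JSON document, as the pipelines above do."""
    collection.insert_one({
        "timestamp": datetime.now(timezone.utc),
        "model": model,
        "parameters": {"temperature": temperature, "top_p": top_p},
        "prompt": prompt,
        "response": response,
        "usage": usage,        # e.g. {"prompt_tokens": 512, "completion_tokens": 120}
        "latency_ms": latency_ms,
        "rag": rag,            # optional: query, top K, retrieved chunk ids, distance function
    })

log_llm_call(prompt="When do I get paid?",
             response="Employees are paid on the 15th ...",
             model="gpt-4o", temperature=0.2, top_p=1.0,
             usage={"prompt_tokens": 512, "completion_tokens": 120},
             latency_ms=840,
             rag={"top_k": 3, "distance": "cosine"})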
GenAI App Builder Getting Started Series: Part 1 - HR Q&A example

👋 Welcome!

Hello everyone, and welcome to our technical guide to getting started with GenAI App Builder on SnapLogic! At the time of publishing, GenAI App Builder is available for testing and will be generally available in our February release. For existing customers and partners, you can request access for testing GenAI App Builder by speaking to your Customer Success Manager or another member of your account team. If you're not yet a customer, you can speak to your Sales team about testing GenAI App Builder.

🤔 What is GenAI App Builder?

Before we begin, let's take a moment to understand what GenAI App Builder is and talk at a high level about its components. GenAI App Builder is the latest offering in SnapLogic's AI portfolio, focused on helping modern enterprises create applications with Generative AI faster, using a low-/no-code interface. That feels like a mouthful of buzzwords, so let me paint a picture (skip this if you're familiar with GenAI or watch our video, "Enabling employee and customer self-service").

Imagine yourself as a member of an HR team responsible for recruiting year round. Every new employee has an enrollment period after or just before their start date, and every existing employee has open enrollment once per year. During this time, employees need to choose between different medical insurance offerings, which usually involves a comparison of deductibles, networks, max-out-of-pocket, and other related features and limits. As you're thinking about all of this material, sorting out how to explain it all to your employees, you're interrupted by your Slack or Teams DM noise. Bing bong! Questions start flooding in:

Hi, I'm a new employee and I'm wondering, when do I get paid? What happens if payday is on a weekend or holiday? Speaking of holidays, what are company-recognized holidays this year?
Hi, my financial advisor said I should change my insurance plan to one with an HSA. Can you help me figure out which plan(s) include an HSA and confirm the maximum contribution limits for a family this year?
Hi, how does vacation accrual work? When does vacation roll over? Is unused vacation paid out or lost?

All these questions and many others are answered in documents the HR team manages, including the employee handbook, insurance comparison charts, disability insurance sheets, life insurance sheets, other data sheets, etc. What if, instead of you having to answer all these questions, you could leverage a human-sounding large language model (LLM) to field these questions for you, making sure it references only the source documents you provide, so you don't have to worry about hallucinations?! Enter GenAI App Builder!

🏗 Building an HR Q&A example

Once you have access to test GenAI App Builder, you can use the following steps to start building out an HR Q&A example that will answer questions using only the employee handbook or whichever document you provide. In this guide we will cover the two pipelines used, one that loads data and one that we will use to answer questions. We will not get into Snap customization or Snap details with this guide - it is just meant to show a quick use case. We do assume that you are familiar enough with SnapLogic to create a new Pipeline or import an existing one, search for Snaps, connect Snaps, and a few other simple steps. We will walk you through anything that is new to SnapLogic or that needs some additional context. We also assume you have some familiarity with Generative AI in this guide.
We will also make a video with similar content in the near future, so I'll update or reply to this post once that content is available.

Prerequisites

In order to complete this guide, you will need the items below regardless of whether or not you use the Community-supported chatbot UI from SnapLogic.

Access to a Pinecone instance (sign up for a free account at https://www.pinecone.io) with an existing index
Access to Azure OpenAI or OpenAI
A file to load, such as your company's employee handbook

Loading data

Our first step is to load data into the vector database using a Pipeline similar to the one below, which we will call the "Indexer" Pipeline since it helps populate the Pinecone Index. If you cannot find the pattern in the Pattern Library, you can find it attached below as "Indexer_Feb2024.slp". The steps below assume you have already imported the Pipeline or are building it as we go through. To add more color here, loading data into the vector database is only something that needs to be done when the files are updated. In the HR scenario, this might be once a year for open enrollment documents and maybe a few times a year for the employee handbook. We will explore some other use cases in the future where document updates would be much more frequent.

Click on the "File Reader" Snap to open its settings.
Click on the icon at the far right of the "File" field as shown in the screenshot below.
Click the "Upload" button in the upper-right corner of the window that pops up.
Select the PDF file from your local system that you want to index (we are using an employee handbook and you're welcome to do the same) to upload it, then make sure it is selected.
Save and close the "File Reader" Snap once your file is selected.
Leave the "PDF Parser" Snap with default settings.
Click on the "Chunker" Snap to open it, then mirror the settings in the screenshot below.
Now open the "Azure OpenAI Embedder" or "OpenAI Embedder" Snap (you may need to replace the embedder that came with the Pattern or import with the appropriate one you have an account with). Go to the "Account" tab and create a new account for the embedder you're using. You need to replace the variables {YOUR_ACCOUNT_LABEL} with a label for the account that makes sense for you, then replace {YOUR_ENDPOINT} with the appropriate snippet from your Azure OpenAI endpoint. Validate the account if you can to make sure it works. After you save your new account you can go back to the main "Settings" tab on the Snap.
If the account setup was successful, you should now be able to click the chat bubble icon at the far right of the "Deployment ID" field to suggest a "Deployment ID" - in our environment shown in the screenshot below, you can see we have one named "Jump-emb-ada-002" which I can now select. Finally, make sure the "Text to embed" field is set as shown below, then save and close this Snap.
Now open the "Mapper" Snap so we can map the output of the embedder Snap to the "Pinecone Upsert" Snap as shown in the screenshot below.

If it is difficult to see the mappings in the screenshot above, here is a zoomed-in version: For a little more context here, we're mapping the $embedding object coming out of the embedder Snap to the $values object in Pinecone, which is required. If that was all you mapped though, your Q&A example would always reply with something like "I don't know" because there is no data. To fix that, we need to make use of the very flexible "metadata" object in Pinecone by mapping $original.chunk to $metadata.chunk.
We also statically set $metadata.source to "Employee Handbook.pdf", which allows the retriever Pipeline to return the source file used in answering a question (in a real-world scenario, you would probably determine the source dynamically/programmatically, such as using the filename, so this pipeline could load other files too).

Save and close the "Mapper" Snap.
Finally, open the "Pinecone Upsert" Snap, then click the "Account" tab and create a new account with your Pinecone API Key and validate it to make sure it works before saving.
Back on the main "Settings" tab of the "Pinecone Upsert" Snap, you can now click on the chat bubble icon to suggest existing indexes in Pinecone. For example, in our screenshot below you can see we have four which have been obscured and one named "se-demo." Indexes cannot be created on the fly, so you will have to make sure the index is created in the Pinecone web interface.
The last setting we'll talk about for the Indexer pipeline is the "Namespace" field in the "Pinecone Upsert" Snap. Setting a namespace is optional. Namespaces in Pinecone create a logical separation between vectors within an index and can be created on-the-fly during Pipeline execution. For example, you could create a namespace like "2024_enrollment" for all documents published in 2024 for open enrollment and another called "2024_employeehandbook" to separate those documents into separate namespaces. Although these can be used just for internal purposes of organization, you can also direct a chatbot to only use one namespace to answer questions. We'll talk about this more in the "Answering Questions" section below, which covers the Retriever Pipeline.
Save and close the "Pinecone Upsert" Snap.
You should now be able to validate the entire Pipeline to see what the data looks like as it flows through the Snaps, and when you're ready to commit the data to Pinecone, you can Execute the Pipeline.

Answering Questions

To answer questions using the data we just loaded into Pinecone, we're going to recreate or import the Retriever Pipeline (attached as "Retriever_Feb2024.slp"). If you import the Pipeline you may need to add additional "Mapper" Snaps as shown below. We will walk through that in the steps below; just know this is what we'll end up with at the end of our first article. The screenshot above shows what the pattern will look like when you import it. Since this first part of the series will only take us up to the point of testing in SnapLogic, our first few steps will involve some changes with that in mind.

Right-click on the "HTTP Router" Snap, click "Disable Snap".
Click the circle between "HTTP Router" and the embedder Snap to disconnect them.
Drag the "HTTP Router" Snap somewhere out of the way on the canvas (you can also delete it if you're comfortable replacing it later); your Pipeline should now look like this:
In the asset palette on the left, search for the "JSON Generator" (it should appear before you finish typing that all out):
Drag a "JSON Generator" onto the canvas, connecting it to the "Azure OpenAI Embedder" or "OpenAI Embedder" Snap.
Click on the "JSON Generator" to open it, then click on the "Edit JSON" button in the main Settings tab.
Highlight all the text from the template and delete it so we have a clean slate to work with.
Paste in this text, replacing "Your question here." with an actual question you want to ask that can be answered from the document you loaded with your Indexer Pipeline. For example, I loaded an employee handbook and I will ask the question, "When do I get paid?"
[ { "prompt" : "Your question here." } ]

Your "JSON Generator" should now look something like this but with your question:
Click "OK" in the lower-right corner to save the prompt.
Click on the "Azure OpenAI Embedder" or "OpenAI Embedder" Snap to view its settings.
Click on the Account tab, then use the drop-down box to select the account you created in the section above ("Loading Data", steps 8-9).
Click on the chat bubble icon to suggest "Deployment IDs" and choose the same one you chose in "Loading Data", step 10.
Set the "Text to embed" field to $prompt as shown in the screenshot below:
Save and close the "Azure OpenAI Embedder" or "OpenAI Embedder" Snap.
Click on the Mapper immediately after the embedder Snap.
Create a mapping for $embedding that maps to $vector.
Check the "Pass through" box; this Mapper Snap should now look like this:
Save and close this "Mapper".
Open the "Pinecone Query" Snap.
Click the Account tab, then use the drop-down to select the Pinecone account you created in "Loading Data", step 14.
Use the chat bubble on the right side of the "Index name" field to select your existing Index.
[OPTIONAL] Use the chat bubble on the right side of the "Namespace" field to select your existing Namespace, if you created one; the "Pinecone Query" Snap should now look like this:
Save and close the "Pinecone Query" Snap.
Click on the "Mapper" Snap after the "Pinecone Query" Snap. In this "Mapper" we need to map the three items listed below, which are also shown in the following screenshot. If you're not familiar with the $original JSON key, it occurs when an upstream Snap has implicit pass through or when, as with the "Mapper" in step 17, we explicitly enable pass through, allowing us to access the original JSON document that went into the upstream Snap. (NOTE: If you're validating your pipeline along the way or making use of our Dynamic Validation, you may notice that no Target Schema shows up in this Mapper until after you complete steps 27-30.)
Map $original.original.prompt to $prompt
Map jsonPath($, "$matches[*].metadata.chunk") to jsonPath($, "$context[*].data")
Map jsonPath($, "$matches[*].metadata.source") to jsonPath($, "$context[*].source")
Save and close that "Mapper".
Click on the "Azure OpenAI Prompt Generator" or "OpenAI Prompt Generator" so we can set our prompt. Click on the "Edit prompt" button and make sure your default prompt looks like the screenshot below. On lines 4-6 you can see we are using mustache templating like {{#context}} {{source}} {{/context}}, which is the same as the jsonPath($, "$context[*].source") from the "Mapper" in step 25 above. We'll talk about this more in future articles - for now, just know this will be a way for you to customize the prompt and data included in the future.
Click "OK" in the lower-right corner.
Save and close the prompt generator Snap.
Click on the "Azure OpenAI Chat Completions" or "OpenAI Chat Completions" Snap.
Click the "Account" tab, then use the drop-down box to select the account you created earlier.
Click the chat bubble icon to the far right of the "Deployment ID" field to suggest a deployment; this ID may be different than the one you've chosen in previous "Azure OpenAI" or "OpenAI" Snaps since we're selecting an LLM this time instead of an embedding model.
Set the "Prompt" field to $prompt; your Snap should look something like this:
Save and close the chat completions Snap.
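For readers who want to relate the Snap settings above to code, here is a rough equivalent of what the Retriever pipeline does end to end, assuming recent versions of the openai and pinecone Python packages; the index name, namespace handling, and prompt wording are illustrative, and the Snaps above remain the supported, no-code way to build this in SnapLogic.

# Rough code equivalent of the Retriever pipeline: embed the question, query Pinecone,
# build a prompt from the retrieved chunks, and ask the chat model. Assumes the
# OPENAI_API_KEY environment variable is set and a populated "se-demo" index exists.
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI()
index = Pinecone(api_key="YOUR_PINECONE_API_KEY").Index("se-demo")

def answer(question, namespace=None, top_k=1):
    # 1. Embed the question (embedder Snap)
    vector = openai_client.embeddings.create(
        model="text-embedding-ada-002", input=question).data[0].embedding
    # 2. Retrieve the nearest chunks (Pinecone Query Snap)
    result = index.query(vector=vector, top_k=top_k,
                         include_metadata=True, namespace=namespace)
    context = "\n".join(m.metadata["chunk"] for m in result.matches)
    sources = ", ".join(m.metadata["source"] for m in result.matches)
    # 3. Build the prompt (Prompt Generator Snap) and ask the LLM (Chat Completions Snap)
    prompt = (f"Answer the question using only this context:\n{context}\n\n"
              f"Question: {question}\nCite the source: {sources}")
    chat = openai_client.chat.completions.create(
        model="gpt-4o", messages=[{"role": "user", "content": prompt}])
    return chat.choices[0].message.content

print(answer("When do I get paid?"))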
Testing our example

Now it's time to validate our pipeline and take a look at the output! Once validated, the Pipeline should look something like this: If you click the preview data output on the last Snap, the chat completions Snap, you should see output that looks like this: The answer to our prompt is under $choices[0].message.content. For the test above, I asked the question "When do I get paid?" against an employee handbook and the answer was this:

Employees are paid on a semi-monthly basis (24 pay periods per year), with payday on the 15th and the last day of the month. If a regular payday falls on a Company-recognized holiday or on a weekend, paychecks will be distributed the preceding business day. The related context is retrieved from the following sources: [Employee Handbook.pdf]

Wrapping up

Stay tuned for further articles in the "GenAI App Builder Getting Started Series" for more use cases, closer looks at individual Snaps and their settings, and even how to connect a chat interface! Most if not all of these articles will also have an associated video if you learn better that way! If you have issues with the setup or find a missing step or detail, please reply to this thread to let us know!