tfan
Employee

        We all love the Pipeline Execute Snap: it greatly simplifies a complex pipeline by extracting sections into a sub-pipeline. But sometimes we really want the ability to run a pipeline multiple times to perform some operation, like polling an endpoint or making LLM tool calls. In this article, we introduce the PipeLoop Snap, which adds iteration to the SnapLogic programming model. With PipeLoop, we can create workflows that were previously hard to manage or even impossible.

What is PipeLoop

        PipeLoop is a new Snap for iterative execution of a pipeline. For readers familiar with iteration in programming languages, PipeLoop is essentially a do-while loop for pipelines. The user must provide an iteration limit as a hard cutoff to avoid resource depletion or an infinite loop, and can optionally provide a stop condition to control the execution.

        Just as we can pass input documents to PipeExec, we can pass input documents to PipeLoop. The difference is that the output document of the pipeline executed by PipeLoop is used as the input for the next round, and execution continues until the stop condition is met or the iteration limit is reached. Because of this mechanism, the pipeline run by PipeLoop must have one unlinked input and one unlinked output to work properly. Put simply, PipeLoop can be thought of as a variable-length chain of PipeExec Snaps that all run the same pipeline, with a condition to exit early.

PipeLoop execution flow

  1. Input documents to PipeLoop are passed to the child pipeline for execution.
  2. The child pipeline executes.
  3. The child output is collected.
  4. The stop condition is evaluated against the output document. If it is true, exit and emit the output document as the PipeLoop output; otherwise continue.
  5. The iteration limit is checked. If it has been reached, exit and emit the output document as the PipeLoop output; otherwise continue.
  6. The output document is used as the next round of input, and execution returns to step 1.
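
        The flow above can be sketched in a few lines of Python. This is only an illustrative model of the loop semantics, not how the Snap is implemented: the names pipe_loop, child, stop_condition and iteration_limit are hypothetical, and the child pipeline is modeled as a plain function from one document to one document.

```python
from typing import Any, Callable, Dict, Tuple

Document = Dict[str, Any]


def pipe_loop(
    doc: Document,
    child: Callable[[Document], Document],
    stop_condition: Callable[[Document], bool],
    iteration_limit: int,
) -> Tuple[Document, int]:
    """Model of the flow: run `child` repeatedly, feeding each output back in."""
    if iteration_limit < 1:
        raise ValueError("iteration_limit must be at least 1")
    iterations = 0
    while True:
        doc = child(doc)                   # steps 1-3: run the child, collect its output
        iterations += 1
        if stop_condition(doc):            # step 4: stop condition on the output document
            break
        if iterations >= iteration_limit:  # step 5: hard cutoff
            break
        # step 6: `doc` becomes the input of the next round
    return doc, iterations


def double(doc: Document) -> Document:
    """Hypothetical child pipeline that doubles the `value` field."""
    return {**doc, "value": doc["value"] * 2}


# The stop condition is never met within five rounds, so the limit ends the loop.
result, rounds = pipe_loop({"value": 1}, double, lambda d: d["value"] >= 100, 5)
print(result, rounds)  # {'value': 32} 5
```

        Because the body runs before either check, the child always executes at least once, which is what makes this a do-while loop rather than a while loop.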

 

PipeLoop execution walkthrough

        Let’s start with a very simple example. We’ll create a workflow using PipeLoop that increments a number from 1 to 3. For simplicity, we will refer to the pipeline with PipeLoop as the “Parent pipeline”, and the pipeline that is executed by PipeLoop as the “Child pipeline”.

Parent pipeline setup

        The parent pipeline consists of one JSON Generator Snap that supplies a single input document, and one PipeLoop Snap running the pipeline “child” with the stop condition “$num >= 3”. We’ll also enable “Debug iteration outputs” to see the output of each round in this walkthrough.

Child pipeline setup

        The child pipeline consists of a single Mapper Snap that increments “$num” by 1. This satisfies the requirement that a pipeline run by PipeLoop has one unlinked input and one unlinked output.

Output 

        The output of PipeLoop consists of two major sections when Debug mode is enabled: the output fields, and _iteration_documents. We can see the final output is “num”: 3, which means PipeLoop has successfully carried out the task.
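
        To make the walkthrough concrete, here is a small Python simulation of it. It is a sketch under assumptions: the function name run_walkthrough is hypothetical, and the shape of each entry under _iteration_documents (an output plus the stop condition result) is a guess for illustration, not the Snap’s actual output schema.

```python
from typing import Any, Dict, List


def run_walkthrough(doc: Dict[str, Any], iteration_limit: int) -> Dict[str, Any]:
    """Simulate the parent/child pair from the walkthrough with debug output on."""
    trace: List[Dict[str, Any]] = []
    for _ in range(iteration_limit):
        doc = {**doc, "num": doc["num"] + 1}  # child Mapper: $num + 1
        stop = doc["num"] >= 3                # stop condition: $num >= 3
        # Assumed entry layout; the real debug field may look different.
        trace.append({"output": dict(doc), "stop_condition": stop})
        if stop:
            break
    return {**doc, "_iteration_documents": trace}


output = run_walkthrough({"num": 1}, iteration_limit=10)
print(output["num"])                        # 3
print(len(output["_iteration_documents"]))  # 2
```

        Starting from 1, the child runs twice (1 to 2, then 2 to 3) before the stop condition becomes true.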

PipeLoop features

        PipeLoop has several features that are helpful when building iterating pipelines. We’ll group them by where they are located.

Properties

The PipeLoop Snap’s properties are organized into four main sections.

  • Pipeline
  • Pipeline Parameters
  • Loop options
  • Execution options

Pipeline

        The pipeline to be run.

Pipeline Parameters

        We’ll take a deeper dive into this in the Pipeline Parameters section.

Loop options

        Loop options are the property settings related to the iterations of this Snap.

    Stop condition

        The Stop condition field lets the user set an expression that is evaluated after each execution of the child pipeline, beginning with the first. If the expression evaluates to true, the iteration stops. The stop condition can also be set to false if the user wishes to use PipeLoop as a traditional for loop that runs until the iteration limit is reached.

        There are cases where the user might pass an unintended value into the Stop condition field. If the user provides a non-boolean String as the Stop condition, PipeLoop generates a warning and treats the stop condition as false.

Non-boolean Stop condition warning
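
        The fallback behavior described above could be modeled roughly as follows. This is a hypothetical sketch: the function name, the warning text, and the decision to accept the Strings “true” and “false” as booleans are all assumptions made for illustration, and the article only describes the behavior for non-boolean Strings.

```python
from typing import Any, Optional, Tuple


def evaluate_stop_condition(value: Any) -> Tuple[bool, Optional[str]]:
    """Return (should_stop, warning) for an evaluated Stop condition value."""
    if isinstance(value, bool):
        return value, None
    if isinstance(value, str) and value.strip().lower() in ("true", "false"):
        # Assumption: boolean-looking Strings are accepted without a warning.
        return value.strip().lower() == "true", None
    # Anything else is treated as false, so the loop keeps running
    # until the iteration limit is reached.
    warning = f"Stop condition {value!r} is not a boolean; treating it as false"
    return False, warning


print(evaluate_stop_condition(True))           # (True, None)
print(evaluate_stop_condition("num >= 3")[0])  # False
```

        Treating an invalid condition as false is the safe default here, because the iteration limit still guarantees that the loop terminates.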

    Iteration limit

        The Iteration limit field caps the number of iterations that can occur. It can also be used to set the total number of executions when the Stop condition is set to false.

        Setting a large Iteration limit with Debug mode on can be dangerous, because the accumulated documents can quickly deplete CPU and RAM resources. To guard against this, PipeLoop generates a warning in the Pipeline Validation Statistics tab when the Iteration limit is 1000 or greater and Debug mode is enabled.

Large iteration limit with debug mode enabled warning
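
        The warning rule is simple enough to express as a one-line check. The sketch below is illustrative only: the function name, the constant name and the warning text are hypothetical, while the threshold of 1000 and the dependence on Debug mode come from the description above.

```python
from typing import Optional

# Threshold taken from the article; the constant name is hypothetical.
DEBUG_WARNING_THRESHOLD = 1000


def iteration_limit_warning(iteration_limit: int, debug_enabled: bool) -> Optional[str]:
    """Return a warning when debug output could accumulate too many documents."""
    if debug_enabled and iteration_limit >= DEBUG_WARNING_THRESHOLD:
        return (
            f"Iteration limit {iteration_limit} with debug output enabled "
            "may consume a large amount of memory"
        )
    return None


print(iteration_limit_warning(999, debug_enabled=True))    # None
print(iteration_limit_warning(1000, debug_enabled=False))  # None
print(iteration_limit_warning(1000, debug_enabled=True) is not None)  # True
```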

    Debug iteration outputs

        This toggle adds the child pipeline output and the stop condition evaluation for each iteration to the final output as a separate field.

Output example with Debug iteration outputs enabled

Execution options

    Execute On

        Specifies where the pipeline execution should take place. Currently, only local executions (local Snaplex, local node) are supported.

    Execution Label

        We’ll take a deeper dive into this in the Monitoring section.

Pipeline Parameters

        Readers who are familiar with Pipeline Parameters in PipeExec can skip to the next section, as the instructions are identical.

Introduction to Pipeline Parameters

        Before we take a look at the Pipeline Parameters support in the PipeLoop Snap, let’s take a step back and see what pipeline parameters are and how pipeline parameters can be leveraged.

        Pipeline parameters are String constants that can be defined in the Edit Pipeline Configuration settings and then used anywhere in the pipeline. One major difference between Pipeline parameters and Pipeline variables is that Pipeline parameters are referenced with an underscore prefix, whereas Pipeline variables are referenced with a dollar sign prefix.

Pipeline Parameters in Edit Pipeline Configuration

Accessing Pipeline Parameters in an expression field

Example 

        Let’s take a look at Pipeline Parameters in action with PipeLoop. Our goal here is to print out “Hello PipeLoop!” n times, where n is the value of “num”.

        We’ll add two parameters in the child pipeline, param1 and param2. To demonstrate, we assign “value1” to param1 and leave param2 empty. We’ll then add a message field with the value “Hello PipeLoop!” in the JSON Generator so that we can pass that String to param2 from PipeLoop. Now we’re able to use param2 as a constant in the child pipeline. PipeLoop also has field name suggestions built into the Parameter name fields for ease of use.

PipeLoop Pipeline Parameters in action

        For our child pipeline, we’ll add a new row in the Mapping table to print out “Hello PipeLoop!” repeatedly (followed by a newline character). One thing to bear in mind is that the order of the rows in the Mapping table does not affect the output (the number of times “Hello PipeLoop!” is printed, in this case), because the output fields are updated only after the current iteration has finished.

Child Pipeline configuration for our task

        Here’s the final result: “Hello PipeLoop!” is printed twice. Mission complete.

Remarks

  • Pipeline Parameters are String constants that can be set in Edit Pipeline Configuration.
  • Users can pass a String from PipeLoop to Pipeline Parameters defined in the Child pipeline.
  • Pipeline Parameters set in PipeLoop override the values defined in the Child pipeline for parameters that share the same name.
  • Pipeline Parameters are constants, which means their values are not modified during iterations, even if the user attempts to change them.
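
        The remarks above can be illustrated with a short Python model. Everything here is a sketch under assumptions: resolve_parameters and child are hypothetical names, the starting value of num and the fixed two rounds are chosen only to mirror the result shown earlier, and the read-only mapping merely stands in for the fact that parameters stay constant across iterations.

```python
from types import MappingProxyType
from typing import Any, Dict, Mapping


def resolve_parameters(
    child_defaults: Mapping[str, str], pipeloop_values: Mapping[str, str]
) -> Mapping[str, str]:
    """Values passed from PipeLoop override child defaults with the same name."""
    merged = {**child_defaults, **pipeloop_values}
    return MappingProxyType(merged)  # read-only view: parameters are constants


child_defaults = {"param1": "value1", "param2": ""}
input_doc: Dict[str, Any] = {"num": 0, "message": "Hello PipeLoop!"}
params = resolve_parameters(child_defaults, {"param2": input_doc["message"]})


def child(doc: Dict[str, Any], params: Mapping[str, str]) -> Dict[str, Any]:
    """Hypothetical child: append the constant message and increment num."""
    return {
        "num": doc["num"] + 1,
        "text": doc.get("text", "") + params["param2"] + "\n",
    }


doc = input_doc
for _ in range(2):  # two rounds, for illustration only
    doc = child(doc, params)

print(doc["text"], end="")
# Hello PipeLoop!
# Hello PipeLoop!
```

        The document changes on every round, but params is resolved once before the loop and is the same object in every iteration.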

Monitoring

        When a Snap in a pipeline is executed, there is no output until the execution has finished. Because PipeLoop runs all iterations of the child pipeline as a single Snap, it can be difficult to know how far the execution has progressed, or which pipeline execution corresponds to which input document. To address this, PipeLoop has two extra features that add visibility into its execution.

Pipeline Statistics progress bar

        During the execution of PipeLoop, a progress bar is available in the Pipeline Validation Statistics tab, so the user can see which iteration PipeLoop is currently on. Note that, because of polling intervals, the progress bar might not reflect the actual iteration index if the child pipeline executions are short.

PipeLoop iteration progress bar

Execution Label

        When a PipeLoop with multiple input documents is executed, the user cannot tell which pipeline execution is linked to which input document in the SnapLogic Monitor. The Execution label solves this problem. The user can pass a value into the Execution label field that differentiates input documents, so that each input document has its own label in the SnapLogic Monitor during execution.

        Here’s an example of two input documents running on the child pipeline. We set the Execution label with the expression “child_label” + $num, so the execution for the first document will have the label “child_label0” and the second execution will have the label “child_label1”. 

Execution label settings

SnapLogic Monitor View
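
        As a quick illustration of how the label expression resolves per input document, the Python below mimics “child_label” + $num. The function name execution_label is hypothetical, and the two input documents are assumed to carry num values of 0 and 1, matching the labels in the example.

```python
from typing import Any, Dict, List


def execution_label(doc: Dict[str, Any]) -> str:
    """Python equivalent of the expression "child_label" + $num."""
    return "child_label" + str(doc["num"])


input_docs: List[Dict[str, Any]] = [{"num": 0}, {"num": 1}]
labels = [execution_label(doc) for doc in input_docs]
print(labels)  # ['child_label0', 'child_label1']
```

        Any field that is unique per input document works equally well as the distinguishing part of the label.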

Summary

        In this article, we introduced PipeLoop, a new Snap for iterative execution workflows. The pipeline run by PipeLoop must have one unlinked input and one unlinked output.

PipeLoop has the following features:

  • Pipeline Parameters support
  • Stop condition to exit early with warnings
  • Iteration limit to avoid infinite loop with warnings
  • Debug mode
  • Execution label to differentiate runs in Monitor
  • Progress bar for status tracking

Happy Building!