How to keep first row in a set?

I have a requirement where the first record with a given field value needs to pass through, and all subsequent records with that field value must be stripped out and written to a different file. In the example below, only the first row for each ID needs to pass through:

ID  Type
A   Alpha
B   Alpha
B   Beta
C   Gamma
C   Delta

So the Pipeline needs to output this:

ID  Type
A   Alpha
B   Alpha
C   Gamma

Then it needs to output this to a file, to let the admins know that a duplicate row was found for these records:

ID  Type
B   Beta
C   Delta

I have a Pipeline that inserts row numbers with a Sequence, copies the data stream, does an Aggregate with a Group By, then finishes with a Join to bring the other fields back in. Here is the Aggregate Snap (I changed the Group By field name to Type to match the example above, so please ignore the warning): [screenshot]

This works to retrieve the first row, but I have two questions:

1. Is this really the best way to do this? How do I capture the duplicate rows that were rejected by the Join?

2. Why does the Box Write have a mandatory output view? Every other write Snap that I see has zero output views by default, but for some reason Box Write has a minimum of one output view. I have hooked an Exit Snap up to the Box Write Snap, set to trigger with a threshold of 10 (I believe Box Write outputs a single document), but what is the best way to handle this?

Re: Using simple wildcards in a Filter Snap?

Thank you for the suggestion. I think I'm just going to write the regex for them.

Using simple wildcards in a Filter Snap?

I have a pretty simple Pipeline that gets a directory listing from Box, filters those files according to their filename, then calls Pipeline Execute to fetch the contents of the matched files using this syntax:

$FileName.match(_FILE_NAME_MATCH) != null

Here is the filter: [screenshot]

I have six files in this folder: [screenshot]

When I set FILE_NAME_MATCH to "test" I get the two matching files, and I get the same result when I set the variable to "test*". However, I get zero results when I set it to "test*.csv". I checked the same regex in JavaScript and, no, it is not a valid search query (i.e. SnapLogic is behaving as expected): in a regex, * means "zero or more of the preceding token" rather than "any characters", so "test*.csv" never matches a name like "test1.csv".

Given that the sysadmins who will be setting up the parameters for these jobs don't know regex, is there a way for me to use simple Windows file-system wildcard searching in the Filter Snap? They may want to scan for files like this...

output??report(*).xls?

...and not have to know the regex to get it.
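For reference, the wildcard-to-regex translation such a feature would perform is mechanical. Here is a sketch in plain JavaScript (SnapLogic's expression language is JavaScript-like, but wildcardToRegex is a hypothetical helper, not a SnapLogic built-in): escape the regex metacharacters, then map * to ".*" and ? to ".".

// Convert a Windows-style wildcard pattern into an anchored regex.
function wildcardToRegex(pattern) {
  var escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters (but not * and ?)
  var body = escaped.replace(/\*/g, ".*").replace(/\?/g, "."); // translate the two wildcards
  return new RegExp("^" + body + "$");
}

wildcardToRegex("test*.csv").test("test1.csv");                            // true
wildcardToRegex("output??report(*).xls?").test("output01report(v2).xlsx"); // true

A pre-processing step along these lines would let the sysadmins supply plain wildcard patterns while the Filter Snap still receives a real regex.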
Re: How exactly do you pass parameters into a Pipeline Execute?

I changed the format of the Excel file around to look like this: [screenshot]

Then I set up the Pipeline Execute like this: [screenshot]

That gave me the desired output from the Pipeline Execute. For what it's worth, the first file had rows with ABC, DEF, and GHI; the second file had rows with JKL, MNO, and PQR. Here's the final output of the CSV Parser: [screenshot]

I'm sure other users wanting to do this would find it helpful to have a note about this in the Pipeline Execute documentation.

How exactly do you pass parameters into a Pipeline Execute?

The docs for Pipeline Execute say this about the input view: "Documents or binary data received on this view are sent to the child execution if the child Pipeline has an unlinked input view. The document or binary data can be used to specify Pipeline parameters for the child execution." My question is: how exactly does this work?

I would like to read some parameters out of an Excel file stored in Box and send them to a child Pipeline that will read the files the Excel file specifies. In this case I want to read test1.csv and test2.csv from this directory, and have that controlled by the parameters set in the Excel file. Here are my parameters in the Excel file: [screenshot]

They are read into my Parent Pipeline as expected: [screenshot]

I then call Pipeline Execute (with Reuse Executions ON, as I only want this Pipeline executed once), sending it the output of the Excel Parser. This Pipeline gets a directory listing from the specified folder, filters it based on the specified match string, then reads each file from Box using another Pipeline Execute: [screenshot]

I can see in the Parent Pipeline where I would connect the Input Schema into the Parameters, but I'm not sure how to get it to read the value properly. How can I get these parameter values from my Excel file into the Pipeline Execute?

Re: Writing files with Child Pipeline - spawning multiple children?

Thank you very much.

Writing files with Child Pipeline - spawning multiple children?

I have a simple debugging Pipeline that I would like to insert at various points in my main Pipeline: [screenshot]

Two parameters are sent to the debugging Pipeline: [screenshot]

The Router uses the DEBUG value to decide whether to send output to the Excel Formatter and File Writer. Here's an example of how this is called from the parent/master Pipeline; it reads the DEBUG parameter at the Pipeline level: [screenshot]

When I add this to the parent/master Pipeline, it never finishes executing, and according to the Dashboard it spawns a LOT of child Pipelines. Files are being written to the system. Here is the section of the parent Pipeline where the debug calls are made; there are only three inserted right now: [screenshot]

What is causing this to spawn so many processes and effectively freeze? When the same Snaps are inserted into the parent/main Pipeline with copy/paste, they work exactly as expected.

Re: Composite (JSON) Parameter?

Thanks! I'll be using this in future.

Re: How to pass a variable along the pipeline?

This is another use case that is effectively asking for this feature. The parallel-execution issue comes up again, of course, but that doesn't negate the functional requirement.

Retrieve Filename into Mapper from Box Read

How can I get the filename that the Box Read Snap opened into the Mapper Snap?
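One pattern worth trying, offered as a sketch rather than a confirmed answer: SnapLogic binary streams carry header metadata alongside the content, and file-reader Snaps generally record the path of the file they read in a content-location header. Assuming Box Read behaves the same way (an assumption worth verifying against the Snap's documentation), placing a Binary to Document Snap after the Box Read should surface that header as a document field the Mapper can reference:

// Hypothetical Mapper expression, assuming the binary header's
// content-location field holds the path of the file Box Read opened:
$['content-location']

If the header is present, a Mapper placed right after the Binary to Document can copy it into whatever field the rest of the Pipeline expects.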