
Advent of Code via SnapLogic IIP

ddellsperger
Employee

I’ve been a pretty big fan of Advent of Code since I found out about it in 2019 and when I started with SnapLogic in 2021, I figured it could be pretty cool to try to use the SnapLogic IIP to solve all (well, not all, but at least some) of the Advent of Code daily puzzles, mostly to learn better how some of the snaps work, but also get more experience designing pipelines since I’m typically working more in the individual snap development.

This year, I figured I’d post about it here in the community to see if others have an interest in attempting to solve the daily puzzles on SnapLogic IIP. I think a good portion of these problems ARE solvable via the IIP, though some simply aren’t.

My ground rules for considering a day solved are:

  • Get the input into a pipeline in whatever way possible, either via file download and read, or via a Constant snap (my posted examples will use the sample input with a Constant snap, but my final solutions typically use a file reader)
  • No use of the Script Snap (if it can’t be solved without a Script Snap, it’s considered unsolvable, but you’d be surprised what you can do with our other snaps alone)
  • No use of external services (databases, REST endpoints, etc.), as those are likely to involve some level of “cheating” similar to a Script Snap
  • Basically, using only the transform, flow, and file reader/writer snaps (to read input files; create, delete, read, and write temporary files; and write final output files)
  • Pipe Execs are allowed

I figure this might be something that other members of the community would be interested in doing. If you want to participate, feel free to join in on the conversation; I figure we can keep discussion to a single thread and do replies per day. Not sure how many might be interested in this, though.

What is Advent of Code?
From the website:

Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like. People use them as interview prep, company training, university coursework, practice problems, a speed contest, or to challenge each other.

You don’t need a computer science background to participate - just a little programming knowledge and some problem solving skills will get you pretty far. Nor do you need a fancy computer; every problem has a solution that completes in at most 15 seconds on ten-year-old hardware.

If you want to join in, go to https://adventofcode.com and connect one of the authentication mechanisms (I use GitHub, but you can auth with Google, Twitter, or Reddit as well). Logging in is required so that you can receive input specific to you.

If you plan to join and want to join a leaderboard for this project, feel free to join my private leaderboard with the code 1645534-1249c834.

20 REPLIES

ddellsperger
Employee

Day 5 was the first day that really becomes difficult on SnapLogic, mostly because you need to loop while saving state from each iteration in order to process the individual moves. I also had to do this for the creation of the initial data configuration, which required its own loop with state preserved. I used the same file for both of these, so it’s probably not a huge deal, but this is one of those situations where you effectively have to create a pipeline that reads from a file and writes to that same file in order to obtain and preserve state between iterations (via input documents).
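To make the state-per-iteration idea concrete, here’s a rough sketch in plain JavaScript (not the actual pipeline): each move is applied to the stack state produced by the previous move, which is exactly what the file-read/file-write loop emulates on the IIP. The stacks and moves here are a tiny example in the spirit of the Day 5 sample.

```javascript
// Crate stacks keyed by stack number, listed bottom → top.
const stacks = { 1: ["Z", "N"], 2: ["M", "C", "D"], 3: ["P"] };
const moves = [
  { count: 1, from: 2, to: 1 },
  { count: 3, from: 1, to: 3 },
];

// Part 1 semantics: crates move one at a time, so the lifted group is reversed.
// The reduce carries the stack state from one move to the next, standing in
// for the read-file / write-file loop between pipeline iterations.
const finalState = moves.reduce((state, { count, from, to }) => {
  const lifted = state[from].splice(-count).reverse();
  state[to].push(...lifted);
  return state;
}, stacks);

// The puzzle answer is the top crate of each stack, in stack order.
const topCrates = Object.keys(finalState)
  .sort((a, b) => a - b)
  .map((k) => finalState[k][finalState[k].length - 1] ?? "")
  .join("");
```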

A couple of callouts here: when doing my internal pipeline testing, I used an internal JSON Generator, because state is preserved in a spot separate from the pipeline itself. This made testing somewhat awkward, but it let me change values as necessary; when it came time to integrate the pipeline, I’d just disconnect and disable the JSON Generator and run the pipelines as required. It might be nice to have some way to debug this more readily. I know we have the replay snap, but I found it didn’t really help here, because unlike a normal pipeline, I have to wait for a full pipeline execution before running another round, which is more cumbersome than testing a single instance.

Maybe the big callout here is to use files local to the Snaplex rather than files in SLDB or on the web somewhere; while the remote files SEEM fast to use, accessing a file local to the plex is much faster. This might be one of those cases where it would be nice to have a plex-centric internal cache to leverage specifically for state-based looping within SnapLogic: set the state prior to a pipe exec (or maybe as part of the pipe exec?), then synchronize on the pipe exec’s output to read the final result. This form of looping is VERY cumbersome, but going through it now means it should be easier in the future.

I also found that parsing the moves ("move x from y to z") was a two-snap process using regex: the first snap to get the regex output and the second to map that output appropriately. I’m not sure how common this type of parsing is, but it might be worth putting some sort of document-based parser logic into a snap to make this kind of parsing easier in the future.
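For reference, here’s roughly what that parse looks like as a single step in plain JavaScript, using named capture groups; this is a sketch of the logic, not the actual snap configuration:

```javascript
// Parse a Day 5 move line like "move 3 from 1 to 2" into numbers in one step.
const parseMove = (line) => {
  const m = line.match(/^move (?<count>\d+) from (?<from>\d+) to (?<to>\d+)$/);
  if (m === null) throw new Error(`unrecognized move: ${line}`);
  const { count, from, to } = m.groups; // capture groups come back as strings
  return { count: Number(count), from: Number(from), to: Number(to) };
};

// parseMove("move 3 from 1 to 2") → { count: 3, from: 1, to: 2 }
```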

Screenshots
Main Pipeline
Puzzle Generation inner loop
Puzzle Processing inner loop

SLP Files:
Main Pipeline:
day05_2022_12_05.slp (78.3 KB)

Puzzle Generation pipeline:
day05_generator_2022_12_05.slp (13.6 KB)

Puzzle Processing pipeline:
day05_processor_2022_12_05.slp (16.8 KB)

ddellsperger
Employee

Day 6 has the fun situation where you have to do a sliding-window traversal of the data. I’ve done this a FEW times in the past, and the typical easiest way is to generate an array of all the potential start (or end) indexes available to you. Since the Sequence Snap only allows pipeline parameters, I have a pipeline saved from doing this last year: you pass a value to the pipe exec and it returns an array of the appropriate size based on that pipeline parameter. Here’s that pipeline:

Screenshot:
image

SLP File:
generate_sequence_array_2022_12_06.slp (4.7 KB)

Okay, now that we’ve cleared up the hardest part of this problem: my final solution for Day 6 was to use generate_sequence_array to generate an array the size of the input data, and then generate the 4-character sequences. Once I had those, processing was a filter, a group-by-field, a further filter, and a final aggregate snap to get the proper answer. I only changed a few things between the two parts. It was a generally tricky, but not impossible, problem.
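The sliding-window idea sketched in plain JavaScript (again, the logic rather than the pipeline): enumerate every window start index, slice out the window, and keep the first one whose characters are all distinct, which mirrors the filter/group-by flow above.

```javascript
// Find the first position after which `windowSize` consecutive distinct
// characters have been processed (Day 6's marker detection).
const firstMarker = (input, windowSize = 4) => {
  for (let i = 0; i + windowSize <= input.length; i++) {
    const window = input.slice(i, i + windowSize);
    // A Set collapses duplicates, so a full-size Set means all-distinct.
    if (new Set(window).size === windowSize) {
      return i + windowSize; // the puzzle wants characters processed, not the index
    }
  }
  return -1; // no marker found
};

// firstMarker("mjqjpqmgbljsphdztnvjfqwrcgsmlb") → 7 (the sample answer)
```

Part 2 is the same call with a window size of 14.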

Screenshot:
image

SLP File:
day06_2022_12_06.slp (35.4 KB)

@tlikarish informed me of a way to avoid the sub-pipeline for generate_sequence_array by using the sl.range function from the expression language. A huge revelation for me, and one that will hopefully simplify future pipeline examples!
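For anyone following along outside the IIP, the JavaScript analogue of generating that index array inline (rather than via a sub-pipeline) is a one-liner; the exact sl.range signature may differ, so treat this as the general shape of the idea:

```javascript
// Build [0, 1, …, n-1] inline, replacing the generate_sequence_array pipeline.
const indexes = (n) => Array.from({ length: n }, (_, i) => i);

// indexes(5) → [0, 1, 2, 3, 4]
```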

ddellsperger
Employee

Day 7 was one that I wasn’t sure I’d be able to do in SnapLogic, because of the solution I had done via Java earlier. But after some time, I was able to reason through the problem step-by-step to get to the final data necessary to complete both parts. During a Twitch livestream I watched last night, the creator described this as a “recursion” problem, and that was the red flag suggesting I might not be able to complete it in the IIP.

There are some aspects of recursion, but I was able to split the work into 1) changing directories, 2) taking directory listings, and 3) adding files to all directories under the current directory. By building up those full directory paths AND including all directories potentially impacted at the same time, you kill two birds with one stone (you also make backwards traversal of directories somewhat easier). I still needed one sub-pipeline for the looping that required state, which included building up those “current directories”, because you need the previous directory to build the current one. I initially handled the final calculation of directory size with a sub-pipeline, but the way I was doing it led me to a final solution using an aggregate with group-by, which was cleaner (and faster).
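Here’s the non-recursive core of that idea in plain JavaScript: credit each file’s size to every directory on its path, and the per-directory totals then fall out of a single group-by-style aggregation. The file list below is a made-up stand-in for the parsed terminal output.

```javascript
// Each file carries its full directory path as an array of path segments.
const files = [
  { path: ["root", "a"], size: 100 },
  { path: ["root", "a", "b"], size: 50 },
  { path: ["root"], size: 25 },
];

// Credit every prefix of the path, so ancestors accumulate descendant sizes
// without any recursive directory walk.
const dirSizes = {};
for (const { path, size } of files) {
  for (let depth = 1; depth <= path.length; depth++) {
    const dir = path.slice(0, depth).join("/");
    dirSizes[dir] = (dirSizes[dir] ?? 0) + size;
  }
}

// dirSizes → { "root": 175, "root/a": 150, "root/a/b": 50 }
```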

I think this is a really good example of a problem where your first look suggests one programming/processing design, but you can break the problem down further to support a more linear (rather than recursive) approach. A very cool problem to work with, and kind of cool to see the pipeline design morph through analysis. In fact, as I was typing this post up, I realized I could remove a second sub-pipeline that again wrote state to a file.

Screenshots:
Full Day 7 runtime (calls sub-pipeline below)
Get Current Directory sub-pipeline

SLP Files:
day07_2022_12_07 (1).slp (54.0 KB)
day07_get_current_directory_2022_12_07.slp (19.0 KB)

ddellsperger
Employee

Day 8 was another pretty interesting puzzle, taking a standard grid-like problem and making it a bit more interesting to work with. The first step is almost always collecting all of the grid points and the value associated with each. Luckily, we didn’t have to loop here, because we could copy and generate standard groupings of all rows and all columns, since we’re only looking up, down, left, and right (nothing diagonal). There are a number of mapper steps at the end that MIGHT be combined into fewer steps, but I liked having the ability to turn on “Pass Through” and debug data as necessary during development.

The one thing I’ll say about this problem is that the expression language (and JavaScript) sort is dumb when you don’t provide a callback function, and as a result it should probably be taken out back and given a talking-to (that was a nice bug to resolve only on my puzzle input data). The other note is that some of these operations (as indicated by the warning signs on my “Collect All”) are probably not great for long-term use; if you needed to do this kind of collection, it might be better to stage the data in a local disk file, or group by fields to have fewer collections. With the size of data in these puzzles, though, this is probably fine.
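For anyone who hasn’t hit this before, the sort behavior in question looks like this: without a comparator, JavaScript sorts by string conversion, so numbers come back in lexicographic order. Passing a numeric comparator fixes it.

```javascript
const heights = [3, 10, 2, 9];

// Default sort compares string forms: "10" < "2" < "3" < "9".
const lexicographic = [...heights].sort();

// A numeric comparator restores the expected ordering.
const numeric = [...heights].sort((a, b) => a - b);

// lexicographic → [10, 2, 3, 9]
// numeric       → [2, 3, 9, 10]
```

The bug only bites on inputs where a multi-digit number sorts differently as a string, which is exactly why it showed up on the real puzzle input and not the sample.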

Screenshot:
Day 8 pipeline run, final solution

SLP File:
day08_2022_12_08.slp (48.1 KB)