Advent of Code via SnapLogic IIP

I’ve been a pretty big fan of Advent of Code since I found out about it in 2019, and when I started at SnapLogic in 2021, I figured it could be pretty cool to try using the SnapLogic IIP to solve the Advent of Code daily puzzles (well, not all of them, but at least some). Mostly I wanted to learn better how some of the snaps work, but also to get more experience designing pipelines, since I typically work more on individual snap development.

This year, I figured I’d post about it here in the community to see if others have an interest in attempting to solve the daily puzzles on the SnapLogic IIP. I think a certain number of these problems ARE solvable via the IIP, and some aren’t due to the limitations of the platform.

My ground rules for considering a day solved are:

  • Get the input into a pipeline in whatever way possible, either via file download and read or via a Constant snap (my posted examples will use the sample input with a Constant snap, but my final solutions typically use a file reader)
  • No use of the Script snap (if a puzzle can’t be solved without a Script snap, it’s considered unsolvable, but you’d be surprised what you can do with our snaps alone)
  • No use of external services (databases, REST endpoints, etc.), as those are likely to involve some level of “cheating” similar to a Script snap
  • Basically, use only the transform, flow, and file reader/writer snaps (to read input files; to create, delete, read, and write temporary files; and to write final output files)
  • Pipe Execs are allowed

I figure this might be something that other members of the community are interested in doing. If you want to participate, feel free to join in on the conversation; I figure we can probably keep discussion to this single thread and do replies per day? Not sure how many might be interested, though.

What is Advent of Code?
From the website:

Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like. People use them as interview prep, company training, university coursework, practice problems, a speed contest, or to challenge each other.

You don’t need a computer science background to participate - just a little programming knowledge and some problem solving skills will get you pretty far. Nor do you need a fancy computer; every problem has a solution that completes in at most 15 seconds on ten-year-old hardware.

If you want to join in, go to https://adventofcode.com and connect one of the authentication mechanisms (I use GitHub, but you can auth with Google, Twitter, or Reddit as well). Logging in is required so that you can receive input specific to you.

If you plan to join and want to join a leaderboard for this project, feel free to join my private leaderboard with the code 1645534-1249c834.


I’m uploading my Day 1 solution, but I highly suggest trying this out yourself first. The part I found really hard was splitting out the individual groups of Elves’ items: in the example you might think 3 items is the most, but as I looked at my input, I saw groups of 10+ items. I kind of used a hack I learned involving the “Group by Field” snap with some other strange properties to do this, but once you have the groups, the rest is pretty simple (I think) to solve.
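
For anyone following along in a general-purpose language, the grouping-and-summing logic boils down to something like this plain-JavaScript sketch (the pipeline does this with snaps, not code; the function names here are mine):

```javascript
// Split the input on blank lines into per-Elf groups, sum each group,
// then take the max (part 2 sorts descending and sums the top three).
const totals = (input) =>
  input.trim().split('\n\n')
    .map(group => group.split('\n').reduce((s, n) => s + Number(n), 0));

const part1 = (input) => Math.max(...totals(input));
const part2 = (input) =>
  totals(input).sort((a, b) => b - a).slice(0, 3).reduce((s, n) => s + n, 0);
```
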
Screenshot of Solution (Part 1 on top, Part 2 on bottom):

SLP File:
day01_2022_12_01.slp (28.5 KB)


Day 2 was pretty simple with the SnapLogic IIP; it was row-by-row processing (this happens with some of the Advent of Code puzzles). For the most part, I’m just using a Router for the opponent’s option and then Mappers to determine my score for the given scenario. This is actually a pretty cool example of simply getting started with the Router/Union snaps for people who might be new to the SnapLogic IIP. These kinds of puzzles end up being pretty fun: basically as soon as I see them, I know the IIP will be a great option for solving the problem.
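
As a rough illustration, the per-row scoring that the Router/Mapper branches implement can be sketched as a lookup table in plain JavaScript (part 1 only; A/B/C is the opponent, X/Y/Z is you, and the names are mine):

```javascript
// Shape score: rock = 1, paper = 2, scissors = 3.
const shapeScore = { X: 1, Y: 2, Z: 3 };
// Outcome score for each opponent/you combination: win 6, draw 3, loss 0.
const outcome = {
  'A X': 3, 'A Y': 6, 'A Z': 0,
  'B X': 0, 'B Y': 3, 'B Z': 6,
  'C X': 6, 'C Y': 0, 'C Z': 3,
};
const score = (line) => shapeScore[line.split(' ')[1]] + outcome[line];
```
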

Screenshot:

SLP File:
day02_2022_12_02.slp (29.0 KB)


Day 3 was actually another pretty impressive use of SnapLogic. This time I leveraged some really nice properties of the data: I could simply use the expression language to split the letters in the different compartments or rucksacks (depending on the part), then use a Filter + Unique to get the one letter that’s common between all of them. By keeping the other compartment/rucksack around, the Unique worked better, and frankly it allowed for less usage of the expression language (which I find handy when I’m trying to learn about the different transformation/flow snaps). I also wanted to learn the In-Memory Lookup for aligning the final item or badge to its score, and I feel like I understand that better now too.
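
The compartment-splitting and scoring described above looks roughly like this in plain JavaScript (a sketch for part 1 only; function names are mine, and the pipeline does the scoring with an In-Memory Lookup rather than arithmetic):

```javascript
// Priority: a-z scores 1-26, A-Z scores 27-52.
const priority = (c) =>
  c === c.toLowerCase() ? c.charCodeAt(0) - 96 : c.charCodeAt(0) - 38;

// Split the rucksack into halves and find the letter in both halves.
const commonItem = (sack) => {
  const half = sack.length / 2;
  const left = new Set(sack.slice(0, half));
  return [...sack.slice(half)].find(c => left.has(c));
};
```
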

Screenshot:

SLP File:
day03_2022_12_03.slp (37.9 KB)


Day 4 seemed easier than it probably should’ve been (the first full-weekend puzzle was usually a tough one in previous years); the hardest part was reasoning over the logic for a complete overlap vs. a partial overlap. The problem example had some situations where a solution MIGHT pass locally but fail with your input due to a few edge cases. I made some assumptions (that felt safe to me) about how the data was represented, but so far nothing in Advent of Code has really pushed anything beyond a LITTLE bit of semi-advanced expression language. Based on previous years, though, I feel like I know what’s coming up and will probably soon hit a situation where some basic programming concept simply doesn’t work with IIP flow.
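
The complete-vs.-partial overlap logic can be stated very compactly; here’s a plain-JavaScript sketch of the two checks (names are mine; the pipeline expresses these as expression-language conditions):

```javascript
// "2-8,3-7" -> [[2, 8], [3, 7]]
const parse = (pair) => pair.split(',').map(r => r.split('-').map(Number));

// Part 1: one range fully contains the other.
const fullyContains = ([a, b]) =>
  (a[0] <= b[0] && a[1] >= b[1]) || (b[0] <= a[0] && b[1] >= a[1]);

// Part 2: the ranges overlap at all.
const overlaps = ([a, b]) => a[0] <= b[1] && b[0] <= a[1];
```
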

Screenshot:

SLP File:
day04_2022_12_04.slp (20.3 KB)

Day 5 was the first day that really became difficult in SnapLogic, mostly because you need to loop while saving state from each iteration in order to process the individual moves. I also had to do this for the creation of the initial data configuration, which required its own loop with state preserved. I used the same file for both of these, so it’s probably not a huge deal, but this is one of those situations where you effectively have to create a pipeline that reads from a file and writes to that same file to obtain and preserve state between iterations (via input documents).

Some callouts here: when doing my internal pipeline testing, I used an internal JSON Generator, because state is preserved somewhere other than the pipeline itself. This made testing somewhat hard, but it allowed me to change values as necessary; when it came time to integrate the pipeline, I’d just disconnect and disable the JSON Generator and run the pipelines as required. It might be nice to have some way to debug this more readily. I know we have the replay snap, but I found it didn’t really work here, because unlike a normal pipeline, I have to wait for full pipeline executions before running another round, so that became more cumbersome than just testing a single instance.

Maybe the big callout here is to use files local to the Snaplex rather than files in SLDB or on the web somewhere; while those files SEEM fast, access to a file local to the plex is much faster. This might be one of those cases where it would be nice to have a plex-centric internal cache specifically for state-based looping within SnapLogic: set the state prior to a Pipe Exec (or maybe as part of it?) and then synchronize on the output of the Pipe Exec to read the final result. This form of looping becomes VERY cumbersome, but going through it now means it should be easier in the future.

I also found that parsing the moves (“move x from y to z”) was a two-snap process using regex: the first snap gets the regex output and the second maps that output appropriately. I’m not sure how common this type of parsing is, but it might be worth putting some sort of document-based parser logic into a snap to make this kind of parsing easier in the future.
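
For reference, the two-step parse collapses into a few lines of plain JavaScript (the field names `count`/`from`/`to` are my own choice, not anything from the pipeline):

```javascript
// Match "move x from y to z" and name the captured numbers.
const parseMove = (line) => {
  const m = line.match(/move (\d+) from (\d+) to (\d+)/);
  return { count: Number(m[1]), from: Number(m[2]), to: Number(m[3]) };
};
```
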

Screenshots



SLP Files:
Main Pipeline:
day05_2022_12_05.slp (78.3 KB)

Puzzle Generation pipeline:
day05_generator_2022_12_05.slp (13.6 KB)

Puzzle Processing pipeline:
day05_processor_2022_12_05.slp (16.8 KB)


Day 6 has the fun situation where you have to do a sliding window over the data. I’ve done this a FEW times in the past, and the typical easiest way is to generate an array with all of the potential start (or end) indexes available to you. Since the Sequence snap only allows pipeline parameters, I have a pipeline saved from last year where you pass a value to the Pipe Exec and it returns an array of the appropriate size based on the pipeline parameter. Here’s that pipeline:

Screenshot:

SLP File:
generate_sequence_array_2022_12_06.slp (4.7 KB)

Okay, now that we’ve cleared up the hardest part of this problem: my final solution for Day 6 uses generate_sequence_array to generate an array the size of the input data and then build the 4-character sequences. Once I had those, the rest was a Filter, a Group by Field, another Filter, and a final Aggregate snap to get the proper answer. I only changed a few things between the two parts. It was a generally tricky, but not impossible, problem.
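
The sliding-window idea itself is compact in a general-purpose language; here’s a hedged plain-JavaScript sketch (name is mine; part 2 is the same call with a window of 14):

```javascript
// Walk every end index, slice a fixed-size window, and return the
// number of characters processed when the first all-unique window appears.
const firstMarker = (input, size = 4) => {
  for (let i = size; i <= input.length; i++) {
    const win = input.slice(i - size, i);
    if (new Set(win).size === size) return i;
  }
  return -1;
};
```
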

Screenshot:

SLP File:
day06_2022_12_06.slp (35.4 KB)

@tlikarish informed me of a way to avoid the sub-pipeline for generate_sequence_array by using the sl.range function from the expression language. A huge revelation for me, and it will hopefully simplify future pipeline examples that I might need to make!
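
For anyone unfamiliar with it, sl.range is SnapLogic’s expression-language helper for generating a numeric sequence; a plain-JavaScript equivalent (for illustration only, and assuming a step of 1) would be:

```javascript
// Build [start, start + 1, ..., end - 1], similar to a numeric range helper.
const range = (start, end) =>
  Array.from({ length: end - start }, (_, i) => start + i);
```
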


Day 7 was one that I wasn’t sure I’d be able to do in SnapLogic because of the solution I wrote in Java earlier. But after some time, I was able to reason through the problem step-by-step to get to the final data necessary to complete both parts. During a Twitch livestream I watched last night, the creator described this as a “recursion” problem, and that was the red flag suggesting I might not be able to complete it in the IIP.

There are some aspects of recursion, but I found a way around it by splitting things into 1) changing directories to get all possible 2) directory listings, then 3) adding files to all directories under the current directory. By building up those full directory paths AND including all directories potentially impacted by them at the same time, you kill two birds with one stone (you also make backwards traversal of directories somewhat easier). I still needed one sub-pipeline for the looping that required state, which included building up those “current directories”, because you need the previous directory to build the current one. I initially reasoned through the final calculation of directory size with a sub-pipeline, but the way I was doing it led me to a final solution using an Aggregate with a group-by, which was cleaner (and faster).
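
The linear (non-recursive) idea described above can be sketched in plain JavaScript: attribute each file’s size to every directory on its path, then group and sum. All names here are mine, and the input shape is an assumption for illustration:

```javascript
// "/a/b" -> ["/", "/a", "/a/b"]: every directory that should receive
// this file's size.
const ancestors = (path) => {
  const parts = path.split('/').filter(Boolean);
  return ['/', ...parts.map((_, i) => '/' + parts.slice(0, i + 1).join('/'))];
};

// files: [{ dir: "/a/b", size: 100 }, ...] -> total size per directory.
const directorySizes = (files) => {
  const sizes = {};
  for (const { dir, size } of files)
    for (const d of ancestors(dir)) sizes[d] = (sizes[d] || 0) + size;
  return sizes;
};
```
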

I think this is a really good example of a problem where your first look calls for one way of programming/processing design, but you’re very much able to break the problem down further to support a more linear approach (rather than a recursive one). Very cool problem to work with, and kind of cool to see the pipeline design morph under analysis. It was actually while typing this post up that I realized I could remove a second sub-pipeline that again wrote state to a file.

Screenshots:


SLP Files:
day07_2022_12_07 (1).slp (54.0 KB)
day07_get_current_directory_2022_12_07.slp (19.0 KB)


Day 8 was another pretty interesting puzzle, taking a standard grid-like problem and making it a bit more interesting to work with. The first step is almost always collecting all of the grid points and the value associated with each. Luckily we didn’t have to loop here, because we could copy and generate standard groupings of all rows and all columns, since we’re only looking up, down, left, and right (nothing diagonal). There are a number of Mapper steps at the end that MIGHT be combinable into fewer steps, but I liked having the ability to turn on “Pass Through” and debug data as necessary during development.

The one thing I’ll say about this problem is that the expression language (and JavaScript) sort does something dumb when you don’t provide a callback function: it compares elements as strings. It should probably be taken out back and given a talking-to (that was a nice bug to resolve only on my puzzle input data). The other note is that some of these operations (as indicated by the warning signs on my “Collect All”) are probably not great for long-term use; if you needed this kind of collection, it might be better to stage in a local disk file or do Group by Fields to have fewer collections. With the size of data in these puzzles, this is probably fine, though.
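
To make the sort pitfall concrete, here is what JavaScript’s default sort does with numbers (the expression language follows JavaScript here, per the bug described above):

```javascript
// Without a comparator, Array.prototype.sort converts elements to
// strings, so 100 sorts before 2. A numeric comparator fixes it.
const heights = [9, 30, 2, 100];
const lexicographic = [...heights].sort();          // string order
const numeric = [...heights].sort((a, b) => a - b); // numeric order
```
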

Screenshot:

SLP File:
day08_2022_12_08.slp (48.1 KB)


Day 9 started the fun of “the logic is easy, but since we need continual state, it’s going to take a long time” Advent of Code solutions. My sample pipeline solution took 1 minute to run, and that was for a total of under 100 “head” movements. Then I saw the input file had 2,000 lines and realized this was going to take over an hour to complete. My first iteration processed 1.7 documents per second (where each document is a 1-unit step in the puzzle); my input file had 11,240 steps, so that took roughly 2 hours to complete (8 minutes short of 2 hours). I’m currently running with the setup you see in this post; I assume it will work properly, and I’m currently processing 2.4-2.5 documents per second. Based on the new speed, it SHOULD complete in about 80 minutes. This is still a long time, but I’m still impressed that this is even possible in SOMEWHAT of a normal timeframe.

The speed improvement: rather than having the tail-movement pipeline read from a file and write to a file, I’m simply having it update the data in place, with a pipeline parameter to define the index of the output array to actually process. Previously the index and filename were passed in as parameters, the file was read, and then all processing was done. That disk I/O really does impact processing time, so these are the scenarios where it’d be nice to have something for handling state. I’d say today’s puzzle is more a practice in patience and in really testing every step of a sub-pipeline you may need to run. While testing is hard, you can see that I’ve just leveraged a JSON Generator (which I disable and disconnect for the final run) for the internal values, and I also just disable the pipeline calls to get the initial state configured to begin with.
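
For context, the per-step logic the tail-move sub-pipeline applies is small; the expensive part is preserving state between the thousands of steps. A hedged plain-JavaScript sketch of one tail update (names are mine):

```javascript
// Move the tail at most one unit on each axis toward the head, and
// only when it isn't already adjacent (or overlapping).
const follow = (head, tail) => {
  const dx = head.x - tail.x, dy = head.y - tail.y;
  if (Math.abs(dx) <= 1 && Math.abs(dy) <= 1) return tail; // touching
  return { x: tail.x + Math.sign(dx), y: tail.y + Math.sign(dy) };
};
```
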

Another interesting problem, and unfortunately a very long runtime to get the solution. Below are the screenshots of the pipeline runtimes, then the pipelines themselves and SLP files (I’ll edit/update to add the second runtime screenshot).
With sub-sub-pipeline reading/writing file (1 hour, 51 minutes):
sub-sub pipeline reads from/writes to file for state

With sub-sub-pipeline just editing the structure itself (1 hour, 18 minutes):
sub-sub pipeline simply stacks calls and data

Screenshots:



SLP Files:
day09_2022_12_09.slp (32.4 KB)
day09_move_2022_12_09.slp (38.3 KB)
day09_tail_move_2022_12_09.slp (10.7 KB)

Day 10 was a rather interesting problem to do via SnapLogic, and I thought I might again have to deal with state-saving in a loop, but I found a way around it this time (yes, I’m very excited about this). We know the input is pretty small based on part 2, and by collecting all inputs, you can calculate the value at a given index with a slice and then a reduce over that slice. The hardest part was determining what each part really wanted from those values (I struggled with that when I solved the problem in Java at midnight).

Frankly, nothing about this solution stands out, other than again finding a way to re-imagine the problem in a different frame. We could’ve done what we did yesterday for Day 9 and saved state in a loop, but with the smaller input data, I figured this was a pretty good way to show how you can avoid having to save state in these kinds of situations.
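
The slice-then-reduce idea can be sketched in plain JavaScript, assuming the per-cycle register deltas have already been collected into one array (names mine; noop contributes one 0, addx contributes a 0 then its value):

```javascript
// Register value DURING the given cycle: start at 1 and sum every
// delta that completed before that cycle.
const registerAt = (deltas, cycle) =>
  1 + deltas.slice(0, cycle - 1).reduce((s, d) => s + d, 0);
```
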

Screenshot:

SLP File:
day10_2022_12_10.slp (38.1 KB)


Day 11 is one of those days/puzzles where SnapLogic simply makes things take a long time, but there were some shortcuts that I could and did take to speed it up. Every time you have to preserve state, you effectively incur a time penalty (this is why the Script snap can be so handy in situations like these). My current solution for part 2 still hasn’t completed, but I’m confident (comparing to my code-based one) that it will work without any issues. Currently, part 2 (which requires 10,000 rounds) is running at 1.2 rounds per second, so it SHOULD finish in roughly 2 hours, 19 minutes, which seems to be yet another repeat of Day 9 with respect to speed. The key to speeding up these iterations, again, is using plex-local files (file:///tmp/file).

For this solution, I have 5 total pipelines: one for the full day (day11), then a round pipeline and a per-monkey processing pipeline for each part, as I found some performance impact when using a pipeline parameter to determine downstream operations and flip the inner per-monkey processing loop. While the two inner processing loops are similar, they weren’t similar enough (in my opinion) to handle with a pipeline parameter, though I did handle the example (4 monkeys) vs. input (8 monkeys) in a fairly good way. You’ll notice that my “round” pipelines have 8 Pipeline Executes in a row, which all layer on top of each other; since I know at most we’d have 8 monkeys to process, I figured repeated Pipeline Executes made more sense in the end than trying to do another loop with saved state. I already knew that saving state for 10,000 iterations would be rough, and multiplying that by 8 would slow it down even more, so in this case taking a manual step can really help (if you know the scope of the limitations). This is also why there’s a Router and Union for instances where the inner array doesn’t match the number being processed; it short-circuits the processing a bit.

At midnight when I finished my coded solution, I never thought I’d be able to finish this puzzle using SnapLogic. It’s a LONG runtime, but I’m still pretty impressed with being able to write up the solution and execute it in an amount of time that might be considered “valid” by some definition. You’ll notice the part 1 and part 2 specific pipelines LOOK the same, and they’re both very close, but the inner pieces are different. I probably could combine those into just 2 pipelines rather than 4, but I figured keeping down the potential introduction of more issues was the best way forward. I’ll make a few more updates (after it runs successfully) and report back as necessary, but at least the outer loop should be able to be generic to both parts without too many issues.

Screenshots:





SLP Files:
day11_2022_12_11.slp (55.5 KB)
day11_round_part1_2022_12_11.slp (28.3 KB)
day11_process_monkey_part1_2022_12_11.slp (31.9 KB)
day11_round_part2_2022_12_11.slp (28.5 KB)
day11_process_monkey_part2_2022_12_11.slp (32.3 KB)


After 2 hours, 10 minutes it completed and the answer is right!

Day 12 is one of those that I’m not sure is possible without scripts in SnapLogic. This problem almost entirely relies on algorithms like A*, and graph traversal with recursive creation of paths is simply not something that’s really going to be possible with SnapLogic, sadly. I’ll probably work on parsing the data into a grid, generating candidate paths in a Script snap of some sort, and then doing the final processing via SnapLogic, but I’m not sure it will provide all that much value in the end. This is sometimes the problem with programming puzzles: they require some things which the SnapLogic IIP can’t really perform with its standard snaps, but we’ll see what kind of things we can work through with these problems.


Day 12 is one that I couldn’t leave alone, and while I don’t consider it “solved” by any stretch of the imagination (maybe I’ll convert it eventually), I do consider this a somewhat viable pattern for other circumstances, specifically when you need an indeterminate loop like this. I figured I’d mess with the Script snap, and since Python is my #2 language, write some Python. Below are my screenshot and the now-attached pipeline (technically, I think I could solve this with a single loop across all coordinates, but maybe I’ll work on that eventually and link it back here). I still did the processing to extract the data from the input file with Mappers, JSON Splitters, Filters, etc., then passed the data to the Script snap to calculate the shortest distance available. The same script was used in both pipelines (by design), and I’ll likely work on a better representation of it using nested Pipeline Execute calls, one to stage the base “data” and a second to process it (we already know processing 10,000 items takes a while with this setup, so I can only imagine that would take at least a day to process).
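
For reference, the shortest-distance logic the Script snap handles is a standard breadth-first search. The actual script was Python; this plain-JavaScript sketch is only an illustration, and it assumes the input has already been parsed into a 2-D array of numeric heights:

```javascript
// BFS over the grid: you may step to a neighbor whose height is at
// most one greater than your current height. Returns the step count
// from start to end, or Infinity if unreachable.
const shortestPath = (grid, start, end) => {
  const rows = grid.length, cols = grid[0].length;
  const dist = new Map([[`${start[0]},${start[1]}`, 0]]);
  const queue = [start];
  while (queue.length) {
    const [r, c] = queue.shift();
    if (r === end[0] && c === end[1]) return dist.get(`${r},${c}`);
    for (const [dr, dc] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
      const nr = r + dr, nc = c + dc, key = `${nr},${nc}`;
      if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
      if (dist.has(key) || grid[nr][nc] > grid[r][c] + 1) continue;
      dist.set(key, dist.get(`${r},${c}`) + 1);
      queue.push([nr, nc]);
    }
  }
  return Infinity;
};
```
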

Screenshot:

SLP File:
day12_2022_12_12.slp (49.5 KB)


Day 13 seems like another “probably can’t do it in SnapLogic” problem. We’d have some hope if there were some sort of value output for each individual row of the input data that we could compare, etc. We’ll put Day 13 on hold for the SnapLogic IIP (it was almost always a known that we’d hit some limits with these puzzles). This is one where both parts would effectively live entirely in the Script snap; we could somewhat work around that yesterday by outputting some intermediate values in part 2, but our Sort snap doesn’t take an arbitrary comparator, which makes this more difficult.
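
To show why an arbitrary comparator matters here: the packet comparison Day 13 calls for is inherently recursive, which a fixed-key sort can’t express. A plain-JavaScript sketch of that comparator (names are mine):

```javascript
// Compare two packets: negative means a sorts before b.
// Numbers compare numerically; a number vs. a list is wrapped in a
// list first; lists compare element-by-element, then by length.
const compare = (a, b) => {
  if (typeof a === 'number' && typeof b === 'number') return a - b;
  if (typeof a === 'number') return compare([a], b);
  if (typeof b === 'number') return compare(a, [b]);
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    const c = compare(a[i], b[i]);
    if (c !== 0) return c;
  }
  return a.length - b.length;
};
```
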


Day 14 needed another round of fairly involved recursion in order to work, and Day 15 is something I think might be possible, but likely not something I’ll do today, so I’ll post back if I’m able to solve it via SnapLogic at some point. Some parts will be easier than others; part 1 of Day 15 seems a bit more possible than part 2, just based on the loops that I wrote.


Okay, it’s been a while since I’ve posted here; basically every day since Day 12 or 13 has required some sort of while-loop shenanigans (typically for a breadth-first or depth-first search). There have been a few days since then where maybe ONE part is possible, and I might look at doing at least the first part(s) of a few of those days. (My Java solutions have personally taken longer than they probably should’ve, so I didn’t want to throw too much more time into it.)


This series is really helpful, @ddellsperger; I will go through it all again in the new year.
I missed the initial days but will be catching up soon.
