Forum Discussion

ddellsperger
3 years ago

Advent of Code via SnapLogic IIP

I’ve been a pretty big fan of Advent of Code since I found out about it in 2019, and when I started with SnapLogic in 2021 I figured it could be pretty cool to try to use the SnapLogic IIP to solve all (well, not all, but at least some) of the Advent of Code daily puzzles, mostly to learn better how some of the snaps work, but also to get more experience designing pipelines, since I typically work more on individual snap development.

This year, I figured I’d post about it here in the community to see if others have an interest in attempting to solve the daily puzzles on SnapLogic IIP. I think a good number of these problems ARE solvable via the IIP, and some simply aren’t.

My ground rules for considering a day solved are:

  • Get the input into a pipeline in whatever way possible, either via file download and read, or via a Constant snap (my posted examples will use the sample input with a Constant snap, but my final solutions typically use a file reader)
  • No use of the Script snap (if it can’t be solved without a Script snap, it’s considered unsolvable, but you’d be surprised what you can do without one using our other snaps)
  • No use of external services (databases, REST endpoints, etc.), as those are likely to involve some level of “cheating” similar to a Script snap
  • Basically, using only the transform, flow, and file reader/writer snaps (to read input files; create, delete, read, and write temporary files; and write final output files)
  • Pipe Execs are allowed

I figure this might be something that other members of the community might be interested in doing. If you want to participate, feel free to join in on the conversation; I figure we can probably keep discussion to a single thread and do replies per day. Not sure how many might be interested in this, though.

What is Advent of Code?
From the website:

Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like. People use them as interview prep, company training, university coursework, practice problems, a speed contest, or to challenge each other.

You don’t need a computer science background to participate - just a little programming knowledge and some problem solving skills will get you pretty far. Nor do you need a fancy computer; every problem has a solution that completes in at most 15 seconds on ten-year-old hardware.

If you want to join in, go to https://adventofcode.com and connect one of the authentication mechanisms (I use GitHub, but you can auth with Google, Twitter, or Reddit as well). Logging in is required so that you can receive input specific to you.

If you plan to join and want to join a leaderboard for this project, feel free to join my private leaderboard with the code 1645534-1249c834.

20 Replies

  • Day 11 is one of those days/puzzles where SnapLogic simply makes things take a long time, but there were some shortcuts that I could and did take to speed it up. Every time you have to preserve state, you effectively incur a time penalty (this is why the script snap can be so handy in situations like these). My current solution for part 2 still hasn’t completed, but I’m confident (when comparing to my code-based one) that it will work without any issues. Currently, part 2 (which requires 10,000 iterations) is running at 1.2 rounds per second, so that SHOULD finish in 2 hours, 19 minutes (roughly) which seems to be yet another repeat of Day 9 with respect to speediness. The key with speeding up these iterations, again, is using plex-local files (file:///tmp/file).

    For this solution, I have 5 total pipelines: one for the full day (day11), then two for each major part (a round pipeline and a per-monkey pipeline for part 1 and part 2), as I found a performance impact from using a pipeline parameter to determine further downstream operations and flip the inner per-monkey processing loop. While the two inner processing loops are similar, they weren’t similar enough (in my opinion) to really handle with a pipeline parameter, though I did handle the example (4 monkeys) vs. actual (8 monkeys) input in a fairly good way (in my opinion). You’ll notice that my “round” pipelines have 8 pipeline executes in a row, which all layer on top of each other; since I know at most we’d have 8 monkeys to process, I figured repeated pipeline executes made more sense in the end than trying to do another loop with saved state. I already knew that 10,000 iterations would be rough when saving state for each of them, and multiplying that by 8 would slow it down even more, so in this case taking a manual step can really help (if you know the scope of the limitations). This is also why there’s a router and union for instances where the inner array doesn’t match the number being processed; it short-circuits the processing a bit more.

    At midnight when I finished my coded solution, I never thought I’d be able to finish this puzzle using SnapLogic. It’s a LONG runtime, but I’m still pretty impressed with being able to write up the solution and execute it in some amount of time that might be considered “valid” by some definition. You’ll notice the part 1 and part 2 specific pipelines LOOK the same, and they’re both very close, but it’s the inner pieces that are different. I probably could combine those into just 2 pipelines rather than 4, but I figured keeping down the potential introduction of more issues was the best way to go forward. I’ll make a few more updates (after it runs successfully) and report back as necessary, but at least the outer loop should be able to be generic to both parts without too many issues.
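    To make the state-carrying pattern a little more concrete, here’s a minimal Python sketch of what one “round” is doing conceptually. The file path, helper name, and max_monkeys default are my own illustration of the approach described above, not the actual pipeline internals.

    ```python
    import json
    from pathlib import Path

    # Hypothetical stand-in for the plex-local file (file:///tmp/...) that
    # carries the monkey state from one round to the next.
    STATE_FILE = Path("/tmp/day11_state.json")

    def run_round(state_path: Path, process_monkey, max_monkeys: int = 8) -> None:
        """One 'round': load state, run each monkey slot in order, save state.

        The fixed max_monkeys loop mirrors the 8 chained pipeline executes;
        slots beyond the actual monkey count are skipped, like the
        router/union short-circuit described above.
        """
        state = json.loads(state_path.read_text())
        for idx in range(max_monkeys):
            if idx < len(state["monkeys"]):
                state = process_monkey(state, idx)
        state_path.write_text(json.dumps(state))
    ```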

    Screenshots:




    SLP Files:
    day11_2022_12_11.slp (55.5 KB)
    day11_round_part1_2022_12_11.slp (28.3 KB)
    day11_process_monkey_part1_2022_12_11.slp (31.9 KB)
    day11_round_part2_2022_12_11.slp (28.5 KB)
    day11_process_monkey_part2_2022_12_11.slp (32.3 KB)

    • ddellsperger
      Admin

      After 2 hours, 10 minutes it completed and the answer is right!

  • Day 12 is one of those that I’m not sure is possible without scripts in SnapLogic. This problem almost entirely relies on algorithms like A* to complete, and graph traversal with recursive creation of paths is simply not something that is really going to be possible with SnapLogic, sadly. I’ll probably work on parsing the data into a grid, generating candidate paths in a script snap of some sort, and then doing the last processing via SnapLogic, but I’m not sure it will provide all that much value in the end. This is sometimes the problem with programming puzzles: they require some things which SnapLogic IIP can’t really perform with its standard snaps, but we’ll see what we can work through with these problems.

    • ddellsperger
      Admin

      Day 12 is one that I couldn’t leave alone, and while I don’t consider this “solved” by any stretch of the imagination (maybe I’ll convert it eventually), I do consider it a somewhat viable solution for other circumstances, specifically when you need an indeterminate loop like this. For this, I figured I’d mess with the script snap, and since Python is my #2 language, write some Python. Below are my screenshot and the now-attached pipeline (technically, I think I could solve this with a single loop across all coordinates, but maybe I’ll work on that eventually and link it back here). I still did the processing to extract the data from the input file within Mappers, JSON Splitters, Filters, etc., and then passed the data to the script snap to calculate the shortest distance available. The same script was used in both pipelines (by design), and I will likely work on a better representation of it using nested pipeline execute calls, one to stage the base “data” and a second to process it (we already know processing 10,000 items takes a while with this setup, so I can only imagine this would take at least a day to process).
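      Since the actual script isn’t posted in the thread, here’s a rough Python sketch of the kind of shortest-distance calculation a script snap could do for a grid like this, using a plain breadth-first search; the function and parameter names are just for illustration.

      ```python
      from collections import deque

      def shortest_distance(grid, start, goal, can_step):
          """Breadth-first search over a 2D grid.

          Returns the fewest steps from start to goal, or None if unreachable.
          can_step(a, b) encodes whatever rule governs moving from cell value a
          to neighbouring cell value b (for Day 12, the height constraint).
          """
          rows, cols = len(grid), len(grid[0])
          seen = {start}
          queue = deque([(start, 0)])
          while queue:
              (r, c), dist = queue.popleft()
              if (r, c) == goal:
                  return dist
              for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                  if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
                      if can_step(grid[r][c], grid[nr][nc]):
                          seen.add((nr, nc))
                          queue.append(((nr, nc), dist + 1))
          return None
      ```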

      Screenshot:

      SLP File:
      day12_2022_12_12.slp (49.5 KB)

  • Day 13 seems like another “probably can’t do it in SnapLogic” problem. We’d have some hope if there were some sort of value output for each individual row of the input data when comparing it, etc. We’ll put Day 13 on hold for SnapLogic IIP (it was almost always a known possibility that we’d hit some limits with the puzzles). This is one where both parts would effectively be all in the script snap; we could somewhat work around that yesterday by outputting some intermediate values in part 2, but our Sort snap doesn’t take an arbitrary comparator, which makes this more difficult.
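    For reference, here’s a rough Python sketch (my paraphrase of the puzzle’s comparison rules, not code from this thread) of the kind of recursive comparator the Sort snap would need to accept for Day 13’s nested packets.

    ```python
    from functools import cmp_to_key

    def compare(left, right):
        """Recursive comparator for nested int/list packets: negative if
        left sorts before right, positive if after, zero if equal."""
        if isinstance(left, int) and isinstance(right, int):
            return left - right
        if isinstance(left, int):
            left = [left]
        if isinstance(right, int):
            right = [right]
        for a, b in zip(left, right):
            result = compare(a, b)
            if result != 0:
                return result
        return len(left) - len(right)

    # Sorting with an arbitrary comparator, which the Sort snap can't do:
    packets = [[1, [2, 3]], [1], [[1], 4]]
    ordered = sorted(packets, key=cmp_to_key(compare))
    ```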

  • Day 14 needed another round of fairly deep recursion in order to work, and Day 15 is something I think might be possible, but likely not something I’ll do today, so I’ll post back if I’m able to solve it via SnapLogic at some point. Some parts will be easier than others; part 1 of Day 15 seems a bit more possible than part 2, just based on the loops that I wrote.

  • joel_bourgault
    New Contributor III

    Thanks for sharing @ddellsperger ! That’s inspiring, showing how SnapLogic paradigms and functions can be used to solve such problems.

  • Day 6 has the fun situation where you have to do a sliding-window traversal of the data. I’ve done this a FEW times in the past, and know that the typical easiest way to do it is to generate an array with all of the potential start or end indexes available to you. Since the Sequence snap only allows pipeline parameters, I have a pipeline saved from doing this last year where you pass a value to the pipe exec and it returns an array of the appropriate size based on the pipeline parameter. Here’s that pipeline:

    Screenshot:

    SLP File:
    generate_sequence_array_2022_12_06.slp (4.7 KB)

    Okay, now that we’ve cleared up the hardest part of this problem: my final solution for Day 6 was to use this generate_sequence_array to generate an array the size of the input data and then generate the 4-character sequences. Once I did that, the rest of the processing was a filter, a group-by-field, a further filter, and a final aggregate snap to get the proper answer. I only changed a few things between the two parts; it was a generally tricky, but not impossible, problem.
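    In code terms, the sliding-window check the pipeline is performing looks roughly like the Python below; this is only a sketch of the idea, with the filter/group-by/aggregate chain standing in for the set-uniqueness test.

    ```python
    def first_marker(data: str, window: int = 4) -> int:
        """Return the 1-based position just past the first run of `window`
        distinct characters (window=4 for part 1, 14 for part 2)."""
        # The range of start indexes plays the same role as the
        # generate_sequence_array pipeline described above.
        for start in range(len(data) - window + 1):
            if len(set(data[start:start + window])) == window:
                return start + window
        raise ValueError("no marker found")

    # Sample input from the puzzle description:
    # first_marker("mjqjpqmgbljsphdztnvjfqwrcgsmlb") == 7
    ```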

    Screenshot:

    SLP File:
    day06_2022_12_06.slp (35.4 KB)

    • ddellsperger
      Admin

      @tlikarish informed me of a way to avoid the sub-pipeline for generate_sequence_array by using the sl.range function from the expression language. Huge revelation for me, and it will hopefully simplify future pipeline examples that I might need to make!

  • Day 9 started the fun of “the logic to solve this problem is easy, but since we need continual state, it’s going to take a long time” Advent of Code solutions. My sample pipeline solution took 1 minute to run, and that was for a total of under 100 “head” movements. Then I saw the input file had 2,000 lines and realized this was going to be an over-1-hour process to complete. My first iteration was able to process 1.7 documents per second (where each document is a 1-unit step in the puzzle); my input file had 11,240 steps, so that took roughly 2 hours to complete (8 minutes short of 2 hours). I’m currently running with the setup you see in this post, which I assume will work properly, and it’s processing at 2.4-2.5 documents per second. Based on the new speed, it SHOULD complete in about 80 minutes. This is still a long time, but I’m still impressed that this is even possible in SOMEWHAT of a normal timeframe.

    The improvement in speed is that rather than having the tail-movement pipeline read from a file and write to a file, I’m simply having it update the data in place, with a pipeline parameter to define the index of the output array to actually process. This was previously done via a passed-in parameter for the index and the filename the data was read from, with all processing done from there. That disk IO really does impact the processing time, so these are the scenarios where it’d be nice to have somewhere to keep state. I’d say that today’s puzzle is more of an exercise in patience and in really testing every step of any sub-pipeline that you may need. While testing is hard, you can see that I’ve just leveraged a JSON Generator for the internal values, which I disable and disconnect for the final run. I also just disable the pipeline execute calls to get the first state of things configured to begin with.
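    For context, the per-document logic that the tail-move sub-pipeline implements is tiny; in Python it’s roughly the following (my paraphrase of the puzzle’s rule, not the pipeline itself), which is why the runtime is dominated by carrying state between documents rather than by the math.

    ```python
    def follow(head, tail):
        """One 'tail move': if the tail is no longer adjacent to the head,
        step it one cell toward the head (diagonally if needed).
        Positions are (x, y) tuples; returns the new tail position."""
        dx, dy = head[0] - tail[0], head[1] - tail[1]
        if abs(dx) <= 1 and abs(dy) <= 1:
            return tail  # still touching, no move needed
        sign = lambda d: (d > 0) - (d < 0)
        return (tail[0] + sign(dx), tail[1] + sign(dy))
    ```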

    Another interesting problem, and unfortunately a very long runtime to get the solution. Below are the screenshots from the runtime of the pipelines, then the pipelines themselves as SLP files (I’ll edit/update to add the second runtime screenshot).
    With sub-sub-pipeline reading/writing file (1 hour, 51 minutes):

    With sub-sub-pipeline just editing the structure itself (1 hour, 18 minutes):

    Screenshots:


    SLP Files:
    day09_2022_12_09.slp (32.4 KB)
    day09_move_2022_12_09.slp (38.3 KB)
    day09_tail_move_2022_12_09.slp (10.7 KB)