If I want to compare two consecutive documents in an ordered stream (compare the current to the previous or next), is the best way to do that with array indexing functions in mappers/conditionals?
What kind of comparison do you want to do? You’d probably need to do some kind of grouping on the documents upstream before you do the comparison.
I’m assuming the documents have the same structure but not the same values. I just want to see the differences in the values between two consecutive documents in a single stream.
By default, SnapLogic’s Snaps don’t have multiple documents available at the same time (there are exceptions, for instance the Unique Snap, which compares each document with the previous one and, if they are identical, passes only one through). If you want to do that, you’d have to use a Script Snap and read multiple documents, or build your own Snap that does the same in Java.
You could always do some convoluted copying and joining of streams with n+1 as the lookup value. It’s kinda jerky, but it could achieve the same thing, fairly efficiently as long as you aren’t processing large volumes of data.
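To give a rough idea of the logic such a script would need, here is a minimal sketch in plain Python. It is not the Script Snap API: `diff_consecutive` and the sample documents are made up for illustration, and it assumes each document is a flat dictionary.

```python
def diff_consecutive(docs):
    """Yield, for each document after the first, the fields whose values
    differ from the previous document, as {field: (old, new)}."""
    previous = None
    for current in docs:
        if previous is not None:
            # Look at every key that appears in either document.
            keys = sorted(set(previous) | set(current))
            yield {
                key: (previous.get(key), current.get(key))
                for key in keys
                if previous.get(key) != current.get(key)
            }
        # Only one earlier document is held in memory at a time.
        previous = current


# Hypothetical sample stream, in order.
stream = [
    {"id": 1, "price": 10, "status": "open"},
    {"id": 2, "price": 12, "status": "open"},
    {"id": 3, "price": 12, "status": "closed"},
]
result = list(diff_consecutive(stream))
```

Because only the previous document is kept, this still works in a streaming fashion; a stream of n documents gives n-1 comparisons.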
Here’s what I was thinking.
The GroupByN Snap will group every two consecutive documents into a single document. Note that this groups the first and second, the third and fourth, and so on, not the first and second, second and third, third and fourth. If you wanted to do a more complex grouping like the latter, you’d probably have to muck around with adding an index and joining.
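To make the difference between the two groupings concrete, here is a small sketch in plain Python. It only imitates the behaviour described above and is not the Snap itself; `group_by_n` and `sliding_pairs` are made-up names.

```python
def group_by_n(docs, n=2):
    """Non-overlapping groups: (1st, 2nd), (3rd, 4th), and so on."""
    return [docs[i:i + n] for i in range(0, len(docs), n)]


def sliding_pairs(docs):
    """Overlapping pairs: (1st, 2nd), (2nd, 3rd), (3rd, 4th), and so on."""
    return list(zip(docs, docs[1:]))


docs = ["d1", "d2", "d3", "d4"]
grouped = group_by_n(docs)         # d2 and d3 never meet
overlapping = sliding_pairs(docs)  # every neighbour pair is covered
```

With the non-overlapping grouping, the second and third documents are never compared with each other, which is why the overlapping case needs the index-and-join approach. This sketch keeps a short final group when the document count is odd; how the Snap handles that case is not covered here.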
From there, the example is pretty simple and shows how to use the Mapper to do some basic operations on the data.
Hopefully this is helpful.
Compare one row to the next, in streaming fashion with no code.
Take a look at this pipeline; it lets you compare an input document with the next one in a streaming fashion.
First thing I’m doing is adding the index to the input stream in the first mapper, using the following expression:
In the second mapper in the top part of the flow, I’m getting the index of the target row I want to join it with (TargetKey):
So that we don’t have the first row in the right-hand side of the Join, I skip the first document:
And then I join based on the Index and TargetKey.
And in the output of the join, I see both documents in one, so I can compare the results:
CS new pipeline 1_2020_04_23.slp (8.8 KB)
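For anyone who can’t open the pipeline, the index-and-join idea can be sketched in plain Python. The Mapper expressions themselves are not reproduced in this thread, so the details here are assumptions: a 1-based Index, TargetKey computed as Index - 1, and an inner join. `join_with_next` and the sample rows are made up for illustration.

```python
def join_with_next(docs):
    """Pair each document with the one that follows it, via Index/TargetKey."""
    # First mapper: add an index to every document (ASSUMPTION: 1-based).
    indexed = [{"Index": i, **doc} for i, doc in enumerate(docs, start=1)]

    # Second mapper on the copied branch: each row records the index of the
    # row it should be joined to (ASSUMPTION: TargetKey = Index - 1).
    shifted = [{"TargetKey": d["Index"] - 1, **d} for d in indexed]

    # Skip the first document on the right-hand side of the join.
    right = {d["TargetKey"]: d for d in shifted[1:]}

    # Inner join on Index == TargetKey; the last row has no partner.
    joined = []
    for left in indexed:
        match = right.get(left["Index"])
        if match is not None:
            joined.append({"current": left, "next": match})
    return joined


rows = [{"value": "a"}, {"value": "b"}, {"value": "c"}]
pairs = join_with_next(rows)
```

Each joined document then carries both rows, so the values can be compared side by side in a downstream Mapper.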