02-12-2018 12:56 AM
Hello,
I have a use case where my input arrives in tab-delimited form. After a Group By N, I convert it to an array so that I can run some comparison operations on it.
I need to extract the records from this array group for below mentioned conditions,
I wanted to write a forEach to produce the 3 arrays mentioned above. Once I have the 3 arrays, I need to check whether each value in the 3rd column of array1 is present among the values in the 4th column of array2, and pick only the ones that are present.
I would appreciate some help getting the desired output in the most efficient way.
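Outside of a pipeline, the column check described above could be sketched in Python roughly as follows. This is only an illustration: the function name `pick_matching`, the sample rows, and the assumption that each array is a list of already-split rows are all hypothetical and not taken from the actual data.

```python
def pick_matching(array1, array2, col1=2, col2=3):
    """Keep rows of array1 whose col1 value appears in col2 of some array2 row.

    col1=2 is the 3rd column and col2=3 is the 4th column (zero-based indexes).
    """
    # Build the lookup set once so each membership test is O(1).
    # Rows too short to have the column are ignored.
    wanted = {row[col2] for row in array2 if len(row) > col2}
    return [row for row in array1 if len(row) > col1 and row[col1] in wanted]


# Hypothetical sample rows, only to show the shape of the data.
array1 = [["a", "x", "10"], ["b", "y", "20"], ["c", "z", "30"]]
array2 = [["p", "q", "r", "10"], ["s", "t", "u", "30"]]
print(pick_matching(array1, array2))
```

Using a set for the 4th-column values avoids rescanning array2 for every row of array1, which matters if the groups get large.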
02-12-2018 07:25 AM
Can you post some sample input records and the expected output for that input?
02-13-2018 12:59 AM
Input:
6 1151 10 04222016 0 0 0 4.4 10000978 6 4 UNIX ABC Business.4.G.0011
8 1151 10 04222016 4 0 0 4.4 10000978 6 4 UNIX
4 1111111110029069 01152014 14171 898 898 500000 898 0 000 000 840 000 000 000 000 000 000 000 35928 07222014 10062014
4 1111111110029150 01152014 14171 000 000 1500000 000 0 000 000 840 000 000 000 000 000 000 000 1476 07172014
4 1111111110029440 01152014 14171 000 000 500000 000 0 000 000 840 000 000 000 000 000 000 000 48916 06172014
9 1151 10 04222016 16 2380 0 4.4 10000978 6 4 UNIX
8 1151 10 04222016 3 0 0 4.4 10000978 6 4 UNIX
4 1111111110029069 01152014 14171 898 898 500000 898 0 000 000 840 000 000 000 000 000 000 000 35928 07222014 10062014
4 1111111110029150 01152014 14171 000 000 1500000 000 0 000 000 840 000 000 000 000 000 000 000 1476 07172014
4 1111111110029440 01152014 14171 000 000 500000 000 0 000 000 840 000 000 000 000 000 000 000 48916 06172014
4 1111111110029580 01152014 14171 000 000 500000 000 0 000 000 840 000 000 000 000 000 000 000 147909 06102015 10062014
4 1111111110029630 01152014 14171 000 000 000 0 000 000 840 000 000 000 000 000 000 000
4 1111111110029770 01152014 14171 107398 122210 500000 122210 0 228 000 840 000 000 000 000 000 000 000 228 06182014 07062014
4 1111111110029879 01152014 14171 000 000 2500000 000 0 000 000 840 000 000 000 000 000 000 000 3488 06052014
9 1151 10 04222016 16 2380 0 4.4 10000978 6 4 UNIX
8 1151 10 04222016 31 0 0 4.4 10000978 6 4 UNIX
9 1151 10 04222016 31 0 0 4.4 10000978 6 4 UNIX
8 1151 10 04222016 32 0 0 4.4 10000978 6 4 UNIX
9 1151 10 04222016 32 0 0 4.4 10000978 6 4 UNIX
7 1151 10 04222016 0 141182 122410553 4.4 10000978 6 4 UNIX Business.4.G.0011
Output: 2 arrays
8 1151 10 04222016 4 0 0 4.4 10000978 6 4 UNIX
4 1111111110029069 01152014 14171 898 898 500000 898 0 000 000 840 000 000 000 000 000 000 000 35928 07222014 10062014
4 1111111110029150 01152014 14171 000 000 1500000 000 0 000 000 840 000 000 000 000 000 000 000 1476 07172014
4 1111111110029440 01152014 14171 000 000 500000 000 0 000 000 840 000 000 000 000 000 000 000 48916 06172014
9 1151 10 04222016 16 2380 0 4.4 10000978 6 4 UNIX
And
8 1151 10 04222016 3 0 0 4.4 10000978 6 4 UNIX
4 1111111110029069 01152014 14171 898 898 500000 898 0 000 000 840 000 000 000 000 000 000 000 35928 07222014 10062014
4 1111111110029150 01152014 14171 000 000 1500000 000 0 000 000 840 000 000 000 000 000 000 000 1476 07172014
4 1111111110029440 01152014 14171 000 000 500000 000 0 000 000 840 000 000 000 000 000 000 000 48916 06172014
4 1111111110029580 01152014 14171 000 000 500000 000 0 000 000 840 000 000 000 000 000 000 000 147909 06102015 10062014
4 1111111110029630 01152014 14171 000 000 000 0 000 000 840 000 000 000 000 000 000 000
4 1111111110029770 01152014 14171 107398 122210 500000 122210 0 228 000 840 000 000 000 000 000 000 000 228 06182014 07062014
4 1111111110029879 01152014 14171 000 000 2500000 000 0 000 000 840 000 000 000 000 000 000 000 3488 06052014
9 1151 10 04222016 16 2380 0 4.4 10000978 6 4 UNIX
02-13-2018 06:36 AM
Thanks krupalibshah for sharing the sample data. How big is your input file? I ask because I am not sure this is easily doable in SnapLogic: when you build the arrays, you need to keep track of the previous record too, so that you know which array the current record belongs in. I wondered whether a script task could do this for you, but I don't think it can, because a script task works on a single record. However, if your input file is not that big, you can easily do this in any programming or scripting language, such as Python.
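A minimal Python sketch of that idea follows. It assumes, based only on the sample above, that the record type is the first field, that a group starts at a type 8 record and ends at the next type 9 record, and that groups with no type 4 records in between are dropped. The function name `split_groups` and the short demo lines are hypothetical.

```python
def split_groups(lines, start="8", end="9", detail="4"):
    """Split lines into start..end groups, keeping groups with detail records."""
    groups = []
    current = None  # None means we are outside any start..end block
    for line in lines:
        fields = line.split()  # splits on tabs or spaces
        if not fields:
            continue  # skip blank lines
        rec_type = fields[0]
        if rec_type == start:
            current = [line]  # open a new group
        elif current is not None:
            current.append(line)
            if rec_type == end:
                # Keep the group only if it has at least one detail record
                # between the start and end records.
                if any(l.split()[0] == detail for l in current[1:-1]):
                    groups.append(current)
                current = None  # close the group
    return groups


# Shortened, made-up lines that mimic the record types in the sample.
lines = ["6 header", "8 a", "4 x", "4 y", "9 a", "8 b", "9 b", "7 trailer"]
for group in split_groups(lines):
    print(group)
```

The only state carried between records is `current`, so the file can be read line by line without loading it all at once.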
02-15-2018 02:10 PM
Actually, a script task doesn’t really work on a single record. It is simply that the example has you read in one record at a time.
For example, I had to create a special CSV script to replace the CSV script that SnapLogic comes with. That script reads in, and keeps, the first row as a header, and reads the other rows using values from the first row. That first row could just as easily have been a thousand rows, and the join with the first row could just as easily have been processing over all the data in the snap.
The size limit depends on your system and on various things like how soon you want the result. Unfortunately, I think a lot of memory is not freed until the script finishes, though I could be wrong. Of course, you should try to limit your data requirements.
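The header-keeping pattern can be illustrated in plain Python as below. This is not the SnapLogic script API and not the actual script described here; the function name `rows_with_header` and the sample rows are hypothetical, and the point is only that state (the header) is kept across records.

```python
def rows_with_header(lines, delimiter=","):
    """Yield one dict per data row, keyed by the fields of the first row."""
    it = iter(lines)
    try:
        # Read and keep the first row; it is reused for every later row.
        header = next(it).rstrip("\n").split(delimiter)
    except StopIteration:
        return  # empty input: nothing to yield
    for line in it:
        values = line.rstrip("\n").split(delimiter)
        yield dict(zip(header, values))


# Made-up input rows for illustration.
sample = ["id,name", "1,a", "2,b"]
for row in rows_with_header(sample):
    print(row)
```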
Basically, the snap has five processing points.
And you can write the records out to whichever area you want, based on your own conditions and timing.
Steve