Forum Discussion

krupalibshah
Contributor
8 years ago

Writing a forEach

Hello,

I have a use case where my input arrives in tab-delimited form. After a groupBy N, I convert it to an array so that I can run some comparison operations on it.

I need to extract records from this array that meet the following conditions:

  1. All records from a line that starts with ‘8’ and has 3 in the 5th column, up to the next line that starts with ‘9’.
  2. All records from a line that starts with ‘8’ and has 4 in the 5th column, up to the next line that starts with ‘9’.
  3. All records from a line that starts with ‘8’ and has 5 in the 5th column, up to the next line that starts with ‘9’.
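
The three conditions above can be sketched in plain Python as a small state machine. This is only an illustration, not SnapLogic code: the function name `extract_blocks` and its `wanted` parameter are hypothetical, and it assumes the grouped input is available as a list of tab-delimited strings and that the '8' and '9' lines themselves belong in the result (as in the sample output posted in this thread).

```python
def extract_blocks(lines, wanted=("3", "4", "5")):
    """Collect every run of records from an '8' line to the next '9' line,
    grouped by the 5th column of the '8' line (hypothetical helper)."""
    blocks = {}          # 5th-column value -> list of blocks (each a list of field lists)
    current = None       # block being collected, or None when outside a wanted block
    current_key = None
    for line in lines:
        if not line.strip():
            continue     # ignore blank lines
        # Strip spaces and line endings only, so empty tab-separated fields survive.
        fields = line.strip(" \r\n").split("\t")
        record_type = fields[0]
        if record_type == "8":
            # An '8' line opens a block; keep it only if its 5th column is wanted.
            current_key = fields[4] if len(fields) > 4 else None
            current = [fields] if current_key in wanted else None
        elif current is not None:
            current.append(fields)
            if record_type == "9":
                # A '9' line closes the block that is currently open.
                blocks.setdefault(current_key, []).append(current)
                current = None
    return blocks
```

Calling `extract_blocks(lines)` returns a dictionary keyed by the 5th-column value, so `blocks.get("3", [])` would hold every block opened by an '8' line with 3 in that column. Blocks whose 5th column is not in `wanted` are skipped.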

I want to write a forEach that produces the 3 arrays described above. Once I have them, I need to check whether the 3rd-column values of array1 are present among the 4th-column values of array2, and pick only the ones that are present.
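
The comparison step might look like the following in plain Python. This is a rough sketch under assumptions: `pick_present` is a hypothetical name, each array is taken to be a list of records already split into fields, and every record in the arrays is compared (filter out the '8' and '9' lines first if only the detail lines should take part).

```python
def pick_present(array1, array2, col1=2, col2=3):
    """Return the records of array1 whose value in column col1 also appears
    in column col2 of array2. Column indexes are zero-based, so the defaults
    mean 3rd column of array1 and 4th column of array2."""
    # Build the lookup once so each membership check is a set lookup.
    lookup = {record[col2] for record in array2 if len(record) > col2}
    return [
        record
        for record in array1
        if len(record) > col1 and record[col1] in lookup
    ]
```

Using a set for the 4th-column values keeps the check at roughly one pass over each array instead of a nested loop over both.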

I would appreciate some help with getting the desired output in an optimal way.

7 Replies

  • aditya_sharma
    New Contributor III

    Can you post some sample input records and the expected output for that input?

    • krupalibshah
      Contributor

      Input:

         6	1151	10	04222016	0	0	0	4.4	10000978	6	4	UNIX			ABC	Business.4.G.0011
          8	1151	10	04222016	4	0	0	4.4	10000978	6	4	UNIX				
          4	1111111110029069	01152014	14171	898	898	500000	898	0	000	000	840	000	000	000	000	000	000	000									35928	07222014	10062014															
          4	1111111110029150	01152014	14171	000	000	1500000	000	0	000	000	840	000	000	000	000	000	000	000									1476	07172014																
          4	1111111110029440	01152014	14171	000	000	500000	000	0	000	000	840	000	000	000	000	000	000	000									48916	06172014																																																			
          9	1151	10	04222016	16	2380	0	4.4	10000978	6	4	UNIX
          8	1151	10	04222016	3	0	0	4.4	10000978	6	4	UNIX				
          4	1111111110029069	01152014	14171	898	898	500000	898	0	000	000	840	000	000	000	000	000	000	000									35928	07222014	10062014															
          4	1111111110029150	01152014	14171	000	000	1500000	000	0	000	000	840	000	000	000	000	000	000	000									1476	07172014																
          4	1111111110029440	01152014	14171	000	000	500000	000	0	000	000	840	000	000	000	000	000	000	000									48916	06172014																
          4	1111111110029580	01152014	14171	000	000	500000	000	0	000	000	840	000	000	000	000	000	000	000									147909	06102015	10062014															
          4	1111111110029630	01152014	14171	000	000	000		0	000	000	840	000	000	000	000	000	000	000																										
          4	1111111110029770	01152014	14171	107398	122210	500000	122210	0	228	000	840	000	000	000	000	000	000	000									228	06182014	07062014															
          4	1111111110029879	01152014	14171	000	000	2500000	000	0	000	000	840	000	000	000	000	000	000	000									3488	06052014																
          9	1151	10	04222016	16	2380	0	4.4	10000978	6	4	UNIX				
          8	1151	10	04222016	31	0	0	4.4	10000978	6	4	UNIX				
          9	1151	10	04222016	31	0	0	4.4	10000978	6	4	UNIX				
          8	1151	10	04222016	32	0	0	4.4	10000978	6	4	UNIX				
          9	1151	10	04222016	32	0	0	4.4	10000978	6	4	UNIX				
          7	1151	10	04222016	0	141182	122410553	4.4	10000978	6	4	UNIX				Business.4.G.0011
      

      Output (2 arrays):

      8	1151	10	04222016	4	0	0	4.4	10000978	6	4	UNIX				
      4	1111111110029069	01152014	14171	898	898	500000	898	0	000	000	840	000	000	000	000	000	000	000									35928	07222014	10062014															
      4	1111111110029150	01152014	14171	000	000	1500000	000	0	000	000	840	000	000	000	000	000	000	000									1476	07172014																
      4	1111111110029440	01152014	14171	000	000	500000	000	0	000	000	840	000	000	000	000	000	000	000									48916	06172014																																																			
      9	1151	10	04222016	16	2380	0	4.4	10000978	6	4	UNIX
      

      And

      8	1151	10	04222016	3	0	0	4.4	10000978	6	4	UNIX				
      4	1111111110029069	01152014	14171	898	898	500000	898	0	000	000	840	000	000	000	000	000	000	000									35928	07222014	10062014															
      4	1111111110029150	01152014	14171	000	000	1500000	000	0	000	000	840	000	000	000	000	000	000	000									1476	07172014																
      4	1111111110029440	01152014	14171	000	000	500000	000	0	000	000	840	000	000	000	000	000	000	000									48916	06172014																
      4	1111111110029580	01152014	14171	000	000	500000	000	0	000	000	840	000	000	000	000	000	000	000									147909	06102015	10062014															
      4	1111111110029630	01152014	14171	000	000	000		0	000	000	840	000	000	000	000	000	000	000																										
      4	1111111110029770	01152014	14171	107398	122210	500000	122210	0	228	000	840	000	000	000	000	000	000	000									228	06182014	07062014															
      4	1111111110029879	01152014	14171	000	000	2500000	000	0	000	000	840	000	000	000	000	000	000	000									3488	06052014																
      9	1151	10	04222016	16	2380	0	4.4	10000978	6	4	UNIX
      
      • aditya_sharma
        New Contributor III

        Thanks, krupalibshah, for sharing the sample data. How big is your input file? I ask because I am not sure this is easily doable in SnapLogic: when you build the arrays, you also need to keep track of the previous record so that you know which array the current record belongs in. I considered whether a script task could do this for you, but it will not, because a script task works on a single record. However, if your input file is not that big, you can easily do this in any programming or scripting language, such as Python.
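
For the standalone Python route, one way to keep track of state across records is a generator that reads the file lazily and yields each block as soon as its '9' line arrives, so only one block is held in memory at a time. This is a minimal sketch under assumptions: `iter_blocks`, `path`, and `wanted` are hypothetical names, and the file is assumed to be plain tab-delimited text without quoted fields.

```python
import csv

def iter_blocks(path, wanted=frozenset({"3", "4", "5"})):
    """Yield (key, block) pairs, where block is the list of rows from an
    '8' line to the next '9' line and key is the 5th column of the '8' line."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        block, key = None, None   # state carried from one record to the next
        for row in reader:
            if not row:
                continue          # skip empty lines
            record_type = row[0].strip()
            if record_type == "8":
                key = row[4] if len(row) > 4 else None
                block = [row] if key in wanted else None
            elif block is not None:
                block.append(row)
                if record_type == "9":
                    yield key, block
                    block = None
```

A caller could loop with `for key, block in iter_blocks("input.tsv"):` (the file name here is only a placeholder) and route each block to the right array by `key`.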

  • Suvigya
    New Contributor

    Hi Aditya,

    Can you please let us know the approximate permissible size limit? We currently have little clarity on this.

    Thanks
    Suvigya.

    • aditya_sharma
      New Contributor III

      Hi Suvigya,

      As I mentioned in my previous post, I am not sure how we can do this easily in SnapLogic; maybe someone with more expertise will have a good idea. I asked about size because if the file is not that big, say a few MB (the limit depends on the machine processing it), you can write a simple program in any programming language to get the desired output. I would prefer Python, because it lets you do complex things in a few lines of code, and implementing your requirement in it is pretty easy.

      Thanks
      Aditya

    • aditya_sharma
      New Contributor III

      Hi Suvigya,

      I just read your other post about combining multiple documents into one. Although you mentioned in this post that you are using GroupByN, I didn’t know that GroupByN can combine all the documents into one. So if you use a script snap after the GroupByN snap, you can implement your logic in Java, Python, or JavaScript.

      Thanks
      Aditya