Forum Discussion

darshthakkar's avatar
darshthakkar
Valued Contributor
3 years ago
Solved

Converting value of input schema to distinct columns

Hi Team,

I’m reading a text file and wanted to fetch DISTINCT columns out of the data itself, screenshot below for your reference.

.txt file:

When I read the above data in snaplogic, the format is:

Tried with $.merge(), GroupByN, CSV parse (contains headers) and Aggregrate (Concat) but didn’t find any luck.

Any suggestions over here?
Thank you.

Regards,
Darsh

  • Hi,

    Use the below expression in a mapper. After that use the Splitter snap to get the details.
    

    $Systems.map(x => {“Systems” : x.substring(x.indexOf(“(”)+1, x.indexOf(“)”))})

10 Replies

  • If you can guarantee that these same 8 fields are always present, you could do a Group By N of 8, then process the array into an object. If missing values would remove the line, it becomes a bit more complex and may require a Script snap to handle the contextual parsing of the input.

    • darshthakkar's avatar
      darshthakkar
      Valued Contributor

      @koryknick - We can handle “n” no. of rows with Group Size as 0

      I will modify my 50 to 0 now as 0 will handle all the incoming rows.

  • @darshthakkar - the reason I was asking about the number of fields is that the Group By N is giving me the set of fields per object. If you increase that to consume the entire document, then creating the object would fail because you wind up with duplicate names in the object.

    But I’m glad this is working for you!

  • Desired Output:

    Name: Darsh
    City: Toronto
    Province-State: Ontario
    Population: 126723

    Country: Canada
    IPV4: cc1d3198da257e12dffff507be1faeea
    Phone: 70025416
    Total Population: 126723

    where Name, City, etc. are columns and Darsh, Toronto, etc. are values.

  • Thank you @koryknick, the solution worked.
    I happen to tweak the expression in the mapper a bit so that no extra spaces were observed:

    Yes, the text file will always have “x” fields, it wouldn’t even go to x-1 or x+1.

    Moreover, as there is no max value in a group size as per documentation, I can pretty much keep it to 100 for instance to handle the additional rows coming in the future. What are your thoughts?
    Intentionally, kept 100 and not 99999999 as I’m not anticipating anything more than 8.

    Thank you.

    Regards,
    Darsh

    • koryknick's avatar
      koryknick
      Employee

      There are a handful of snaps that collect documents into fewer, such as:

      • Group By N
      • Group By Gate
      • Gate
      • Aggregate
      • Document to Binary
      • Formatter snaps

      Each of these serve a different purpose and output differently. Some words of caution on the Document to Binary, Group By, and Gate snaps - These gather the complete input document set into an array for output. So if you have large documents and gather all (or large number) of them into a single document, you can theoretically run your JCC out of memory and cause a failure.

      With that said, with your original question, I assumed that the “set of 8” fields you had shown would be repeated in the incoming file / feed. Assuming that the feed is non-repeating, then the Gate or Group By N set at 0 is an acceptable solution. But if the fields repeat within the feed, you will get an error when a field of the same name is encountered.

      Hope that helps!

  • Thank you for sharing the additional details, @koryknick.
    Yup, my feed will only have 8 rows so all good over there but I was exploring the documentation, thought of bringing it to your notice so that you can help me understand it better.

    As always, thank you so much for the additional details and tips.