Converting value of input schema to distinct columns

darshthakkar
Valued Contributor

Hi Team,

I’m reading a text file and want to extract distinct columns from the data itself; screenshots below for your reference.

.txt file:
image

When I read the above data in SnapLogic, the format is:
image

I tried $.merge(), Group By N, CSV Parser (contains headers), and Aggregate (Concat), but had no luck.

Any suggestions over here?
Thank you.

Regards,
Darsh

1 ACCEPTED SOLUTION

koryknick
Employee

Assuming there are always 8 properties, here is an example pipeline:
Community 16940 - Converting value to distinct columns_2023_06_20.slp (6.3 KB)
Community-16940.txt (177 Bytes)
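For readers who cannot open the attached pipeline, the general idea (group every 8 incoming lines, then turn each group into one object) can be sketched in Python. This is only an illustration, not the contents of the .slp file: the `name : value` line layout, the `:` separator, the `field1`..`field8` names, and the helpers `group_by_n` and `to_record` are all assumptions made for the example.

```python
from itertools import islice

FIELDS_PER_RECORD = 8  # assumption taken from the "always 8 properties" premise


def group_by_n(items, n):
    """Yield consecutive lists of up to n items (a fixed-size grouping step)."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


def to_record(chunk, sep=":"):
    """Turn one group of 'name : value' lines into a single dict (one row)."""
    record = {}
    for line in chunk:
        name, _, value = line.partition(sep)
        # strip() drops surrounding spaces so keys and values are clean
        record[name.strip()] = value.strip()
    return record


# Hypothetical input: two records of 8 fields each, one field per line.
lines = [
    f"field{i} : value{r}-{i}"
    for r in (1, 2)
    for i in range(1, FIELDS_PER_RECORD + 1)
]

records = [to_record(chunk) for chunk in group_by_n(lines, FIELDS_PER_RECORD)]
```

Each element of `records` then has one key per field, which corresponds to one row with distinct columns.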

10 REPLIES

darshthakkar
Valued Contributor

Thank you @koryknick, the solution worked.
I happened to tweak the expression in the Mapper a bit so that no extra spaces appear:

image

Yes, the text file will always have “x” fields; it will never have x-1 or x+1.

Moreover, since the documentation does not specify a maximum group size, I could set it to 100, for instance, to handle additional rows that may come in the future. What are your thoughts?
I intentionally chose 100 rather than 99999999, as I’m not anticipating anything more than 8.

Thank you.

Regards,
Darsh

koryknick
Employee

@darshthakkar - the reason I was asking about the number of fields is that the Group By N is giving me the set of fields per object. If you increase that to consume the entire document, then creating the object would fail because you wind up with duplicate names in the object.
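The failure mode described here can be illustrated with a small Python sketch. It is not SnapLogic code: a plain Python dict would silently overwrite a repeated key, so the hypothetical helper `build_strict_record` raises explicitly to mimic the duplicate-name error. The field names and the two-record input are assumptions for the example.

```python
def build_strict_record(pairs):
    """Build one object from (name, value) pairs, rejecting repeated names."""
    record = {}
    for name, value in pairs:
        if name in record:
            raise ValueError(f"duplicate field name: {name!r}")
        record[name] = value
    return record


def chunks(seq, n):
    """Split a list into consecutive groups of at most n items."""
    return [seq[i:i + n] for i in range(0, len(seq), n)]


# Hypothetical feed: the same 8 field names repeated for two records.
pairs = [(f"field{i}", f"value{r}-{i}") for r in (1, 2) for i in range(1, 9)]

# Group size equal to the field count: one clean object per record.
ok = [build_strict_record(group) for group in chunks(pairs, 8)]

# Group size of 100: both records land in one group, so names collide.
try:
    build_strict_record(chunks(pairs, 100)[0])
    oversized_error = None
except ValueError as exc:
    oversized_error = str(exc)
```

With a group size of 8, `ok` holds two objects; with 100, the first repeated name triggers the error, which is why the group size should match the number of fields rather than act as an upper bound.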

But I’m glad this is working for you!

Got it, thank you for sharing the additional details on this.

darshthakkar
Valued Contributor

@koryknick - Is Group By N the only way to merge multiple JSON segments into one?
Can we do it with the help of a Mapper snap?

There are a handful of snaps that collect multiple documents into fewer documents, such as:

  • Group By N
  • Group By Gate
  • Gate
  • Aggregate
  • Document to Binary
  • Formatter snaps

Each of these serves a different purpose and produces different output. Some words of caution on the Document to Binary, Group By, and Gate snaps: these gather the complete input document set into an array for output. So if you have large documents and gather all (or a large number) of them into a single document, you can theoretically run your JCC out of memory and cause a failure.

That said, for your original question, I assumed that the “set of 8” fields you had shown would be repeated in the incoming file / feed. If the feed is non-repeating, then the Gate or Group By N set at 0 is an acceptable solution. But if the fields repeat within the feed, you will get an error when a field of the same name is encountered.
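The memory caution can be made concrete with a rough Python sketch that compares how many documents are held at once under a bounded group size versus gathering everything. The `grouped` helper is hypothetical, and treating a size of 0 as "collect the whole input into one group" is an assumption modeled on the "Group By N set at 0" remark in this thread, not a statement of how the snap is implemented.

```python
def grouped(stream, n):
    """Emit groups of n documents; n == 0 gathers the whole stream at once."""
    buffer = []
    for doc in stream:
        buffer.append(doc)
        if n and len(buffer) == n:
            yield buffer
            buffer = []
    if buffer:
        # Remainder for n > 0, or the entire input when n == 0.
        yield buffer


# Hypothetical input of 1000 small documents.
bounded_peak = max(len(group) for group in grouped(range(1000), 8))
gather_all_peak = max(len(group) for group in grouped(range(1000), 0))
```

With a group size of 8, at most 8 documents sit in the buffer at any time; gathering everything holds all 1000 at once, so the memory needed grows with the size of the feed.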

Hope that helps!