Forum Discussion

Valued Contributor

3 years ago

Solved

Converting value of input schema to distinct columns

Hi Team, I’m reading a text file and want to fetch DISTINCT columns out of the data itself. When I read the data in SnapLogic, the f...

aditya_gupta41

4 years ago

j.angelevski:

{}.extend(...$data.entries().map(val =>
	match typeof val[1] {
		"object" => {
			[val[0]]: val[1].mapValues((v, k) => k.contains("_DATE") && v == null ? "0000-00-00" : v)
		},
		"array" => {
			[val[0]]: val[1].map(v => {}.extend(...v.entries().map(date => date[0].contains("_DATE") && date[1] == null ? {[date[0]]: "0000-00-00"} : {[date[0]]: date[1]})))
		}
	}
))

This works. Thanks for your help @j.angelevski
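
For readers less familiar with the expression language, the logic of the expression above can be sketched in Python. This is only an illustration, not the SnapLogic implementation: the function names and the sample document are hypothetical, and copying other top-level value types through unchanged is an assumption (the original `match` has no branch for them).

```python
from typing import Any, Dict

PLACEHOLDER = "0000-00-00"  # same placeholder value the expression uses


def fill_dates(record: Dict[str, Any]) -> Dict[str, Any]:
    """Replace null values under keys containing "_DATE" with the placeholder."""
    return {
        key: PLACEHOLDER if "_DATE" in key and value is None else value
        for key, value in record.items()
    }


def fill_document(data: Dict[str, Any]) -> Dict[str, Any]:
    """Apply fill_dates to each nested object, or to each object in a nested list."""
    result: Dict[str, Any] = {}
    for name, value in data.items():
        if isinstance(value, dict):
            result[name] = fill_dates(value)
        elif isinstance(value, list):
            result[name] = [fill_dates(item) for item in value]
        else:
            # Assumption: values of other types are copied through unchanged.
            result[name] = value
    return result


# Hypothetical input document, shaped like the data the expression expects.
sample = {
    "header": {"ORDER_DATE": None, "ID": 7},
    "lines": [
        {"SHIP_DATE": None, "QTY": 2},
        {"SHIP_DATE": "2021-05-01", "QTY": 1},
    ],
}
filled = fill_document(sample)
```

Only null values under keys containing `_DATE` are replaced; existing dates and all other fields stay as they are.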

darshthakkar

Valued Contributor

3 years ago

Thank you @koryknick, the solution worked.
I tweaked the expression in the Mapper slightly so that no extra spaces appear:

Yes, the text file will always have “x” fields; it will never have x-1 or x+1.

Moreover, since the documentation does not specify a maximum group size, I can set it to 100, for instance, to handle additional rows that may arrive in the future. What are your thoughts?
I intentionally chose 100 rather than 99999999, as I’m not anticipating more than 8 rows.

Thank you.

Regards,
Darsh
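
The group-size reasoning in the post above can be illustrated with a small Python sketch. It assumes the grouping step behaves like plain fixed-size chunking, where a final, smaller group is still emitted; that is an assumption for illustration, not a statement about the snap's internals. The function name and sample rows are hypothetical.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def group_by_n(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the remaining items as a final, smaller group.
        yield batch


rows = [f"row{i}" for i in range(1, 9)]  # 8 hypothetical input rows
groups = list(group_by_n(rows, 100))     # group size set well above the expected 8
```

Under that assumption, a group size of 100 with 8 incoming rows yields a single group of 8, so the larger value only acts as headroom.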

bojanvelevski
Valued Contributor
3 years ago
Hello @aditya.gupta41,

Are those batches managed by the Pipeline Execute snap, or by a Group By snap?
If you’re already using a Group By snap, you can simply add a parameter in the Pipeline Execute snap with the snap.in.totalCount expression. It counts the incoming batches, and the value is passed on to the child pipeline.

Then simply construct the filename in the File Writer by adding the parameter:

"TEST_" + _parameter + ".txt"
- aditya_gupta41
  Contributor
  3 years ago
  Hello,
  
  The batches are managed by Pipeline Execute.
  - bojanvelevski
    Valued Contributor
    3 years ago
    That complicates things a little bit. You can keep track of processed files in a tracking file so that the next execution knows which number to use, but that will significantly slow down your pipeline 🙂
    
    If there are no concurrent executions of the pipeline, you can write the last used sequence number to a file and load that file as an expression library on the child pipeline itself, which makes the next value easy to compute:
    
    Example expression in the File Writer (assuming the stored sequence is a number; the parentheses make the addition happen before the string concatenation): "TEST_" + (lib.config.sequence + 1) + ".txt"
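
The sequence-file approach described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not SnapLogic code: the state file name, its JSON layout, and the function name are hypothetical, and like the suggestion in the reply it is not safe for concurrent executions.

```python
import json
import tempfile
from pathlib import Path


def next_filename(state_file: Path, prefix: str = "TEST_", suffix: str = ".txt") -> str:
    """Read the last used sequence number, increment it, persist it, and build the file name."""
    last = 0
    if state_file.exists():
        last = json.loads(state_file.read_text())["sequence"]
    current = last + 1
    # Persist the new value so the next execution continues from it.
    state_file.write_text(json.dumps({"sequence": current}))
    # Do the arithmetic first, then convert to text for concatenation.
    return prefix + str(current) + suffix


with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "config.json"  # hypothetical state file
    names = [next_filename(state) for _ in range(3)]
```

Each call performs a file read and a file write, which is the overhead the reply warns about.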
