File Size Differences

Hi,
Why do the File Reader and File Writer sizes differ in a pipeline?
I built a simple pipeline to compare the size of the file my reader reads with the size of the file my writer produces, and I observed that the two sizes do not match.
Could anyone please explain why?

@sshaikh: Is it a huge difference? Are you putting any transformations in your mapper snap?

The input file is 945 bytes in the File Reader and the output in the File Writer is 574 bytes. Yes, I did use a transformation in the Mapper snap, but I mapped the whole input schema straight to the output without making any changes to the data.

That is expected, right? XML format uses nodes, e.g.

<root><val>the quick brown fox jumps over the lazy dog</val></root>

while JSON format uses key/value pairs, e.g. {"root": {"val": "the quick brown fox jumps over the lazy dog"} }

Or am I missing something?

Thanks,
EmEm

Why do you expect the size to be the same? You’re changing the format from XML to JSON. JSON is less verbose than XML, so if it contains the same data, I would expect it to be smaller.
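
To put numbers on that, here is a minimal sketch in plain Python (not SnapLogic itself) that measures the UTF-8 byte size of the XML and JSON samples quoted earlier in the thread. The variable names are only for illustration.

```python
# The same data written as XML and as JSON (samples taken from the thread).
xml_text = "<root><val>the quick brown fox jumps over the lazy dog</val></root>"
json_text = '{"root": {"val": "the quick brown fox jumps over the lazy dog"}}'

# File size is the number of bytes, so encode before measuring.
xml_size = len(xml_text.encode("utf-8"))
json_size = len(json_text.encode("utf-8"))

print("XML bytes: ", xml_size)
print("JSON bytes:", json_size)
```

XML repeats every element name in a closing tag, so the gap tends to grow as the number of fields grows.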


@ptaylor
File Reader (JSON file) >> JSON Parser >> JSON Formatter >> File Writer
If I read and write the file using the above pipeline, I get a smaller file at the end.
Why is there a difference in file size?
Is the difference due to the conversion from document to binary and vice versa?

@alchemiz, can you please tell me why there is a difference in file size with the pipeline below, where I am reading and writing a JSON file?
File Reader (JSON file) >> JSON Parser >> JSON Formatter >> File Writer

Hi @sshaikh,

Good day. Check whether your original JSON file is indented, i.e. written in pretty-print mode. That might be the reason for the difference in file size.

For a better understanding, open both files and compare their content. By comparing the files you will see why there’s a difference in file size.

Thanks,
EmEm
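
One way to do that comparison outside SnapLogic is sketched below in plain Python. It checks whether two JSON files hold the same data once whitespace is ignored, and reports their sizes on disk. The function names are hypothetical helpers, and the sketch assumes both files are valid UTF-8 JSON.

```python
import json
from pathlib import Path


def same_json_content(path_a, path_b):
    """Return True if both files parse to the same JSON data.

    Whitespace and indentation between tokens are ignored by the parser,
    so a pretty-printed file and a single-line file can compare equal.
    """
    data_a = json.loads(Path(path_a).read_text(encoding="utf-8"))
    data_b = json.loads(Path(path_b).read_text(encoding="utf-8"))
    return data_a == data_b


def size_in_bytes(path):
    """Return the size of the file on disk, in bytes."""
    return Path(path).stat().st_size
```

If `same_json_content` returns True while `size_in_bytes` differs for the two files, the difference is formatting only, not data.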

OK, so the use case you’re asking about has now changed: the XML is gone and it is JSON only?

I agree with @alchemiz. I’m guessing the bigger one is pretty-printed and the smaller one isn’t. Try checking the pretty print option on the Formatter and see how it compares. Even then, there are many ways to pretty print JSON so it probably won’t match exactly, but it will be closer.
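
The effect is easy to reproduce outside the pipeline. The sketch below uses Python's standard `json` module as a stand-in (it is not the SnapLogic JSON Formatter, and the sample record is just the one from earlier in the thread) to serialize the same data in compact and pretty-printed form.

```python
import json

# Sample data; any JSON document shows the same effect.
record = {"root": {"val": "the quick brown fox jumps over the lazy dog"}}

# Compact form: one line, no spaces after separators.
compact = json.dumps(record, separators=(",", ":"))

# Pretty-printed form: newlines plus two-space indentation.
pretty = json.dumps(record, indent=2)

print("compact bytes:", len(compact.encode("utf-8")))
print("pretty bytes: ", len(pretty.encode("utf-8")))

# Both strings decode to exactly the same data.
print("same data:", json.loads(compact) == json.loads(pretty))
```

Changing the indent width or using tabs changes the byte count again, which is why a pretty-printed output can be close to the input size without matching it exactly.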

Hi @alchemiz
Good day. I compared the format of the input and output files. The input is a properly structured (indented) JSON file, but the output JSON is all on a single line. I think the whitespace used by the indented format is dropped in the output, so its size is reduced.



Thanks @alchemiz.
Is there any way in SnapLogic to get the output structured the same way as the input?

Yes @ptaylor.
I am now testing with JSON input and output files and comparing the structure of both. I think the input used more whitespace than the output, so the size is reduced, as shown in the screenshot I shared with @alchemiz.
Am I right about it?

Thanks,
Sameer

Yes. Now try enabling Pretty-print in the Formatter.

@ptaylor Thank you.
With Pretty-print enabled, the output has the same format style as the input.