Forum Discussion

New Contributor

3 years ago

Solved

Ingesting multiple AWS S3 files into a database

I have an Amazon S3 bucket containing multiple files that I’d like to extract and read into a database. The files are all .gz (gzip) files. The file names will change each day, but the files will all have the same format and contents once unzipped.

I was thinking it would be like:

S3 browser → mapping → S3 reader → JSON Parser → database target

But this fails validation at the JSON Parser step, with:

Failure: Cannot parse JSON data, Reason: Unable to create json parser for the given input stream, Illegal character ((CTRL-CHAR, code 31)): only regular white space (\r, \n, \t) is allowed between tokens

After the S3 Reader step, I can preview the data and see the list of files that would be imported, but not the contents of the files themselves.

Any suggestions for the right way to read in the contents of several files in an S3 bucket at once? Thank you!

aws

file_transfer

json

4 Replies

jee
Employee
3 years ago
Hi Doug,

After the S3 Reader, you could use a Decompress snap to decompress the files from GZIP.

Then you should be able to use the JSON Parser for all JSON files.
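For context on why decompressing first matters: character code 31 in the error is 0x1F, which is the first byte of the gzip header (0x1F 0x8B), so the message is consistent with the parser receiving still-compressed bytes. Below is a minimal Python sketch of the same idea outside SnapLogic, assuming each object holds a single gzipped JSON document. The `bucket` dict and its file names are hypothetical stand-ins for the S3 objects, not anything from the original pipeline.

```python
import gzip
import json

# Hypothetical stand-in for the bucket: object name -> gzipped JSON bytes.
bucket = {
    "day1.json.gz": gzip.compress(json.dumps({"id": 1}).encode("utf-8")),
    "day2.json.gz": gzip.compress(json.dumps({"id": 2}).encode("utf-8")),
}


def parse_raw(payload: bytes):
    """Try to parse the bytes as JSON without decompressing; None on failure."""
    try:
        return json.loads(payload)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return None


def parse_gzipped(payload: bytes):
    """Decompress first, then parse the resulting JSON text."""
    return json.loads(gzip.decompress(payload))


# The first byte of any gzip stream is 0x1F, i.e. character code 31.
first_byte = next(iter(bucket.values()))[0]

# Parsing the compressed bytes directly fails for every object.
raw_results = [parse_raw(p) for p in bucket.values()]

# Decompressing before parsing yields one document per object.
records = [parse_gzipped(p) for p in bucket.values()]
```

In the pipeline, the Decompress snap plays the role of `gzip.decompress` here: it has to sit between the reader and the JSON Parser.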

Let me know if that helps.

Best,
Jee
dnewton
New Contributor
3 years ago
Thank you! Getting a little closer. So my flow now looks like:

S3 browser → mapping → S3 reader → JSON Parser → CSV Formatter → File Writer

It validates OK, but a zero-byte file gets written (I’m outputting just to the SnapLogic storage area).

If I just try to write out the files after decompressing, it too validates OK but gets Java errors at runtime.

S3 Browser → Mapper → S3 File Reader → Decompress → File Writer

Thank you for any additional tips you may have. 🙂
alchemiz
Contributor III
3 years ago
Or, if the zip file is composed of multiple JSON files, you can use the ZipFile Reader Snap:

S3 Browser → ZipFile Reader (use the output $path from S3 Browser and set it as the File) → JSON Parser → Prep Document (Mapper) → To DB

or

Directory Browser → ZipFile Reader (use the output $Path from the Directory Browser and set it as the File) → JSON Parser → Prep Document (Mapper) → To DB
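As a rough illustration of what the ZipFile Reader → JSON Parser → To DB chain does, here is a minimal Python sketch. It assumes a zip archive whose members are individual JSON documents with `id` and `name` fields. The sample archive, the `records` table, and both field names are hypothetical, and an in-memory SQLite database stands in for the real target.

```python
import io
import json
import sqlite3
import zipfile


def build_sample_zip() -> bytes:
    """Build a small in-memory zip with two JSON members (hypothetical data)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.json", json.dumps({"id": 1, "name": "alpha"}))
        zf.writestr("b.json", json.dumps({"id": 2, "name": "beta"}))
    return buf.getvalue()


def load_zip_into_db(zip_bytes: bytes, conn: sqlite3.Connection) -> int:
    """Parse every .json member of the archive and insert it as one row."""
    conn.execute("CREATE TABLE IF NOT EXISTS records (id INTEGER, name TEXT)")
    count = 0
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for member in zf.namelist():
            if not member.endswith(".json"):
                continue  # skip anything that is not a JSON document
            doc = json.loads(zf.read(member))
            conn.execute(
                "INSERT INTO records (id, name) VALUES (?, ?)",
                (doc["id"], doc["name"]),
            )
            count += 1
    conn.commit()
    return count


conn = sqlite3.connect(":memory:")
inserted = load_zip_into_db(build_sample_zip(), conn)
rows = conn.execute("SELECT id, name FROM records ORDER BY id").fetchall()
```

Each archive member is read and parsed on its own, which mirrors the reader emitting one document per file inside the zip.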
dnewton
New Contributor
3 years ago
Thank you!
