I have an Amazon S3 bucket containing multiple files that I’d like to extract and read into a database. The files are all .GZ (gzip) files. The file names will change each day, but the files will all have the same format and contents once unzipped.
But this fails validation at the JSON parser step, with:
Failure: Cannot parse JSON data, Reason: Unable to create json parser for the given input stream, Illegal character ((CTRL-CHAR, code 31)): only regular white space (\r, \n, \t) is allowed between tokens
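One observation on that error: character code 31 is 0x1F, which is the first byte of the gzip magic number (0x1F 0x8B), so it looks like the JSON parser is being handed the still-compressed bytes rather than the unzipped text. Below is a minimal Python sketch of that idea, outside of the pipeline tool, assuming the unzipped contents are JSON. The function name `parse_json_payload` and the sample data are hypothetical, and the bytes are built in memory instead of being downloaded from S3.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def parse_json_payload(raw: bytes):
    """Parse JSON from raw bytes, decompressing first if they look gzipped."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


# Hypothetical stand-in for the bytes of one .gz object from the bucket.
sample = gzip.compress(json.dumps({"id": 1}).encode("utf-8"))

# The first byte of the compressed data is the control character in the error.
first_byte = sample[0]

# Decompressing before parsing avoids the "Illegal character" failure.
record = parse_json_payload(sample)
```

If that reading is right, the fix would be a decompression step between the S3 read and the JSON parser, whatever form that takes in the tool.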
After the S3 Reader step, I can preview the data and see the list of files that would be imported, but not the contents of the files themselves.
Any suggestions for the right way to read in the contents of several files in an S3 bucket at once? Thank you!