03-30-2018 01:18 AM
Hi
I have multiple files, each with a different structure. My pipeline should read each file, identify the column names, column lengths, and data types, and create the corresponding table on a given DB connection. Once the table is created, it should be loaded with the data present in the incoming file.
How can I achieve this? Please help.
03-30-2018 10:17 AM
What are the file formats (e.g. CSV, JSON, XML, …)?
Do you need to do any transformations to the values before insertion?
The DB Insert Snaps (e.g. MySQL Insert) have an option to create the table if it is not present. I think the table creation is based on the values in the first record to be inserted, so string properties become VARCHAR columns in the DB, numbers become numeric columns, and so on. Is that sufficient, or are you looking for more?
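To illustrate what that first-record-based auto-create roughly amounts to (the table and column names below are just examples, and the exact types and default lengths depend on the target database and Snap):

    -- Hypothetical first incoming record: { "id": 101, "name": "Alice" }
    -- An auto-create based on that record would produce DDL along these lines:
    CREATE TABLE customers (
        id   INTEGER,       -- numeric value in the first record maps to a number column
        name VARCHAR(255)   -- string value maps to a VARCHAR; the default length varies by database
    );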
03-31-2018 10:00 PM
Auto-creating a table with the appropriate data types and field lengths dynamically isn't something you can easily do today. My experience with Redshift so far is that the Snap defaults all fields to VARCHAR with a maximum length of 32000, which isn't ideal. What you can do is run a dynamic MAX(LENGTH(field)) via SQL to obtain the length each field should be, then port the data from the original table to the ideal one.
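As a rough sketch of that workaround (the table and column names below are hypothetical; LEN is the Redshift length function, other databases use LENGTH or CHAR_LENGTH):

    -- 1. Measure the real maximum width of each oversized VARCHAR(32000) column
    SELECT MAX(LEN(first_name)) AS first_name_len,
           MAX(LEN(city))       AS city_len
    FROM customers_staging;

    -- 2. Create the ideal table using the lengths found above
    CREATE TABLE customers (
        first_name VARCHAR(40),
        city       VARCHAR(60)
    );

    -- 3. Port the data from the original table to the ideal one
    INSERT INTO customers
    SELECT first_name, city
    FROM customers_staging;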