Forum Discussion
Example of the configuration file:
Table_1, (Col A, Col B, Col C)
Table_2, (Col D, Col E, Col A)
Table_3, All
Here, All means all the columns of Table_3, so in the select query it should be replaced with ‘*’. I have kept Col A in both Table_1 and Table_2 because some columns may be common between tables, acting as foreign keys to another table.
The output will be one file per table written to Amazon S3, named <Table_Name>_OUT.csv (Table_1_OUT.csv, Table_2_OUT.csv, and so on).
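For reference, here is a rough sketch in plain Python of how each configuration row can be turned into a SELECT statement, with All expanded to ‘*’ (the helper name and the inlined rows are only illustrative, not the actual pipeline components):

# Sketch only: turning configuration rows into SELECT statements.
# In the pipeline these rows come from the configuration file; they are
# inlined here just so the example runs on its own.
config_rows = [
    "Table_1, (Col A, Col B, Col C)",
    "Table_2, (Col D, Col E, Col A)",
    "Table_3, All",
]

def build_select(row):
    """Split 'Table, (columns)' or 'Table, All' and build the SELECT for that table."""
    table, _, cols = row.partition(",")
    cols = cols.strip().strip("()")
    column_list = "*" if cols == "All" else cols  # 'All' becomes '*'
    return table.strip(), f"SELECT {column_list} FROM {table.strip()}"

for row in config_rows:
    table, query = build_select(row)
    print(table, "->", query)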
I was doing it using one pipeline and it kind of works. The only problem I am facing is the output part: instead of creating separate files, it writes the output of every table into the same file. In the example above, if I run my pipeline with the first two tables, it writes the header and all the data for the first table correctly, but the second table's data goes into the same file without its header row, and its row values start right after the delimiters left over from the first table. Example below.
Current output: only one file, named Table_1.csv
Col A, Col B, Col C
1, Rakesh, 1140 easton avenue
,1234,4567,1
Expected output: two files, named Table_1.csv and Table_2.csv
Table_1.csv:
Col A, Col B, Col C
1, Rakesh, 1140 easton avenue
Table_2.csv:
Col D, Col E, Col A
1234,4567,1
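In plain Python terms, the per-file behaviour I am after looks roughly like this (the rows below are placeholders standing in for the SQL result sets):

import csv

# Sketch of the expected output: one file per table, each with its own header.
results = {
    "Table_1": (["Col A", "Col B", "Col C"], [["1", "Rakesh", "1140 easton avenue"]]),
    "Table_2": (["Col D", "Col E", "Col A"], [["1234", "4567", "1"]]),
}

for table, (header, rows) in results.items():
    # Opening a fresh file per table is what keeps the headers separate;
    # reusing a single open writer is what merges everything into Table_1.csv.
    with open(f"{table}.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(header)
        writer.writerows(rows)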
- mile_grujovski (New Contributor), 6 years ago
As suggested by Anubhav:
- Create one main pipeline that reads the table names and column names from the csv file
- Create one sub-pipeline that is called from the main pipeline. The child pipeline will have two parameters (tableName and columnNames). This sub-pipeline will execute the SQL for the given table and columns and write the result to csv (a rough sketch follows below)
Main pipeline: (screenshot)
Sub-pipeline: (screenshot)
File Writer: (screenshot)
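Roughly, the same parent/child split sketched in plain Python (sqlite3 and the simplified column names are only stand-ins so the example runs on its own; in the actual pipelines the child receives tableName and columnNames as pipeline parameters and the File Writer pushes each CSV to S3):

import csv
import sqlite3

# Stand-in, in-memory database so the sketch runs; the real source is whatever
# database the SQL execute step points at.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table_1 (col_a, col_b, col_c)")
conn.execute("INSERT INTO Table_1 VALUES (1, 'Rakesh', '1140 easton avenue')")

def child_pipeline(table_name, column_names):
    """Mirrors the sub-pipeline: run the SELECT for one table, write one CSV."""
    column_list = "*" if column_names == "All" else column_names
    cursor = conn.execute(f"SELECT {column_list} FROM {table_name}")
    with open(f"{table_name}_OUT.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow([col[0] for col in cursor.description])  # header row
        writer.writerows(cursor.fetchall())

def main_pipeline(config_rows):
    """Mirrors the main pipeline: read the config and call the child once per table."""
    for row in config_rows:
        table, _, cols = row.partition(",")
        child_pipeline(table.strip(), cols.strip().strip("()"))

# Config trimmed to one row so it matches the stand-in database above.
main_pipeline(["Table_1, (col_a, col_b, col_c)"])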