Dynamic Table Loads

Question

Trying to do ETL to move source data to target db.  I don’t want to create a pipeline for every single table.  There are hundreds of tables.
To start I want it to do this simply and assume a truncate and reload of all the tables.  I will worry about incremental loads later.
I want to run a SQL Execute snap with this code to get a list of all the tables.
SELECT
TOP 1
s.name AS srcSchema,
t.name AS srcTbl,
‘mynewschema’ AS targSchema,
t.name AS targTbl
FROM
MyDb.sys.schemas s
JOIN MyDb.sys.tables t ON
s.schema_id = t.schema_id
WHERE
s.name in (‘dbo’)
ORDER BY
s.name,t.name
I want to then run the pipeline to go through the list of tables, truncate the target, select the source data, BCP the data into the target staging area.  Once there, I have a SQL Merge routine already built that will pull in the differences to our lake and I can easily call that with a SQL Execute.
I’m struggling to find an example of this in any documentation or online.
Creating hundreds of pipelines is not going to work.

matthias · Answer

If you can get the list of tables into documents (one document per table) - it should be pretty simple to build a (parametrized) pipeline for this, calling it for each document (each table) once.
You can also parallelize that with settings on the execute pipeline snap, assuming both source and destination support this.
Your child pipeline:
start with a SQL Select snap - which has a 2nd (optional) output view returning the table schema.
in our case, we convert the schema a little (uppercase column names, remove indexes) before ingesting it  with a load snap (in our case snowflake bulk load) - which again has 2 input views - the 2nd one being for the schema), and tick the “create table” checkbox in the load snap.

Forum Discussion

Dynamic Table Loads

1 Reply

Recent Discussions

Way to lock down in Prod org to "Monitor" only access?

trace API and proxy calls

Pagination Logic Fails After Migrating from REST GET to HTTP Client Snap

Pipeline Execute Pool size

Concat values of a field based on value of another field