02-28-2017 07:56 AM
Sharding a table is a type of partitioning that separates very large table the into smaller parts for loading data in parallel into target system.
For Performance improvement the table was partitioned into multiple parts with Shard Offset snap and trigger parallel jobs for fast data transfer
1- Drag the SQL server execute snap on to the canvas and get the count of rows for a particular table.
2- Drag the Offset Snap and make changes to the Shard count setting. Step1 gets the total count of rows in a table and does a split by the Shard count. The O/P of the snap gives offset and the limit that are used to Query the DB table to fetch the rows.
3 - Use Pipeline Execute/For-each Snaps to send the offset and limit parameters to the child pipeline that queries the table for the data.
4 - In the Child Pipeline have the SQL Server execute snap query the table by taking the offset and limit as parameters.
These kind of scenarios mostly come up when you are loading huge volume of data into Target Systems.