Move data from files (SFTP/HDFS) into Redshift tables

Contributed by @pkona


This pipeline pattern moves data from files into a cloud data warehouse such as Redshift. It demonstrates best practices for configurable pipeline arguments and for parallelized execution via the Pipeline Execute Snap, and it uses bulk-load Snaps to load data into Redshift efficiently.

Pattern 2 - Step 1.0 - files to Redshift Parent


Source: Files uploaded to SnapLogic
Target: the Pattern 2 - Step 1.1 pipeline
Snaps used: Directory Browser, Mapper, Pipeline Execute

Configuration

Update the Pipeline Parameters with the following:

  • Path to the files within SnapLogic
  • The target table schema and database type
  • The source and target account names
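The parent pipeline's flow (Directory Browser lists files, Mapper shapes each entry into child-pipeline parameters, Pipeline Execute runs a child per file in parallel) can be sketched in plain Python. This is only an analogy, not SnapLogic code; every function and parameter name below is a hypothetical stand-in for the corresponding Snap.

```python
from concurrent.futures import ThreadPoolExecutor

def browse_directory(path):
    """Directory Browser analog: list the files found under `path`.
    Hardcoded here for illustration only."""
    return [f"{path}/orders_1.csv", f"{path}/orders_2.csv"]

def map_to_parameters(file_path, schema, account):
    """Mapper analog: shape each file entry into the parameters
    the child pipeline expects."""
    return {"filePath": file_path, "targetSchema": schema, "accountName": account}

def process_file(params):
    """Pipeline Execute analog: each call stands in for one run of the
    Step 1.1 child pipeline against a single file."""
    return f"loaded {params['filePath']} into {params['targetSchema']}"

def run_parent(path, schema, account, pool_size=4):
    """Parent pipeline analog: fan out one child run per discovered file."""
    params = [map_to_parameters(f, schema, account) for f in browse_directory(path)]
    # Pipeline Execute's pool size similarly caps how many children run at once.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(process_file, params))
```

The key design point this mirrors is that the parent stays small: it only discovers work and dispatches it, so throughput scales by raising the Pipeline Execute pool size rather than by changing the child.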

Pattern 2 - Step 1.1 - Process a file and bulk load to Redshift


Source: Files located by the Pattern 2 - Step 1.0 pipeline
Target: Redshift
Snaps used: File Reader, CSV Parser, Redshift Bulk Load

Configuration

Update the Pipeline Parameters with the following:

  • Path to the files within SnapLogic
  • The target database account name and schema
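Under the hood, a bulk load into Redshift relies on the COPY command to ingest staged CSV data in one parallel operation rather than row-by-row inserts. As a rough sketch of what such a statement looks like (the schema, table, S3 path, and IAM role below are placeholders, not this pattern's actual configuration):

```python
def build_copy_statement(schema, table, s3_path, iam_role):
    """Assemble a Redshift COPY statement for a CSV file staged in S3.
    All argument values are illustrative placeholders."""
    return (
        f"COPY {schema}.{table} "
        f"FROM '{s3_path}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV IGNOREHEADER 1;"
    )
```

For example, `build_copy_statement("public", "orders", "s3://my-bucket/orders.csv", "arn:aws:iam::123456789012:role/load")` yields a single statement that Redshift executes across its slices in parallel, which is why a bulk load is far faster than streaming individual inserts.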

Downloads

Pattern 2 - Step 1.0 - files to Redshift Parent.slp (6.2 KB)
Pattern 2 - Step 1.1 - Process a file and bulk load to Redshift.slp (5.5 KB)