PSAmmirata
Employee
5 years ago

Efficient pipeline design for reading a large amount of data from an on-premises RDBMS (JDBC) and writing it as multiple files in S3

We have a use case where we need to read a large amount of data from an on-premises RDBMS (JDBC) and write it as multiple files in S3. No matter what, the data has to travel across the network from our on-premises data center to AWS. I’m looking for suggestions on the most efficient pipeline design.

Options we’ve considered (a sketch of the underlying data movement follows the list):

  1. A single pipeline that runs on an on-premises Snaplex, reads the data from the RDBMS, and writes it to S3.
  2. A single pipeline that runs on an AWS Snaplex, reads the data from the RDBMS, and writes it to S3.
  3. A parent/child design where the parent pipeline runs on an on-premises Snaplex and reads the data from the RDBMS, then uses a Pipeline Execute Snap to run a child pipeline on the AWS Snaplex that writes the data to S3. The data passes over the network to an unconnected input view in the child pipeline.
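
For context, here is a minimal Java sketch (outside SnapLogic) of the data movement that every option above ultimately has to perform: stream rows from the RDBMS over JDBC and write them out as multiple S3 objects. The JDBC URL, credentials, table, bucket name, and the 50,000-rows-per-file batch size are all placeholder assumptions, not part of our actual setup.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class RdbmsToS3 {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; assumes the vendor's JDBC driver is on the classpath.
        String jdbcUrl = "jdbc:oracle:thin:@onprem-db:1521/ORCL";
        String bucket = "my-target-bucket";   // hypothetical bucket
        int rowsPerFile = 50_000;             // assumed rows per S3 object

        try (Connection conn = DriverManager.getConnection(jdbcUrl, "user", "password");
             Statement stmt = conn.createStatement();
             S3Client s3 = S3Client.create()) {

            // Hint to the driver to stream results instead of buffering the whole table.
            stmt.setFetchSize(10_000);

            try (ResultSet rs = stmt.executeQuery("SELECT id, payload FROM big_table")) {
                StringBuilder buffer = new StringBuilder();
                int rowCount = 0;
                int fileIndex = 0;

                while (rs.next()) {
                    buffer.append(rs.getLong("id")).append(',')
                          .append(rs.getString("payload")).append('\n');
                    if (++rowCount % rowsPerFile == 0) {
                        flush(s3, bucket, fileIndex++, buffer);
                    }
                }
                if (buffer.length() > 0) {
                    flush(s3, bucket, fileIndex, buffer); // final partial file
                }
            }
        }
    }

    // Write one buffered chunk of rows as its own S3 object, then reset the buffer.
    private static void flush(S3Client s3, String bucket, int fileIndex, StringBuilder buffer) {
        PutObjectRequest request = PutObjectRequest.builder()
                .bucket(bucket)
                .key(String.format("export/part-%05d.csv", fileIndex))
                .build();
        s3.putObject(request, RequestBody.fromString(buffer.toString(), StandardCharsets.UTF_8));
        buffer.setLength(0);
    }
}
```

However the Snaps handle it internally, the efficiency question comes down to where this loop runs and how many network hops the row stream makes before it lands in S3.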