Salesforce Contacts - Database Replication (SQL Server-MongoDB-Oracle)
Created by @asharifian

This pattern provides an example pipeline that replicates data from Salesforce to multiple database types: SQL Server, MongoDB, and Oracle. Data replication is useful for disaster recovery and analytics, and it improves overall system resilience and reliability.

Configuration

Two approaches are covered in this pattern (both are illustrated in the sketch after the download below):

Data replication after a successful insert into a dependent database, such as data first being inserted into MongoDB and, upon success, inserted into Oracle.
Data replication in parallel, such as data being inserted into MongoDB and SQL Server at the same time.

Sources: Salesforce Contacts
Targets: Database tables for customer contact information in SQL Server, MongoDB, and Oracle
Snaps used: Salesforce Read, Mapper, Copy, MongoDB - Insert, SQL Server - Insert, Oracle - Insert

Downloads

Salesforce Contacts-Database Replication (SQL Server-MDB-Oracle).slp (14.5 KB)
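Outside of SnapLogic, the two approaches can be illustrated with a minimal Python sketch. This is not the pattern's pipeline itself: the connection settings, the sample contact payload, and the table/collection names are assumptions, using the pymongo driver plus generic DB-API connections (pyodbc for SQL Server, python-oracledb for Oracle).

# Minimal sketch of the two replication approaches, assuming pymongo plus
# DB-API connections (pyodbc for SQL Server, python-oracledb for Oracle).
# Table, collection, and column names are illustrative only.
from concurrent.futures import ThreadPoolExecutor

import oracledb          # assumption: python-oracledb driver
import pymongo
import pyodbc            # assumption: pyodbc driver for SQL Server

mongo = pymongo.MongoClient("mongodb://localhost:27017")["crm"]["contacts"]
oracle = oracledb.connect(user="app", password="secret", dsn="dbhost/ORCLPDB1")
sqlserver = pyodbc.connect("DSN=crm;UID=app;PWD=secret")

def insert_oracle(contact):
    with oracle.cursor() as cur:
        cur.execute("INSERT INTO contacts (sfdc_id, name, email) VALUES (:1, :2, :3)",
                    (contact["Id"], contact["Name"], contact["Email"]))
    oracle.commit()

def insert_sqlserver(contact):
    cur = sqlserver.cursor()
    cur.execute("INSERT INTO contacts (sfdc_id, name, email) VALUES (?, ?, ?)",
                contact["Id"], contact["Name"], contact["Email"])
    sqlserver.commit()

def replicate_sequential(contact):
    # Approach 1: insert into MongoDB first; only on success replicate to Oracle.
    mongo.insert_one(dict(contact))
    insert_oracle(contact)

def replicate_parallel(contact):
    # Approach 2: fan the record out to MongoDB and SQL Server at the same time,
    # mirroring the Copy Snap's parallel branches.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(mongo.insert_one, dict(contact)),
                   pool.submit(insert_sqlserver, contact)]
        for f in futures:
            f.result()  # re-raise any branch failure

if __name__ == "__main__":
    contact = {"Id": "0035e00000XXXXX", "Name": "Ada Lovelace", "Email": "ada@example.com"}
    replicate_sequential(contact)
    replicate_parallel(contact)

In the sequential branch the Oracle insert only runs if the MongoDB insert did not raise, which matches the "on success" dependency; the parallel branch corresponds to the Copy Snap feeding two independent Insert Snaps.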
Ingest data from a Database (Oracle) into the Hadoop Data Lake (HDFS+Hive) using Avro as the Data format

Created by @pkona

This pattern ingests all the tables present in an Oracle database into the Hadoop Data Lake. The pattern:

Automatically creates Avro schemas and Hive tables
Automatically shards large tables
Ingests data into HDFS+Hive tables

All pipeline execution is orchestrated to maximize parallel execution of the child pipelines across multiple nodes in the Snaplex.

Pipelines in this Pattern

Pattern 3 - Step 1.0 - Oracle to Hive Parent
The master pipeline:
Gets a list of all source tables.
Calls a child pipeline (Pattern 3 - Step 1.1 - Ingest - Generate Hive table Avro based on Oracle Table metadata) to create schemas and tables on Hive.
Calls a child pipeline for each table (Pattern 3 - Step 1.2 - Oracle to Hive No Sharding).
The Pipeline Execute Snap controls the number of tables processed in parallel.

Pattern 3 - Step 1.1 - Ingest - Generate Hive table Avro based on Oracle Table metadata
Creates the schemas required for the file data format.
Creates the Hive table.
Returns the folder location used by the Hive table.
Uses the pipelines Pattern 3 - Core - Write schema to HDFS and Write Hive Table DDL to HDFS.

Pattern 3 - Step 1.2 - Oracle to Hive No Sharding
Processes data for a single table.
If the source table is small, writes a single file.
If the table is large, calculates the shard size and calls a child pipeline to process each shard (a sketch of this shard calculation appears after the downloads below).
The Pipeline Execute Snap controls the number of shards processed in parallel (Pattern 3 - Step 1.2.1 Oracle to Hive - process a shard).

Pattern 3 - Step 1.2.1 Oracle to Hive - process a shard
Processes a single shard:
Queries the source database for the shard.
Writes a file for the data.
Each shard has its own file under the Hive table's folder.

Configuration

Sources: Oracle
Targets: HDFS, Hive (Avro data format)
Snaps used: Oracle, HDFS, Hive, JSON Generator

Downloads

Project of all Pipelines
Community Patterns_DW_ DB to Hadoop DL.zip (11.6 KB)

Individual Pipelines
Pattern 3 - Step 1.0 - Oracle to Hive Parent_2018_08_24.slp (14.7 KB)
Pattern 3 - Step 1.1 - Ingest - Generate Hive table Avro based on Oracle Table metadata_2018_08_24.slp (22.0 KB)
Pattern 3 - Step 1.2 - Oracle to Hive No Sharding_2018_08_24.slp (15.3 KB)
Pattern 3 - Step 1.2.1 Oracle to Hive - process a shard_2018_08_24.slp (9.8 KB)
Pattern - Core - Query Hive Table Metadata_2018_08_24.slp (5.1 KB)
Pattern 3 - Core - Write schema to HDFS_2018_08_24.slp (8.4 KB)
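The shard-size decision in Step 1.2 can be illustrated with a short Python sketch: split a large table into primary-key ranges and hand each range to a child job, which is the same idea the Pipeline Execute Snap applies to the shard child pipeline. The row threshold, the run_child_pipeline helper, and the assumption of a numeric primary key are all illustrative, not part of the downloadable pipelines.

# Minimal sketch of the sharding decision in Step 1.2, assuming the source
# table has a numeric primary key and a hypothetical run_child_pipeline()
# helper standing in for the Pipeline Execute Snap.
from concurrent.futures import ThreadPoolExecutor

ROWS_PER_SHARD = 1_000_000   # assumed threshold for "large" tables
MAX_PARALLEL_SHARDS = 8      # mirrors the Pipeline Execute pool size

def plan_shards(row_count, min_id, max_id):
    """Return (lo, hi) primary-key ranges covering the whole table."""
    if row_count <= ROWS_PER_SHARD:
        return [(min_id, max_id)]                    # small table: one file, no sharding
    shard_count = -(-row_count // ROWS_PER_SHARD)    # ceiling division
    step = -(-(max_id - min_id + 1) // shard_count)
    return [(lo, min(lo + step - 1, max_id))
            for lo in range(min_id, max_id + 1, step)]

def run_child_pipeline(table, lo, hi):
    # Placeholder for "Pattern 3 - Step 1.2.1": query the shard's ID range
    # from Oracle and write one Avro file under the Hive table's folder.
    print(f"{table}: ingest rows with id BETWEEN {lo} AND {hi}")

def ingest_table(table, row_count, min_id, max_id):
    shards = plan_shards(row_count, min_id, max_id)
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_SHARDS) as pool:
        for lo, hi in shards:
            pool.submit(run_child_pipeline, table, lo, hi)

if __name__ == "__main__":
    # Example: a 4.2M-row table split into ~1M-row ID ranges processed in parallel.
    ingest_table("ORDERS", row_count=4_200_000, min_id=1, max_id=4_200_000)

Because each shard writes its own file under the Hive table's folder, shards can run in parallel without coordinating with one another, which is what lets the child executions spread across Snaplex nodes.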
Move data from a relational database into Teradata

Created by @pavan

This pattern pipeline migrates data from a relational database, such as MySQL, to Teradata.

Configuration

Note: Sample database and table names are used within the Snaps to give you an idea of the expected values. Replace them with your own system information.

Sources: MySQL table
Targets: Teradata
Snaps used: MySQL Select, TPT Load, Teradata Execute

Downloads

4.8_TPT FLoad.slp (5.3 KB)
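For reference, the same movement can be sketched in Python with the mysql-connector-python and teradatasql drivers. This only illustrates the flow (select from MySQL, batch-insert into Teradata); the actual pattern relies on the TPT Load Snap for bulk loading, and the host names, credentials, batch size, and table names below are assumptions.

# Minimal sketch of the MySQL-to-Teradata flow, assuming the
# mysql-connector-python and teradatasql drivers. The real pattern uses the
# TPT Load Snap for bulk loading; this uses a plain batched INSERT instead.
import mysql.connector
import teradatasql

BATCH_SIZE = 5_000  # assumed batch size

src = mysql.connector.connect(host="mysql-host", user="app",
                              password="secret", database="sales")
tgt = teradatasql.connect(host="teradata-host", user="app", password="secret")

src_cur = src.cursor()
src_cur.execute("SELECT order_id, customer_id, amount FROM orders")  # illustrative table

tgt_cur = tgt.cursor()
insert_sql = "INSERT INTO sales.orders (order_id, customer_id, amount) VALUES (?, ?, ?)"

while True:
    rows = src_cur.fetchmany(BATCH_SIZE)
    if not rows:
        break
    tgt_cur.executemany(insert_sql, [list(r) for r in rows])

tgt.commit()
src.close()
tgt.close()

For large tables, the TPT Load approach used in the pattern is the better fit, since it streams data through Teradata's bulk-load utility rather than issuing row-by-row or batched inserts.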