Contributions

Re: SnapLogic Integration with Microsoft Azure

Here is the link to the current Microsoft documentation on installing the SQL Server command-line tools on Linux. It's pretty straightforward.

docs.microsoft.com: Install SQL Server command-line tools on Linux - SQL Server
Learn how to install the SQL Server command-line tools, Microsoft ODBC drivers, and their dependencies on Linux.

These tools can be installed on any SnapLogic Groundplex.

Re: Reference Implementation - Integration Pattern - On-Prem Database to Cloud Datawarehouse

The Oracle extract uses limit/offset to help with sharding the data. If you plan to skip sharding, you can easily convert the existing pattern to use MSSQL as the source and Snowflake as the target… Set sharding to no in the pipeline parameters on the master pipeline. Here is an example of a pattern that might also help; it gets files from S3 and inserts them into Snowflake.

Move data from AWS S3 to Snowflake Data Warehouse
Data Warehousing Patterns
Created by @pkoppishetty

The pipeline pattern moves data from AWS S3 to Snowflake Data Warehouse. The pattern contains a Directory Browser (to retrieve the file list from S3), Snowflake Execute, and Snowflake Bulkload Snaps (to load the corresponding files into Snowflake).

Configuration
- Pipeline design: S3 to Snowflake
- File Reader Preview (data in S3)
- Snowflake Bulkload Output
- Table Data from Snowflake Select Snap…

Re: Creating APIs with SnapLogic Pipelines and Ultra Tasks

Do we have sample pipelines to demonstrate "2"?

Re: SnapLogic Integration with Microsoft Azure

The Snap requires a "BCP absolute path" field, which takes the absolute path of the BCP utility program in the JCC's file system. If left empty, the Snap will look for it in the JCC's PATH environment variable.

Change the case of all attributes in the document to upper or lower case
Created by @pkona

This pipeline pattern uses the Script Snap to change the case of all attributes in the document to upper or lower case (a minimal sketch of the key-renaming logic appears below, after the next pattern).

Configuration
Sources: JSON
Targets: JSON
Snaps used: JSON Generator, Script

Downloads
Pattern - Change case of attributes in document.slp (11.6 KB)

Call a command line utility on the Snaplex node
Created by @pkona

This pipeline pattern uses the Script Snap (here labeled Execute Script Snap) to call a command line utility on the Snaplex node, as sketched below. The pipeline calls any allowed shell command on the Snaplex node, and the command is executed as the Snap user that runs the JCC process. In this sample the command can be configured in the Mapper Snap.

Configuration
Sources: Python Script
Targets: JSON File
Snaps used: Mapper, Script, JSON Formatter, File Writer

Downloads
Pattern to call-shell-command using Script Snap on Snaplex node.slp (6.8 KB)
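The pattern itself does this inside the Script Snap; purely as an illustration of the call-and-capture step, here is a minimal plain-Python sketch. The command string and the use of subprocess are illustrative assumptions, not the exact script shipped in the .slp:

```python
import json
import subprocess

def run_shell_command(command):
    """Run a shell command and capture its output as a document-like dict.

    The command runs as the OS user that owns this process, which mirrors how
    the Script Snap runs commands as the user running the JCC process.
    """
    result = subprocess.run(
        command,
        shell=True,          # the pattern likewise passes a full shell command string
        capture_output=True,
        text=True,
    )
    return {
        "command": command,
        "exit_code": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }

if __name__ == "__main__":
    # Placeholder command; in the pattern the command is supplied by the Mapper Snap.
    doc = run_shell_command("ls -l /tmp")
    # The pattern then sends the document through a JSON Formatter and File Writer.
    print(json.dumps(doc, indent=2))
```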
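For the change-case pattern above, the core key-renaming logic is simple enough to show outside the Snap as well. A minimal sketch, assuming the document is ordinary nested JSON and the target case is chosen by a flag; it is not the script contained in the downloadable pipeline:

```python
def change_key_case(value, to_upper=True):
    """Recursively rename all attribute keys in a document to upper or lower case."""
    if isinstance(value, dict):
        return {
            (key.upper() if to_upper else key.lower()): change_key_case(item, to_upper)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [change_key_case(item, to_upper) for item in value]
    return value  # scalar values are left unchanged

# Example document such as the JSON Generator in the pattern might produce.
doc = {"firstName": "Ada", "address": {"city": "London", "zipCodes": ["N1", "N2"]}}
print(change_key_case(doc, to_upper=True))
# {'FIRSTNAME': 'Ada', 'ADDRESS': {'CITY': 'London', 'ZIPCODES': ['N1', 'N2']}}
```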
Ingest data from Database (Oracle) into the Hadoop Data Lake (HDFS+Hive) using AVRO as the data format
Created by @pkona

This pattern ingests all the tables present in an Oracle database into the Hadoop Data Lake. The pattern:
- Automatically creates Avro schemas and Hive tables
- Automatically shards large tables
- Ingests data into HDFS+Hive tables
All pipeline execution is arranged to maximize parallel execution of the child pipelines across multiple nodes in the Snaplex.

Pipelines in this pattern

Pattern 3 - Step 1.0 - Oracle to Hive Parent
The master pipeline:
- Gets a list of all source tables.
- Calls a child pipeline (Pattern 3 - Step 1.1 - Ingest - Generate Hive table Avro based on Oracle Table metadata) to create schemas and tables on Hive.
- Calls a child pipeline for each table (Pattern 3 - Step 1.2 - Oracle to Hive No Sharding).
- The Pipeline Execute Snap controls the number of tables processed in parallel.

Pattern 3 - Step 1.1 - Ingest - Generate Hive table Avro based on Oracle Table metadata
- Creates the schemas required for the file data format
- Creates the Hive table
- Returns the folder location used by the Hive table
- Uses the pipelines Pattern 3 - Core - Write schema to HDFS and Write Hive Table DDL to HDFS

Pattern 3 - Step 1.2 - Oracle to Hive No Sharding
- Processes data for a single table
- If the source table is small, writes a single file
- If the table is large, calculates the shard size and calls a child pipeline to process each shard
- The Pipeline Execute Snap controls the number of shards processed in parallel (Pattern 3 - Step 1.2.1 Oracle to Hive - process a shard)

Pattern 3 - Step 1.2.1 Oracle to Hive - process a shard
- Processes a shard: queries the source database for the shard and writes a file for the data
- Each shard has its own file under the Hive table's folder

Configuration
Sources: Oracle
Targets: HDFS, Hive, AVRO
Snaps used: Oracle, HDFS, Hive, JSON Generator

Downloads
Project of all pipelines: Community Patterns_DW_ DB to Hadoop DL.zip (11.6 KB)
Individual pipelines:
- Pattern 3 - Step 1.0 - Oracle to Hive Parent_2018_08_24.slp (14.7 KB)
- Pattern 3 - Step 1.1 - Ingest - Generate Hive table Avro based on Oracle Table metadata_2018_08_24.slp (22.0 KB)
- Pattern 3 - Step 1.2 - Oracle to Hive No Sharding_2018_08_24.slp (15.3 KB)
- Pattern 3 - Step 1.2.1 Oracle to Hive - process a shard_2018_08_24.slp (9.8 KB)
- Pattern - Core - Query Hive Table Metadata_2018_08_24.slp (5.1 KB)
- Pattern 3 - Core - Write schema to HDFS_2018_08_24.slp (8.4 KB)

Determine changes between two data files
Created by @pkona

This pipeline pattern determines the changes between two data files: inserts, updates, deletes, no-changes, and older-records-for-updates. It is a standard mode pipeline pattern that illustrates how to find these changes and separate them into individual files (a minimal sketch of the change-detection idea follows this pattern). The pattern is best suited to small and medium files (fewer than 50 million records). For larger files (over 5 GB or 100 million records), consider using Spark mode pipelines for optimal performance.

Configuration
Velocity templates help generate code, schemas, and text.
Sources: Files
Targets: Files
Snaps used: CSV Generator, JSON Generator, Head, CSV Formatter, CSV Parser, Mapper, Copy, Script, JSON Formatter, File Writer

Downloads
Generate_csvSchema4Spark.slp (26.7 KB)
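The pattern itself does this with Snaps; purely as an illustration of the change-detection idea, here is a minimal plain-Python sketch. It assumes each record carries a unique key and a comparable last-modified value (the field names "id" and "updated_at" are placeholders, not fields required by the pattern):

```python
def detect_changes(old_records, new_records, key="id", modified="updated_at"):
    """Classify records into inserts, updates, deletes, no-changes,
    and older-records-for-updates.

    old_records / new_records: lists of dicts keyed by a unique identifier.
    A record only in new_records is an insert; only in old_records, a delete;
    in both with a newer modified value, an update (the old row becomes an
    older-record-for-update); otherwise a no-change.
    """
    old_by_key = {r[key]: r for r in old_records}
    new_by_key = {r[key]: r for r in new_records}

    changes = {"inserts": [], "updates": [], "deletes": [],
               "no_changes": [], "older_records_for_updates": []}

    for k, new in new_by_key.items():
        old = old_by_key.get(k)
        if old is None:
            changes["inserts"].append(new)
        elif new[modified] > old[modified]:
            changes["updates"].append(new)
            changes["older_records_for_updates"].append(old)
        else:
            changes["no_changes"].append(new)

    for k, old in old_by_key.items():
        if k not in new_by_key:
            changes["deletes"].append(old)

    return changes
```

In the pipeline, the same classification is performed with Snaps and each of the five outputs is written to its own file.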
Generate Avro Schema and DDL (to create Hive Table) based on Oracle Table Metadata
Created by @pkona

Integrations that ingest data into Hive require creating tables on Hive and creating the schema for data formats like Avro. This pattern generates the Avro schema by querying the table metadata for an Oracle table, and it also generates the create table statement for Hive. Both the schema file and the DDL file are written to HDFS and to the SnapLogic database, and returned as output of the pipeline (a rough sketch of the schema-generation idea appears at the end of this page).

Configuration
Sources: Oracle Table
Targets: Generated Hive table create statement and Avro schema
Snaps used: JSON Generator, Oracle Execute, Hive Execute, Mapper, Copy, Join, JSON Formatter, File Writer, Pipeline Execute

Downloads
Pattern 3 - Step 1.1 - Ingest - Generate Hive table Avro based on Oracle Table metadata.slp (22.0 KB)
Pattern 3 - Core - Write schema to HDFS.slp (8.2 KB) (child pipeline called by the ingest pipeline)

Ingest a large number of small files as a sequence file and store file metadata relating to each file
Created by @pkona

This pipeline pattern ingests a large number of small files as a sequence file and stores file metadata relating to each file. If you have many small files on SFTP/S3/HDFS/Kafka, you may want to store these files and their metadata in a larger file for more efficient storage. You can use this pattern when you have multiple small files that each relate to a certain event, such as a few hundred small files generated in an hour. The pattern retrieves the files, extracts the metadata, and ingests them as a single sequence file that contains the zipped source files and their file metadata. The files can then be written to any target location supported by the File Writer Snap, including S3, WASB, ADL, HDFS, SFTP, and other protocols.

Configuration
Sources: file
Targets: sequence file
Snaps used: Directory Browser, File Reader, Compress, Binary to Document, Mapper, Sequence Formatter, HDFS Writer

Downloads
IngestPattern - Ingest multiple small files as SequenceFile.slp (11.4 KB)
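For the sequence-file pattern above, the gather-compress-and-record-metadata step can be illustrated with a small plain-Python sketch. The folder path and metadata field names are placeholders, and the actual sequence-file formatting is handled by the Sequence Formatter Snap, which is not reproduced here:

```python
import gzip
import time
from pathlib import Path

def collect_small_files(folder):
    """Read each small file, gzip its contents, and capture per-file metadata.

    This mirrors the Directory Browser -> File Reader -> Compress -> Mapper part
    of the pattern; the resulting records would then be handed to a sequence-file
    formatter and written to the target store.
    """
    records = []
    for path in Path(folder).iterdir():
        if not path.is_file():
            continue
        data = path.read_bytes()
        stat = path.stat()
        records.append({
            "key": path.name,
            "metadata": {
                "source_path": str(path),
                "size_bytes": stat.st_size,
                "modified": time.strftime(
                    "%Y-%m-%dT%H:%M:%S", time.gmtime(stat.st_mtime)),
            },
            "payload": gzip.compress(data),  # zipped source file contents
        })
    return records

if __name__ == "__main__":
    # Placeholder folder; in the pattern the source location is configured
    # on the Directory Browser Snap.
    for record in collect_small_files("/tmp/small-files"):
        print(record["key"], record["metadata"])
```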
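Going back to the Avro schema pattern above, the general idea of deriving an Avro schema and a Hive DDL statement from table metadata can be sketched as follows. The Oracle-to-Avro type mapping, the sample table, and the HDFS location are illustrative assumptions, not the mapping used by the downloadable pipeline:

```python
import json

# A rough Oracle-to-Avro type mapping; the pattern's own mapping may differ.
ORACLE_TO_AVRO = {
    "VARCHAR2": "string",
    "CHAR": "string",
    "CLOB": "string",
    "NUMBER": "double",
    "DATE": "string",
    "TIMESTAMP": "string",
}

def build_avro_schema(table, columns):
    """columns: list of (column_name, oracle_type) pairs, e.g. from ALL_TAB_COLUMNS."""
    return {
        "type": "record",
        "name": table,
        "fields": [
            {"name": name, "type": ["null", ORACLE_TO_AVRO.get(oracle_type, "string")]}
            for name, oracle_type in columns
        ],
    }

def build_hive_ddl(db, table, location):
    """CREATE TABLE for an Avro-backed external Hive table; the column list is
    taken from the Avro schema referenced by avro.schema.url."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {db}.{table}\n"
        "STORED AS AVRO\n"
        f"LOCATION '{location}'\n"
        f"TBLPROPERTIES ('avro.schema.url'='{location}/{table}.avsc')"
    )

columns = [("EMP_ID", "NUMBER"), ("EMP_NAME", "VARCHAR2"), ("HIRE_DATE", "DATE")]
print(json.dumps(build_avro_schema("EMPLOYEES", columns), indent=2))
print(build_hive_ddl("staging", "EMPLOYEES", "/data/lake/employees"))
```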