Recent Discussions
REST Get Pagination in various scenarios
Hi all, there are various challenges when using REST GET pagination. In this article, we discuss these challenges and how to overcome them with the help of some built-in expressions in SnapLogic. Let's look at the various scenarios and their solutions.

Scenario 1: The API response has no total-records indicator, but the API works with limit and offset

In this case, since the API does not tell us in advance how many records it will return, the only option is to walk through each page until the last page of the API response. The last page is the page whose response contains no records.

Explanation of how it works, with sample data:

has_next condition: $entity.length > 0

has_next explanation: When a page returns n documents and it is not known whether the next page iteration is valid, $entity.length checks the length of the response array, and the next page is requested only when $entity.length is greater than zero. If the response array length is zero, there are no more records to fetch, so the has_next condition "$entity.length > 0" fails and the iteration loop stops.

next_url condition: $original.URL + "?limit=" + $original.limit + "&offset=" + (parseInt($original.limit) * snap.out.totalCount)

next_url explanation: The limit parameter and the API URL are static, but the offset value has to change on every iteration. The approach is therefore to multiply the limit parameter by snap.out.totalCount to shift the offset for each page. snap.out.totalCount is the Snap system variable that holds the total number of documents that have passed through the Snap's output views. In this REST Get, each page iteration produces one JSON array on the output, so snap.out.totalCount equals the number of page iterations completed so far.

Sample response for the first API call:

{
  "statusLine": {
    "protoVersion": "HTTP/1.1",
    "statusCode": 200,
    "reasonPhrase": "OK"
  },
  "entity": [
    { "year": "2022", "month": "08", "Name": "Mark" },
    { "year": "2022", "month": "08", "Name": "John" },
    ... 1000 records in this array
  ],
  "original": {
    "effective_date": "2023-08-31",
    "limit": "1000",
    "offset": "0",
    "URL": "https://Url.XYZ.com"
  }
}

Sample response for the second API call:

{
  "statusLine": {
    "protoVersion": "HTTP/1.1",
    "statusCode": 200,
    "reasonPhrase": "OK"
  },
  "entity": [
    { "year": "2024", "month": "08", "Name": "Ram" },
    { "year": "2021", "month": "03", "Name": "Joe" },
    ... 1000 records in this array
  ],
  "original": {
    "effective_date": "2023-08-31",
    "limit": "1000",
    "offset": "1000",
    "URL": "https://Url.XYZ.com"
  }
}
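To make the control flow concrete outside SnapLogic, here is a minimal Python sketch of the same offset loop; the endpoint, parameter names, and page size are placeholders taken from the sample above, not a definitive implementation:

import requests

BASE_URL = "https://Url.XYZ.com"    # placeholder endpoint from the sample
LIMIT = 1000                        # mirrors the 'limit' parameter
offset = 0
pages_fetched = 0                   # plays the role of snap.out.totalCount
all_records = []

while True:
    resp = requests.get(BASE_URL, params={"limit": LIMIT, "offset": offset})
    resp.raise_for_status()
    entity = resp.json()            # assume the body is the 'entity' array
    if len(entity) == 0:            # has_next: $entity.length > 0
        break                       # empty page means the last page was reached
    all_records.extend(entity)
    pages_fetched += 1
    offset = LIMIT * pages_fetched  # next_url offset: limit * snap.out.totalCount

print(f"Fetched {len(all_records)} records across {pages_fetched} pages")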
Scenario 2: The API response has the total record count in a response header, and pagination uses limit and offset

Because the total record count is available, the total-records header in the API response can be used to traverse the response pages.

Explanation of how it works, with sample data:

has_next condition: parseInt($original.limit) * snap.out.totalCount < $headers['total-records']

has_next explanation: The check is whether the number of rows fetched so far (the limit multiplied by the number of pages already returned) is still less than the total record count. For example, with 120 total records and a limit of 100, the loop runs only twice:

* limit = 100, snap.out.totalCount = 0: has_next evaluates 0 < 120, so the first page is fetched
* limit = 100, snap.out.totalCount = 1: has_next evaluates 100 < 120, so the second page is fetched
* limit = 100, snap.out.totalCount = 2: has_next evaluates 200 < 120, which is false, so pagination stops and no further page is processed

next_url condition: $original.URL + "?limit=" + $original.limit + "&offset=" + (parseInt($original.limit) * snap.out.totalCount)

next_url explanation: The limit and URL values are static, but the offset has to be derived as the limit multiplied by snap.out.totalCount. snap.out.totalCount indicates the total number of documents that have passed through all of the Snap's output views, so the Snap keeps requesting the next page until the has_next condition fails.

Sample response for the first API call:

{
  "statusLine": {
    "protoVersion": "HTTP/1.1",
    "statusCode": 200,
    "reasonPhrase": "OK"
  },
  "entity": [
    { "year": "2022", "month": "08", "Name": "Mark" },
    { "year": "2022", "month": "08", "Name": "John" },
    ... 100 records
  ],
  "original": {
    "effective_date": "2023-08-31",
    "limit": "100",
    "offset": "0",
    "URL": "https://Url.XYZ.com"
  },
  "headers": {
    "total-records": [ "120" ]
  }
}

Sample response for the second API call:

{
  "statusLine": {
    "protoVersion": "HTTP/1.1",
    "statusCode": 200,
    "reasonPhrase": "OK"
  },
  "entity": [
    { "year": "2022", "month": "08", "Name": "Ram" },
    { "year": "2022", "month": "08", "Name": "Raj" },
    ... 20 records
  ],
  "original": {
    "effective_date": "2023-08-31",
    "limit": "100",
    "offset": "100",
    "URL": "https://Url.XYZ.com"
  },
  "headers": {
    "total-records": [ "120" ]
  }
}

Scenario 3: The API has no total-records indicator, and pagination uses page_no

Here there is no total record count in the API output, but the API takes a page number as a parameter. Pagination is therefore done by incrementing the page number by 1 for as long as the API output array has a length greater than 0; once an empty page comes back, the pagination loop stops.

Explanation of how it works, with sample data:

has_next condition: $entity.length > 0

has_next explanation: Since no total record count is known from the API output, the next page is fetched whenever the current page has any elements in its output array.

next_url condition: $original.URL + "&page_no=" + ($headers.page_no + 1)

next_url explanation: Every response carries the page number in its headers, so the same value is used to build the next URL by incrementing the page number by 1.

Sample response for the first API call:

{
  "statusLine": {
    "protoVersion": "HTTP/1.1",
    "statusCode": 200,
    "reasonPhrase": "OK"
  },
  "entity": [
    { "year": "2022", "month": "08", "Name": "Mark" },
    { "year": "2022", "month": "08", "Name": "John" },
    ... 1000 records in this array
  ],
  "original": {
    "effective_date": "2023-08-31",
    "URL": "https://Url.XYZ.com"
  },
  "headers": {
    "page_no": 1
  }
}

Sample response for the second API call:

{
  "statusLine": {
    "protoVersion": "HTTP/1.1",
    "statusCode": 200,
    "reasonPhrase": "OK"
  },
  "entity": [
    { "year": "2022", "month": "08", "Name": "Ram" },
    { "year": "2022", "month": "08", "Name": "Raj" },
    ... 1000 records in this array
  ],
  "original": {
    "effective_date": "2023-08-31",
    "URL": "https://Url.XYZ.com"
  },
  "headers": {
    "page_no": 2
  }
}
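A minimal Python sketch of the Scenario 3 page-number loop; the endpoint and parameter name come from the sample above and are only placeholders:

import requests

BASE_URL = "https://Url.XYZ.com"    # placeholder endpoint from the sample
page_no = 1
all_records = []

while True:
    resp = requests.get(BASE_URL, params={"page_no": page_no})
    resp.raise_for_status()
    entity = resp.json()            # assume the body is the 'entity' array
    if not entity:                  # has_next: $entity.length > 0
        break                       # empty page, stop paginating
    all_records.extend(entity)
    page_no += 1                    # next_url: $headers.page_no + 1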
Scenario 4: The API has the total record count in a response header, and pagination uses page_no

Here the API response contains both a total-records count and a page number. The next page is fetched by incrementing the page number by 1, and pagination continues for as long as the rows fetched so far (snap.out.totalCount multiplied by the page limit) are less than the total record count.

Explanation of how it works, with sample data:

has_next condition: parseInt($original.limit) * snap.out.totalCount < $headers['total-records']

has_next explanation: The check is whether the number of rows fetched so far is still less than the total record count. For example, with 120 total records and a limit of 100 (predefined as part of the design/implementation), the loop runs exactly twice (first and second page only):

* limit = 100, snap.out.totalCount = 0: has_next evaluates 0 < 120, so the first page is fetched
* limit = 100, snap.out.totalCount = 1: has_next evaluates 100 < 120, so the second page is fetched
* limit = 100, snap.out.totalCount = 2: has_next evaluates 200 < 120, which is false, so pagination stops and no further page is processed

next_url condition: $original.URL + "&page_no=" + ($headers.page_no + 1)

next_url explanation: Every response carries the page number in its headers, so the same value is used to increment the page number and reach the next page.

Sample response for the first API call:

{
  "statusLine": {
    "protoVersion": "HTTP/1.1",
    "statusCode": 200,
    "reasonPhrase": "OK"
  },
  "entity": [
    { "year": "2022", "month": "08", "Name": "Mark" },
    { "year": "2022", "month": "08", "Name": "John" },
    ... 100 records in this array
  ],
  "original": {
    "effective_date": "2023-08-31",
    "limit": "100",
    "URL": "https://Url.XYZ.com"
  },
  "headers": {
    "page_no": 1
  }
}

Sample response for the second API call:

{
  "statusLine": {
    "protoVersion": "HTTP/1.1",
    "statusCode": 200,
    "reasonPhrase": "OK"
  },
  "entity": [
    { "year": "2022", "month": "08", "Name": "Ram" },
    { "year": "2022", "month": "08", "Name": "Raja" },
    ... 20 records in this array
  ],
  "original": {
    "effective_date": "2023-08-31",
    "limit": "100",
    "URL": "https://Url.XYZ.com"
  },
  "headers": {
    "page_no": 2
  }
}

Please give us Kudos if the article helps you 😍
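As a footnote to Scenarios 2 and 4, the total-records bound can also be sketched in Python; this is only an illustration with placeholder endpoint, parameter, and header names, not a definitive implementation:

import requests

BASE_URL = "https://Url.XYZ.com"       # placeholder endpoint from the samples
LIMIT = 100                            # page size used in the walkthrough above
page_no = 1
pages_fetched = 0                      # plays the role of snap.out.totalCount
all_records = []

while True:
    resp = requests.get(BASE_URL, params={"page_no": page_no, "limit": LIMIT})
    resp.raise_for_status()
    total = int(resp.headers["total-records"])   # e.g. 120 in the example
    all_records.extend(resp.json())              # assume the body is the 'entity' array
    pages_fetched += 1
    page_no += 1
    # has_next: parseInt($original.limit) * snap.out.totalCount < $headers['total-records']
    if LIMIT * pages_fetched >= total:
        break                          # 100*1 < 120 continues, 100*2 >= 120 stops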
Array of Objects manipulation

Hi team, I would like to iterate through an array of objects and check whether any objects have the same num, code, and date but different boxNumbers. If they do, I should combine their boxNumbers into a single object; if those three fields don't match, I should leave the object as is. Could you please help me with this?

Sample input data:

[
  {
    "product": [
      {
        "num": "69315013901",
        "code": "C06024",
        "date": "2026-03-31",
        "boxNumber": [ "453215578875", "964070610419" ]
      },
      {
        "num": "69315013901",
        "code": "C06024",
        "date": "2026-03-31",
        "boxNumber": [ "153720699865", "547398527901", "994797055803" ]
      },
      {
        "num": "69315030805",
        "code": "083L022",
        "date": "2025-11-30",
        "boxNumber": [ "VUANJ6KYSNB", "DPPG4NWK695" ]
      }
    ]
  }
]

Expected output:

[
  {
    "product": [
      {
        "num": "69315013901",
        "code": "C06024",
        "date": "2026-03-31",
        "boxNumber": [ "453215578875", "964070610419", "153720699865", "547398527901", "994797055803" ]
      },
      {
        "num": "69315030805",
        "code": "083L022",
        "date": "2025-11-30",
        "boxNumber": [ "VUANJ6KYSNB", "DPPG4NWK695" ]
      }
    ]
  }
]
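One way to express the requested merge is sketched below in Python; the field names come from the sample, but this is only an illustration of the grouping logic, not a SnapLogic expression:

def merge_products(products):
    """Merge boxNumber lists for objects that share num, code and date."""
    merged = {}
    for item in products:
        key = (item["num"], item["code"], item["date"])
        if key in merged:
            merged[key]["boxNumber"].extend(item["boxNumber"])
        else:
            # copy the object so the input is not mutated
            merged[key] = {**item, "boxNumber": list(item["boxNumber"])}
    return list(merged.values())

# usage: result = [{"product": merge_products(doc["product"])} for doc in input_docs]

In a pipeline, the same idea amounts to grouping on the three fields and concatenating the boxNumber arrays within each group.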
How to convert a MS Word file to PDF

What are my options for converting a file from one format to another? In my particular case, I need to read a Word file (.docx) and write it out as a .pdf file. Any suggestion on how to accomplish this is welcome. Thank you, Agron Bauta
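One common approach, since no dedicated docx-to-pdf Snap is mentioned here, is to call a converter such as LibreOffice in headless mode from a Script snap or an external service. A rough Python sketch, assuming LibreOffice (the soffice binary) is installed on the node where this runs:

import subprocess

def docx_to_pdf(src_path, out_dir):
    """Convert a .docx file to .pdf using LibreOffice in headless mode."""
    subprocess.run(
        ["soffice", "--headless", "--convert-to", "pdf", "--outdir", out_dir, src_path],
        check=True,
    )

docx_to_pdf("/tmp/input.docx", "/tmp")   # writes /tmp/input.pdf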
Need Guidance on Dynamic Excel File Generation and Email Integration

Hello Team, I am currently developing an integration where the data structure in the Mapper is an array of the form [{}, {}, ...]. One of the fields, Sales Employee, contains values such as null, Andrew Johnson, and Kaitlyn Bernd. My goal is to dynamically create a separate Excel file for each unique value in the Sales Employee field (including null), each containing all of that employee's records, and then send all the generated files as attachments in a single email. Since the employee names may vary and grow over time, the solution needs to handle the grouping and file generation dynamically. I would appreciate any expert opinions or best practices on achieving this efficiently in SnapLogic. Thanks and Regards,
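Outside of SnapLogic, the grouping step is easy to sketch; the Python/pandas example below only illustrates the idea (column names, sample rows, and file names are placeholders), and in a pipeline it would roughly correspond to grouping on Sales Employee and writing one workbook per group before a single email step:

import pandas as pd

records = [
    {"Sales Employee": "Andrew Johnson", "Amount": 100},
    {"Sales Employee": None, "Amount": 250},             # null employee still gets a file
    {"Sales Employee": "Kaitlyn Bernd", "Amount": 75},
]
df = pd.DataFrame(records)

attachments = []
for employee, group in df.groupby("Sales Employee", dropna=False):
    name = "unassigned" if pd.isna(employee) else employee
    path = f"sales_{name}.xlsx"                          # placeholder file name
    group.to_excel(path, index=False)                    # one Excel file per employee
    attachments.append(path)
# 'attachments' would then be attached to a single outgoing email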
Platform Memory Alerts & Priority Notifications for Resource Failures

This is more about platform memory alerts. From my understanding, we have alert metrics in place that trigger an email if any of the nodes hits the specified threshold in Manager. However, I am looking at a specific use case. Consider an Ultra Pipeline that needs to invoke a child pipeline for transformation logic. This child pipeline is expected to run on the same node as the parent pipeline to reduce additional processing time, as it is exposed to the client side. Now, if the child pipeline fails to prepare due to insufficient resources on the node, no alert is generated, since the child pipeline did not return anything to the error view. Is there any feature or discussion underway to provide priority notifications to the organization admin for such failures? Task-level notifications won't help, as they rely on the error limits configured at the task level. While I used an Ultra Pipeline as the example, this scenario applies to scheduled and API-triggered pipelines as well. Your insights would be appreciated.
SnapLogic Execution Mode Confusion: LOCAL_SNAPLEX vs SNAPLEX_WITH_PATH with pipe.plexPath

I understand the basic difference between the two execution options for child pipelines:

* LOCAL_SNAPLEX: Executes the child pipeline on one of the available nodes within the same Snaplex as the parent pipeline.
* SNAPLEX_WITH_PATH: Allows specifying a Snaplex explicitly through the Snaplex Path field. This is generally used to run the child pipeline on a different Snaplex.

However, I noticed a practical overlap. Let's say I have a Snaplex named integration-test:

* If I choose LOCAL_SNAPLEX, the child pipeline runs on the same Snaplex (integration-test) as the parent.
* If I choose SNAPLEX_WITH_PATH and set the path to pipe.plexPath, it also resolves to the same Snaplex (integration-test) where the parent is running, so the execution again happens locally.

I tested both options and found that the load was distributed similarly in both cases and the execution time was nearly identical. So from a functional perspective, both seem to behave the same when the Snaplex path resolves to the same environment.

My question is: what is the actual difference in behavior or purpose between these two options when pipe.plexPath resolves to the same Snaplex? Also, why is using SNAPLEX_WITH_PATH with pipe.plexPath flagged as critical in the pipeline quality check, even though the behavior appears equivalent to LOCAL_SNAPLEX? Curious if anyone has faced similar observations or can shed light on the underlying difference.
Inserting large data in servicenow

Hello Team, I am developing a pipeline in SnapLogic where 6,000,000 records come from Snowflake, and I have designed my pipeline like this. Parent pipeline: Snowflake Execute -> Mapper (one-to-one field mapping) -> Group By N with a group size of 10000 -> Pipeline Execute with a pool size of 5. Child pipeline: JSON Splitter followed by ServiceNow Insert. What can I do to optimize the performance and make it execute faster in SnapLogic? It currently takes a long time to execute. Can someone assist with this? Thanks in advance.
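For reference, the batching pattern the parent pipeline describes looks roughly like this in Python; insert_batch is a hypothetical placeholder for whatever the child pipeline (JSON Splitter + ServiceNow Insert) does, and the numbers simply mirror the post:

from concurrent.futures import ThreadPoolExecutor

def chunk(records, size=10000):
    """Yield fixed-size batches, mirroring the Group By N step."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

def insert_batch(batch):
    """Hypothetical stand-in for the child pipeline's ServiceNow insert."""
    pass

all_records = []  # placeholder for the rows streamed from Snowflake

# Pool size 5 -> at most 5 child executions in flight at any time
with ThreadPoolExecutor(max_workers=5) as pool:
    list(pool.map(insert_batch, chunk(all_records)))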
Quick Vote for SnapLogic for the DBTA Readers’ Choice Awards

Calling on our Integration Nation Community: this one’s for you! We’re in the running for Best Data Integration Solution at the DBTA Readers’ Choice Awards - but we need your vote to win. ✅ It’s quick. ✅ It’s easy. ✅ It makes a difference. Vote now 👉 https://lnkd.in/e7hiSGr
Trying to connect to an external SFTP

I have generated a key pair, shared the public key with the client, and set up a Binary SSH account in Manager in order to connect to the client's SFTP server. Additionally, I have had the Groundplex's external IPs whitelisted on the client side and on our side as well. After all this, I get the following error when I try to browse the path using the Directory Browser snap:

error: Unable to create filesystem object for sftp://....
stacktrace: Caused by: com.jcraft.jsch.JSchException: Session.connect: java.net.SocketException: Connection reset
Caused by: java.net.SocketException: Connection reset
reason: Failed to get SFTP session connected
resolution: Please check all properties and credentials

I am stuck completing the solution due to this error, so any help is very much appreciated, thanks!
Data reconciliation solutions?

One of my company's use cases for SnapLogic today is replication of data from Salesforce into internal Kafka topics for use throughout the enterprise. There have been various instances of internal consumers of the Kafka data reporting missing records. Investigations have found multiple causes for these data drops. Some of the causes are related to behavior that Salesforce describes as "Working As Designed". Salesforce has recommended other replication architectures, but there are various concerns within my company about using them (license cost, platform load), and we might still end up with missing data. So, we're looking into data reconciliation / auditing solutions. Are there any recommendations on a tool that can:

* Identify record(s) where the record in Salesforce does not have a matching record (e.g. same timestamp) existing in Kafka
* Generate a message containing relevant metadata (e.g. record Id, Salesforce object, Kafka topic) to be sent to a REST endpoint / message queue for reprocessing