Hi Team,I have one scenario and want to solve it as soon as possible.I have two files : Initial Load , Incremental LoadWe have loaded the initial load data into the DB which contains the columns like : id, name, age, classBut We have the incremental files on the daily basis but there are some case we have the extra column which we have to discard the value and alert to the team that column is added newly How we can do in the pipeline .Incremental file columns:We have to give alert that this column is roll_number is added and also discard the columns. We cant do manually because it is a daily job and me can user mapper in this case. Can anyone help me in this case.Thanks in advance.koryknick dmiller

mohit_jain the solution is very straight forward.1. Get the header from your initial file.2. Merge it with your incremental file.3. Filter match and unmatched in two different output streams.Download solution pipeline Compare_Schema_Discard_load & below is the screenshot.

Comparing the schema and discard the Object | SnapLogic - Integration Nation

14 Replies

darshthakkar
Valued Contributor
3 years ago
Do we have 2 similar threads for this?

Check this link as well - https://community.snaplogic.com/t5/solutions-for-your-business/comparing-the-schema-and-discard-the-new-value/td-p/23295

dmiller
- dmiller
  Former Employee
  3 years ago
  I will merge them.
- mohit_jain
  New Contributor III
  3 years ago
  Hi Team,
  Can anyone help to find the values if the schema is different.
  let me explain with the example:
  We have loaded the initial load into the table:
  Initial Load contains the column like
  Here you can see we have 4 columns ID, Name, Age, Class
  Now we have the incremental load and it contains the columns like
  Here you can see we have 5 columns ID, Name, Age, Class as well as new added Roll_Number
  
  Now We have to discard the roll_number value and send the alter that this column has added newly in this file
  As well as we dont want to do this manually by using mapper we have to schedule the pipeline on this daily basis and there are around 100 files we have to process the data
  
  Can anyone help to create the dynamic pipeline for all the files.
  Thanks In advance
  - darshthakkar
    Valued Contributor
    3 years ago
    mohit_jain : Does the value change everyday on the existing records? Rationale behind asking this is that you can go ahead with Truncate and Load in Snowflake Bulk Load snap.
    I had a similar use case wherein I was getting dynamic payload but that would ingest "Null" values to the existing records for the new columns and you wouldn't be able to troubleshoot on when the records were first ingested.
    So, we asked the business to define the payload (which happened for a few integrations) and for the rest, we are ingesting JSON to snowflake and then snowflake handles it dynamically, hope this helps!
darshthakkar
Valued Contributor
3 years ago
mohit_jain: Would you always get data in the order A, B, C, D?
How is the data sorted? Is it alphabetical? Reason being, if you have columns as "Address", "Birth", "City", "Name" and for there is a new column named "Cameo" then where would it go? Would it after "Name" or after "Birth"?

A simple solution would be to use a mapper snap after your CSV parser and define the fields in the mapper snap OR define the fields in your CSV Parser and enabling the error route so that whenever new columns are detected, those will error out and you can send notifications to the team for that additional column.

Thank you.
- mohit_jain
  New Contributor III
  3 years ago
  Hi darshthakkar
  There are around 100 files we have to process we can't do it manually on the daily basis. As well as there are no data sorted we are comparing two schema with two different files.
  Thanks
  Mohit Jain