Forum Discussion
Thanks for your quick response. Here is the example:
{SOURCE_COLUMN_NAME: COLUMN_NAME1: [{DATATYPE: NUMBER, MAX_LENGTH: null, PRECISION: 25, NUMERIC_SCALE: 7, NULLABLE: NO}]}
{SOURCE_COLUMN_NAME: COLUMN_NAME2: [{DATATYPE: VARCHAR, MAX_LENGTH: 50, PRECISION: null, NUMERIC_SCALE: null, NULLABLE: NO}]}
We could change this to a different format if required. The challenge I have is joining this schema with the actual data.
The simplest thing would be to collect all the schema documents together with a GroupByN snap and then use a Join snap to add the schema to all the documents.
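To make the idea concrete, here is a minimal Python sketch of the same logic outside SnapLogic: merge the per-column schema documents into one lookup, then attach that lookup to every data document (roughly what the GroupByN and Join snaps would do in the pipeline). It assumes each schema document nests the column name under SOURCE_COLUMN_NAME as in the example above; the function names are illustrative, not part of any SnapLogic API.

```python
def collect_schema(schema_docs):
    """Merge per-column schema documents into one {column: constraints} map.

    Assumes each document looks like the example above, i.e. the column name
    is nested under SOURCE_COLUMN_NAME with a one-element list of constraints.
    """
    merged = {}
    for doc in schema_docs:
        for column, constraints in doc["SOURCE_COLUMN_NAME"].items():
            merged[column] = constraints[0]  # single constraint object per column
    return merged


def attach_schema(data_docs, schema):
    """Add the merged schema to every data document (the Join step)."""
    return [dict(record, _schema=schema) for record in data_docs]
```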
Can I ask why you want to do the validation yourself instead of letting the database do it? If the tables in the DB are already set up with the appropriate constraints, I would think that would be the most straightforward approach.
- Ksivagurunathan (Contributor), 7 years ago
We want to validate the data sooner than in our batch processing, which loads the data into the table. We could create a triggered task and use it to validate errors in the data immediately; that way we could reuse some of the validation code rather than developing it for each and every file. The data validation rules are going to be similar for any delimited file.
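As a rough illustration of the kind of reusable check such a triggered task could run against any delimited file, here is a hedged Python sketch that validates one record against the merged schema from the example format above. The specific rules (nullability, VARCHAR length, numeric type) are assumptions for illustration, not the poster's actual validation logic.

```python
def validate_record(record, schema):
    """Return a list of validation errors for one delimited-file record.

    `schema` is the merged {column: constraints} map; constraint keys
    (DATATYPE, MAX_LENGTH, NULLABLE, ...) follow the example format above.
    """
    errors = []
    for column, rules in schema.items():
        value = record.get(column)

        # Nullability check.
        if value in (None, "") and rules.get("NULLABLE") == "NO":
            errors.append(f"{column}: null not allowed")
            continue

        # Length check for VARCHAR columns.
        if rules.get("DATATYPE") == "VARCHAR" and rules.get("MAX_LENGTH"):
            if value is not None and len(str(value)) > rules["MAX_LENGTH"]:
                errors.append(f"{column}: exceeds MAX_LENGTH {rules['MAX_LENGTH']}")

        # Basic numeric check for NUMBER columns.
        if rules.get("DATATYPE") == "NUMBER" and value not in (None, ""):
            try:
                float(value)
            except (TypeError, ValueError):
                errors.append(f"{column}: not numeric")
    return errors
```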