Forum Discussion
The DataValidator snap is useful if the constraints are statically known. But since you’re trying to do this dynamically, we’ll need to find another way.
Can you give an example of what these rules look like (preferably in the JSON format that would be returned by a SQL Select snap)?
- Ksivagurunathan · 7 years ago · Contributor
Thanks for your quick response. Here is the example:

```
{SOURCE_COLUMN_NAME: COLUMN_NAME1: [{DATATYPE: NUMBER, MAX_LENGTH: null, PRECISION: 25, NUMERIC_SCALE: 7, NULLABLE: NO}]}
{SOURCE_COLUMN_NAME: COLUMN_NAME2: [{DATATYPE: VARCHAR, MAX_LENGTH: 50, PRECISION: null, NUMERIC_SCALE: null, NULLABLE: NO}]}
```

We could change this to a different format if required. The challenge I have is joining this with the actual data.
- tstack · 7 years ago · Former Employee
The simplest thing would be to collect all the schema documents together with a GroupByN snap and then use a Join snap to add the schema to all the documents.
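To make the GroupByN + Join idea concrete, here is a minimal Python sketch of the equivalent logic, assuming hypothetical record shapes (the rule field names such as `DATATYPE` and `NULLABLE` come from the example above; the `RULES` wrapper key is an assumption, not a SnapLogic structure):

```python
# Hypothetical schema documents, shaped after the example in this thread.
schema_docs = [
    {"SOURCE_COLUMN_NAME": "COLUMN_NAME1",
     "RULES": {"DATATYPE": "NUMBER", "MAX_LENGTH": None,
               "PRECISION": 25, "NUMERIC_SCALE": 7, "NULLABLE": "NO"}},
    {"SOURCE_COLUMN_NAME": "COLUMN_NAME2",
     "RULES": {"DATATYPE": "VARCHAR", "MAX_LENGTH": 50,
               "PRECISION": None, "NUMERIC_SCALE": None, "NULLABLE": "NO"}},
]

# Hypothetical data documents parsed from the delimited file.
data_docs = [
    {"COLUMN_NAME1": "123.4567", "COLUMN_NAME2": "hello"},
]

# "GroupByN" step: collapse all schema documents into one lookup structure.
schema_by_column = {d["SOURCE_COLUMN_NAME"]: d["RULES"] for d in schema_docs}

# "Join" step: attach the grouped schema to every data document, so each
# downstream document carries both the record and the rules it must satisfy.
joined = [{"record": doc, "schema": schema_by_column} for doc in data_docs]
```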
Can I ask why you want to do the validation yourself instead of letting the database do it? If the tables in the DB are already set up with the appropriate constraints, I would think that would be the most straightforward approach.
- Ksivagurunathan · 7 years ago · Contributor
We want to validate the data earlier than the batch process that loads it into the table. We could create a triggered task and use it to flag errors in the data immediately, and that way we could reuse some of the validation code rather than developing it for each and every file. The data validation rules are going to be similar for any delimited file.
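Since the rules are similar across delimited files, the reusable check could look something like this minimal Python sketch. It works against the rule fields from the example earlier in the thread; the function names and the `schema_by_column` lookup are hypothetical, not part of any SnapLogic API:

```python
import csv

def validate_field(value, rules):
    """Return a list of human-readable problems for one field value."""
    problems = []
    if value == "":
        if rules.get("NULLABLE") == "NO":
            problems.append("value is required but empty")
        return problems
    if rules.get("DATATYPE") == "NUMBER":
        try:
            float(value)
        except ValueError:
            problems.append(f"{value!r} is not numeric")
    elif rules.get("DATATYPE") == "VARCHAR":
        max_len = rules.get("MAX_LENGTH")
        if max_len is not None and len(value) > max_len:
            problems.append(f"length {len(value)} exceeds {max_len}")
    return problems

def validate_file(path, schema_by_column):
    """Yield (row_number, column, problem) for every rule violation found."""
    with open(path, newline="") as fh:
        for row_num, row in enumerate(csv.DictReader(fh), start=1):
            for column, rules in schema_by_column.items():
                for problem in validate_field(row.get(column, ""), rules):
                    yield row_num, column, problem
```

Because the rules are data rather than code, the same `validate_file` call would cover every delimited file whose schema can be pulled back through the SQL Select snap.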