Forum Discussion

Ksivagurunathan
Contributor
7 years ago

Dynamic Data Validation

I want to validate incoming data that will arrive in CSV format. I have a target table defined for each file, and I want to develop a common pipeline that validates the data based on each file's rules.
Validation rules: not null, length, and datatype. I can extract this validation information dynamically from the target database using SQL queries keyed on the file name. How do I merge these rules with the incoming data? My goal is to develop one pipeline that I don't have to change even when the database changes, such as a new field being added or a datatype changing; my SQL queries can always pull the latest fields, datatypes, and other required information.

Basically, I'm trying to verify the data in the file before applying it to the target table, so that I can capture error records before performing my DML operations.

9 Replies

  • tstack
    Former Employee

    The DataValidator snap is useful if the constraints are statically known. But, since you’re trying to do this dynamically, we’ll need to find another way.

    Can you give an example of what these rules look like (preferably in the JSON format that a SQL Select snap would return)?

    • Ksivagurunathan
      Contributor

      Thanks for your quick response. Here is an example:

      {"SOURCE_COLUMN_NAME": {"COLUMN_NAME1": [{"DATATYPE": "NUMBER", "MAX_LENGTH": null, "PRECISION": 25, "NUMERIC_SCALE": 7, "NULLABLE": "NO"}]}}
      {"SOURCE_COLUMN_NAME": {"COLUMN_NAME2": [{"DATATYPE": "VARCHAR", "MAX_LENGTH": 50, "PRECISION": null, "NUMERIC_SCALE": null, "NULLABLE": "NO"}]}}

      We could change this to a different format if required. My challenge is joining these rules with the actual data.

      • tstack
        Former Employee

        The simplest thing would be to collect all the schema documents together with a GroupByN snap and then use a Join snap to add the schema to all the documents.

        Can I ask why you want to do the validation yourself instead of letting the database do it? If the tables in the DB are already set up with the appropriate constraints, I would think that would be the most straightforward approach.
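        Outside of SnapLogic, the GroupByN + Join pattern described above can be sketched in plain Python. This is a minimal illustration, not SnapLogic output; the field names reuse the rule metadata shown earlier in the thread, and the "row"/"schema" keys are assumptions:

```python
# Sketch of the GroupByN + Join pattern:
# 1. GroupByN (group size = all): collect every schema document into one document.
# 2. Join: attach that single schema document to each incoming data document,
#    so every row travels with the rules needed to validate it.

schema_docs = [
    {"SOURCE_COLUMN_NAME": "COLUMN_NAME1", "DATATYPE": "NUMBER", "NULLABLE": "NO"},
    {"SOURCE_COLUMN_NAME": "COLUMN_NAME2", "DATATYPE": "VARCHAR", "MAX_LENGTH": 50, "NULLABLE": "NO"},
]

data_docs = [
    {"COLUMN_NAME1": "123", "COLUMN_NAME2": "abc"},
    {"COLUMN_NAME1": None, "COLUMN_NAME2": "def"},
]

# Step 1: one document holding the whole schema
grouped_schema = {"schema": schema_docs}

# Step 2: every data document now carries the schema alongside its values
joined = [{"row": row, **grouped_schema} for row in data_docs]
```

        A downstream step can then iterate over `doc["schema"]` to check each field of `doc["row"]`, regardless of how many columns the target table currently has.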

  • I’m trying to do something similar.
    Did you find a solution to the problem of dynamic data validation?

    • Ksivagurunathan
      Contributor

      Yes, we stored our validation rules in tables/files and then used a Script snap to validate the data against those rules. There is no straightforward built-in option within SnapLogic for dynamic data validation.
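      As a rough illustration of the kind of check such a Script snap could perform, here is a minimal Python sketch. The rule keys mirror the metadata shown earlier in the thread; the function name, flat rule shape, and error-message format are illustrative assumptions, not the poster's actual implementation:

```python
# Validate one CSV row against dynamically loaded rules
# (not null, max length, numeric datatype).

def validate_row(row, rules):
    """Return a list of error strings for one row; an empty list means valid."""
    errors = []
    for rule in rules:
        col = rule["SOURCE_COLUMN_NAME"]
        value = row.get(col)
        # Not-null check: empty string and missing value both count as null
        if value in (None, ""):
            if rule.get("NULLABLE") == "NO":
                errors.append("%s: null not allowed" % col)
            continue
        # Length check, only when the rule defines a maximum
        if rule.get("MAX_LENGTH") and len(str(value)) > rule["MAX_LENGTH"]:
            errors.append("%s: exceeds max length %s" % (col, rule["MAX_LENGTH"]))
        # Datatype check for numeric columns
        if rule.get("DATATYPE") == "NUMBER":
            try:
                float(value)
            except ValueError:
                errors.append("%s: not numeric" % col)
    return errors

# Rules would come from the SQL metadata query; hard-coded here for the sketch.
rules = [
    {"SOURCE_COLUMN_NAME": "COLUMN_NAME1", "DATATYPE": "NUMBER", "NULLABLE": "NO"},
    {"SOURCE_COLUMN_NAME": "COLUMN_NAME2", "DATATYPE": "VARCHAR", "MAX_LENGTH": 50, "NULLABLE": "NO"},
]

good = validate_row({"COLUMN_NAME1": "12.5", "COLUMN_NAME2": "abc"}, rules)
bad = validate_row({"COLUMN_NAME1": "abc", "COLUMN_NAME2": ""}, rules)
```

      Rows with a non-empty error list can be routed to an error view before the DML operation, which matches the goal stated at the top of the thread.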

      • philliperamos
        Contributor

        That sounds good.

        I pivoted the data into JSON, and then used a combination of JSON Splitter, Mapper, and Data Validator snaps to validate it. I still had to hard-code some JSON generation scripts, but that was the easy part.

        Thanks!