Create Avro Dynamic Schema

Question

Hi Team,
I would like some recommendations or directions on a possible solution to the requirement below.
I have created a generic pipeline that reads 800 different Mainframe VSAM binary files with different formats, parses them using their associated copybooks, and validates every data field against a specified regular expression. If a file passes validation, I would like to create a target Avro file.
The Avro Formatter requires a predefined Avro schema. Because my pipelines are generic and can work with any VSAM file as long as you provide the file location in S3 and the associated copybook, I would like to know if there is any way to generate the Avro schema dynamically, based on either the copybook structure or the output data from the copybook parser.
Please note that the source files contain different record types, and each record type has a different structure.
I would appreciate any directions.

ptaylor · Answer

If you generate a separate Avro schema for each data file, you could save the schema to a file on sldb, then invoke a child pipeline using Pipe Exec, setting a pipeline parameter to specify the Avro schema file’s path. In the Avro Formatter in the child pipeline, the schema setting would be an expression you’d set to the value of the pipeline parameter.

majid · Answer

Thank you, @ptaylor. I am looking for directions on ways to generate the Avro schema dynamically from the data file or the copybook structure. Once the schema is generated and saved to sldb, I can use the method above to pass it to the child pipeline so that the Avro Formatter can use it when creating the Avro target file in S3.
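
For illustration, here is a minimal Python sketch of the kind of mapping being asked about: turning a list of copybook fields into an Avro record schema. It is not a SnapLogic feature or a confirmed approach from this thread. It assumes the copybook has already been reduced to (field name, PIC string) pairs for one record type; the helper names (avro_name, digit_count, avro_type, build_schema), the namespace, and the sample fields are all hypothetical. It also assumes that the downstream Avro writer accepts the decimal logical type; if it does not, those fields could be mapped to string instead.

```python
import json
import re


def avro_name(cobol_name):
    """Avro names must match [A-Za-z_][A-Za-z0-9_]*, so COBOL hyphens become underscores."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", cobol_name)
    return name if re.match(r"[A-Za-z_]", name) else "_" + name


def digit_count(pic_part):
    """Count digit positions in a PIC fragment such as '9(5)' or '999'."""
    total = 0
    for match in re.finditer(r"9(?:\((\d+)\))?", pic_part):
        total += int(match.group(1)) if match.group(1) else 1
    return total


def avro_type(pic):
    """Map a PIC string (picture only, no USAGE clause) to an Avro type."""
    pic = pic.upper().replace(" ", "")
    if "X" in pic or "A" in pic:
        return "string"
    integer_part, _, fraction_part = pic.partition("V")
    scale = digit_count(fraction_part)
    precision = digit_count(integer_part) + scale
    if scale > 0:
        # Assumption: the Avro writer supports the decimal logical type.
        return {"type": "bytes", "logicalType": "decimal",
                "precision": precision, "scale": scale}
    # Up to 9 decimal digits always fit in a 32-bit Avro int.
    return "int" if precision <= 9 else "long"


def build_schema(record_name, fields, namespace="example.vsam"):
    """Build one Avro record schema (as a dict) for one copybook record type."""
    return {
        "type": "record",
        "name": avro_name(record_name),
        "namespace": namespace,
        "fields": [{"name": avro_name(name), "type": avro_type(pic)}
                   for name, pic in fields],
    }


# Hypothetical record layout, for demonstration only.
sample_fields = [("CUST-ID", "9(9)"), ("CUST-NAME", "X(30)"), ("BALANCE", "S9(7)V99")]
print(json.dumps(build_schema("CUSTOMER-REC", sample_fields), indent=2))
```

Since the source files contain several record types, build_schema would be called once per record type, producing one schema each. The JSON string from json.dumps is what would be written to sldb and handed to the child pipeline as described in the answer above. The sketch does not handle OCCURS, REDEFINES, or nested groups, which would need arrays, unions, and nested records respectively.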
