I would like some recommendations on possible solutions to the requirement below.
I have created a generic pipeline that reads 800 different mainframe VSAM binary files in different formats, parses them using their associated copybooks, and validates every data field against a specified regular expression. If a file passes validation, I would like to create a target Avro file.
The Avro formatter requires a predefined Avro schema. Because my pipelines are generic and can work with any VSAM file as long as the file location in S3 and the associated copybook are provided, I would like to know whether there is any way to generate the Avro schema dynamically, based either on the copybook structure or on the output of the copybook parser.
Please note that the source files contain different record types, and each record type has a different structure.
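
To make the idea concrete, here is a minimal sketch of what deriving a schema from the copybook structure might look like. It assumes the copybook parser can expose, for each record type, a list of (field name, PIC clause) pairs. That input shape, the helper names (`avro_name`, `avro_type`, `record_schema`, `file_schema`), the sample layouts, and the PIC-to-Avro type mapping are all hypothetical and not taken from any particular parser or tool. Usage clauses such as COMP-3, OCCURS, and REDEFINES are not handled here.

```python
import json
import re


def avro_name(cobol_name):
    """Turn a COBOL name such as CUST-ID into a valid Avro name."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", cobol_name)
    return "_" + name if name[0].isdigit() else name


def digit_count(pic_part):
    """Count digit positions in a PIC fragment, e.g. '9(4)' or '999'."""
    total = 0
    for m in re.finditer(r"9(?:\((\d+)\))?", pic_part):
        total += int(m.group(1)) if m.group(1) else 1
    return total


def avro_type(pic):
    """Map a simplified PIC clause to an Avro type (assumed mapping)."""
    pic = pic.upper().replace(" ", "")
    if pic[0] in "XA":
        return "string"
    # Split the numeric picture at the implied decimal point (V).
    int_part, _, frac_part = pic.lstrip("S").partition("V")
    scale = digit_count(frac_part)
    precision = digit_count(int_part) + scale
    if scale == 0 and precision <= 9:
        return "int"
    if scale == 0 and precision <= 18:
        return "long"
    # Avro's decimal logical type keeps the exact precision and scale.
    return {"type": "bytes", "logicalType": "decimal",
            "precision": precision, "scale": scale}


def record_schema(record_name, fields):
    """Build one Avro record schema from (name, PIC) pairs."""
    return {
        "type": "record",
        "name": avro_name(record_name),
        "fields": [{"name": avro_name(n), "type": avro_type(p)}
                   for n, p in fields],
    }


def file_schema(record_layouts):
    """Combine all record types of a file into a top-level Avro union."""
    return [record_schema(name, fields)
            for name, fields in record_layouts.items()]


# Hypothetical layouts, standing in for the copybook parser's output.
layouts = {
    "HEADER-REC": [("REC-TYPE", "X(2)"), ("FILE-DATE", "9(8)")],
    "DETAIL-REC": [("REC-TYPE", "X(2)"), ("CUST-ID", "9(12)"),
                   ("AMOUNT", "S9(7)V99")],
}
schema = file_schema(layouts)
print(json.dumps(schema, indent=2))
```

A top-level union is one way to represent a file with several record types in a single Avro schema. Writing one Avro file per record type, each with its own record schema, would be an alternative.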
I would appreciate any directions.