10-01-2019 02:36 PM
I have a pipeline that reads from Oracle and writes to PGSQL. When inserting to PGSQL, I get this error
Batch entry 0 insert into "pg_temp"."crmi002b_ut_pprs001" ("end_date", "opportunity_type", "sales_person", "opportunity_status_type", "sr_partition_id", "program_id", "promotion_id", "contact_id", "created_by", "last_updated_by", "sr_checksum", "opportunity_id", "sr_modification_time", "entity_code", "created_on", "last_updated_on", "sr_creation_time", "start_date") values (cast(NULL as varchar), 'LOV_OT_INTERACTION', 'C_NACAR', 'LOV_OS_INTERACTION', '22'::numeric, cast(NULL as varchar), cast(NULL as varchar), '151000000'::numeric, 'C_NACAR', 'C_NACAR', 'f1503e2c1ad94f56d9deef140da28ead', '122000000'::numeric, '1569965343729'::numeric, cast(NULL as varchar), cast('2010-12-11 10:51:24-06' as timestamp), cast('2010-12-11 10:51:24-06' as timestamp), '1569965343729'::numeric, cast(NULL as varchar)) was aborted: ERROR: column "end_date" is of type timestamp without time zone but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 364 Call getNextException to see other errors in the batch., error code: 0, SQL state: 42804
I am unsure why the snap is trying to cast NULLs as varchar for date and int fields. My assumption is that it never sees a first record with data in those fields, so it cannot infer the correct type the way it does for the fields that are populated.
Is there a way to bypass this casting, or to have this fixed? This is not the only pipeline that runs from Oracle to PGSQL, and it is not the first one with null values in non-text fields. I find it hard to believe that in every other pipeline like this one the first record always has data in all fields.
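For what it's worth, the hint does point at what a working statement would look like: the NULL needs to be cast to the column's declared type rather than varchar. A minimal sketch against the same table (other columns elided):

    -- casting NULL to the column's declared type avoids the 42804 mismatch;
    -- in PostgreSQL, "timestamp" is shorthand for "timestamp without time zone"
    insert into "pg_temp"."crmi002b_ut_pprs001" ("end_date")
    values (cast(NULL as timestamp));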
10-01-2019 02:53 PM
Unsure if this is the reason, but we switched from pg_temp to a regular schema and it works fine. This is the first time we are trying to use pg_temp, and it does not work with the snap. Is the snap trying to get the table definitions through a different connection? If so, it would not see the table in the pg_temp case, since temporary tables are only visible to the connection that created them.
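To illustrate why a second connection would not see it (pg_temp is session-local in PostgreSQL; just a sketch):

    -- connection 1
    create temporary table crmi002b_ut_pprs001 (end_date timestamp);
    select count(*) from pg_temp.crmi002b_ut_pprs001;  -- works on this connection

    -- connection 2, against the same database
    select count(*) from pg_temp.crmi002b_ut_pprs001;
    -- ERROR: relation "pg_temp.crmi002b_ut_pprs001" does not exist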
10-02-2019 09:08 AM
It’s… complicated. You should generally assume that different snaps use different connections. In rare situations it may look like a connection is passed from one snap to the next, but there should be a reset between snaps; e.g., with PostgreSQL we reset the default schemas.
Within an individual snap it’s… complicated. There’s usually a single connection used during both validation and execution. However, if the snap is at the end of a pipeline with a long delay between launch and the first documents reaching the snap, it’s possible that validation and execution will use different connections. It’s theoretically possible for a connection to be changed during execution, but I don’t think that happens in practice. We shouldn’t do anything during validation that’s required during execution, but something could have slipped through.
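To sketch what such a reset can look like on the PostgreSQL side (illustrative only; these are not necessarily the exact statements we issue):

    -- restore the session's default schema search order
    set search_path to "$user", public;

    -- a heavier reset; note this also drops the session's temporary tables,
    -- which is exactly what would break a pg_temp-based pipeline
    discard all;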
Could you write up two tickets: one with enough detail about this pipeline for us to try to duplicate it, and a more general one asking for pg_temp support? We can check a few things in the current snaps, but full support for pg_temp (e.g., guaranteeing the same connection in all snaps) will require a lot of research into its feasibility.
10-04-2019 02:03 PM
@cesar.bolanos did you try defining the column as TIMESTAMP without time zone?
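Something like this, reusing the column names from the error above (a sketch; the other columns are elided):

    create temporary table crmi002b_ut_pprs001 (
        end_date   timestamp without time zone,
        start_date timestamp without time zone
        -- ...remaining columns...
    );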
10-07-2019 09:14 AM
This also happens with int fields.
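For example, the same SQL state 42804 shows up for a plain integer column (a hypothetical table, just to show the shape of the failure):

    create temporary table t (n integer);
    insert into t (n) values (cast(NULL as varchar));
    -- ERROR: column "n" is of type integer
    -- but expression is of type character varying
    -- SQL state: 42804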