10-26-2021 02:39 PM
Hi All,
I am trying to load a new Redshift table which I have created. During validation, I am getting:
"error_code": 1207
"error_message": "Invalid digit, Value '.', Pos 9, Type: Integer "
"column_value": "595546349.04 "
I have tried altering the column data type of this Redshift table to DOUBLE PRECISION and REAL, but I still get the above error. I also applied parseFloat() to this field before the Redshift snap; that did not work either, same error. Does anyone know how to address this? Please let me know.
Regards,
Amar
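(A quick stand-alone Python check, an illustration only and not part of any snap, confirms the value itself is fine once trimmed: it parses as a float, but can never parse as an integer because of the decimal point.)

```python
# Illustration only (plain Python, not SnapLogic): the failing value
# parses fine as a float once the trailing space is stripped, but it
# can never parse directly as an integer because of the '.'.
raw = "595546349.04 "

as_float = float(raw.strip())   # float() tolerates the decimal point
print(as_float)                 # 595546349.04

try:
    int(raw.strip())            # int() rejects the '.' outright
except ValueError as e:
    print("integer parse failed:", e)
```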
10-27-2021 02:23 PM
You still have this issue, but now I can't reproduce the problem against a real or double field! I may have misspoken, and I have since torn down the pipeline that I was sure was getting a 1207 validation error when inserting a number with a trailing space into a real field. Let's back up.
I've attached a pipeline with just two snaps, a JSON Generator and a Redshift BulkLoad.
Here's the payload for the generator; notice the trailing space in the number value.
[
{
"id": 1,
"bigdecimal": "595546349.04 "
}
]
Let's create the Redshift table to bulk load into. Note that the number column's data type is integer, so we expect problems.
CREATE TABLE t1207 (
id int,
bigdecimal integer
);
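(As an aside, the error we're about to see can be predicted: Redshift's integer parser rejects the first non-digit character and reports its zero-based position. This small Python sketch is my own illustration of that kind of check, not Redshift's actual code.)

```python
# Illustration only: mimic the kind of check behind error 1207.
# An integer parser rejects the first character that isn't a digit
# (after an optional sign) and reports its zero-based position.
def first_bad_digit(value: str):
    digits = value[1:] if value[:1] in ("+", "-") else value
    offset = len(value) - len(digits)
    for i, ch in enumerate(digits):
        if not ch.isdigit():
            return i + offset, ch
    return None  # the whole string is a valid integer

pos, ch = first_bad_digit("595546349.04")
print(f"Invalid digit, Value '{ch}', Pos {pos}")  # Pos 9, the '.'
```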
Sure enough, the BulkLoad snap reports this error. We expect this.
[
{
"error": "Invalid digit, Value '.', Pos 9, Type: Integer",
"status_code": "error",
"line_number": 1,
"column_name": "bigdecimal",
"error_code": 1207,
"column_value": "595546349.04",
...
}
]
The type in the error message, Integer, indicates the type of the field we're trying to insert into. OK, let's drop the table and re-create it with a real data type.
CREATE TABLE t1207 (
id int,
bigdecimal real
);
Re-running the pipeline succeeds, without having to trim the number value. That actually fits your report: you still had the problem after trimming, which indicates the problem is something else.
[
{
"id": 1,
"bigdecimal": 5.9554637E+8
}
]
However, we have lost some precision, since this is a big number. So let's drop the table and re-create it as double precision, which is probably what we want.
CREATE TABLE t1207 (
id int,
bigdecimal double precision
);
Pipeline succeeds and full precision is preserved.
[
{
"id": 1,
"bigdecimal": 595546349.04
}
]
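(The precision difference is just 32-bit versus 64-bit floating point. A quick Python check, forcing the value through each width with the struct module, shows why real drops the cents while double precision keeps them.)

```python
import struct

value = 595546349.04

# Round-trip through a 32-bit float, the width a Redshift REAL column uses.
as_real = struct.unpack("<f", struct.pack("<f", value))[0]
print(as_real)    # 595546368.0, the fractional part (and more) is gone

# Round-trip through a 64-bit float, the width DOUBLE PRECISION uses.
as_double = struct.unpack("<d", struct.pack("<d", value))[0]
print(as_double)  # 595546349.04, full precision preserved
```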
I tried changing JDBC drivers from Postgres to Redshift and got the same result with either. I thought maybe the Redshift driver was handling the space trimming, but apparently the trailing space isn't a problem with either driver.
As I mentioned, BulkLoad first writes the data to a CSV file, zips it, and then stores it in S3 for Redshift to load from. My Redshift account and BulkLoad snap don't have any special properties defined.
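(That staging step can be sketched in a few lines of plain Python. This is an illustration only; the real snap also handles the S3 upload and the Redshift COPY, and the bucket path in the comment is a placeholder.)

```python
import csv
import gzip
import io

# Sketch of the BulkLoad staging step: rows -> CSV -> gzip.
# The real snap then uploads the archive to S3 (e.g. s3://<bucket>/...)
# and issues a Redshift COPY against it.
rows = [{"id": 1, "bigdecimal": "595546349.04 "}]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "bigdecimal"])
writer.writeheader()
writer.writerows(rows)

compressed = gzip.compress(buf.getvalue().encode("utf-8"))
print(len(compressed), "bytes staged")
```

Note that the trailing space in the value survives into the staged CSV, which is why Redshift, not the pipeline, is the one that ultimately parses it.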
Maybe you can share a bit more about your configuration, settings, or anything else that looks suspect.
10-27-2021 02:41 PM
Thanks Matthew. I dropped the table and tried REAL, DOUBLE PRECISION, and even VARCHAR(100). I am still getting the same error. Here is the full error message from the bulk load error view.
"reason":
"Error occurred at line 1935, for column name annual_revenue , column value 105986489.63 , row value 105986489.63.0011B000028GUpJ.Solita Oy.324.USD.12/5/2019.10/30/2019.6/1/2020 4:48:00 AM.11/14/2019.15.Tier3.5/7/2020.10/30/2019.10/30/2019.1/7/2020 2:24:00 PM.Time Tracking.12/10/2018.11/30/2018.4/9/2020.825…28.6.9/1/2021.Q3-2020.Closed…Yes.Professional Services.Outbound Prospecting (CSD).TimeTracking.0061B00001bB6VEQA0.Solita Platform.Ville Kivinummi.Nordics ME Team.4/9/2020.11/14/2019.5/4/2020 4:48:00 AM…True.NEZA.Nordics.TimeTrackingTier3RoleLookup.TimeTrackingTier3RoleMatch.1.Emerging.0.Tier3.7-Post Prod Support.674.9- Closed/Won.TimeTrackingTier3StageLookup.TimeTrackingTier3StageMatch.1.Customer Base Nordics.Closed.2/27/2020.5/7/2020.LDPTimeTracking , with error code 1207 and error_message Invalid digit, Value '.', Pos 9, Type: Integer ."
"resolution":
"Please ensure all required values are set and the data types/values match the columns defined on the table."
"status_code":
"error"
"line_number":
1935
"column_name":
"annual_revenue "
"error_code":
1207
"error_message":
"Invalid digit, Value '.', Pos 9, Type: Integer "
"column_value":
"105986489.63 "
"row_value":
"105986489.63.0011B000028GUpJ.Solita Oy.324.USD.12/5/2019.10/30/2019.6/1/2020 4:48:00 AM.11/14/2019.15.Tier3.5/7/2020.10/30/2019.10/30/2019.1/7/2020 2:24:00 PM.Time Tracking.12/10/2018.11/30/2018.4/9/2020.825…28.6.9/1/2021.Q3-2020.Closed…Yes.Professional Services.Outbound Prospecting (CSD).TimeTracking.0061B00001bB6VEQA0.Solita Platform.Ville Kivinummi.Nordics ME Team.4/9/2020.11/14/2019.5/4/2020 4:48:00 AM…True.NEZA.Nordics.TimeTrackingTier3RoleLookup.TimeTrackingTier3RoleMatch.1.Emerging.0.Tier3.7-Post Prod Support.674.9- Closed/Won.TimeTrackingTier3StageLookup.TimeTrackingTier3StageMatch.1.Customer Base Nordics.Closed.2/27/2020.5/7/2020.LDPTimeTracking "
If you spot anything, please help.
Regards,
Amar.
10-27-2021 02:45 PM
What is the data type of annual_revenue? The error suggests it is an integer, not a decimal type. Please verify for me.
10-27-2021 03:05 PM
The upstream Mapper input has a mix of numbers, for example 47577 and 595546349.04. I have 3 Mapper snaps followed by the bulk load. Before these 3 Mappers I have a CSV Parser snap, as I am reading a Google Sheet. In the 1st and 2nd Mappers the input schema is string and the target schema is any; in the 3rd Mapper the input schema is string and the target schema is integer. I have removed all the parseFloat() and trim() expressions from my pipeline and have double precision in the create table. I am still getting the same error during validation, with the same message I shared before.
Not sure why!
Regards,
Amar.
10-27-2021 03:10 PM
The error you are receiving is from the Redshift database trying to insert a decimal into an integer field. Can you tell me the data type of the annual_revenue field? It seems to be defined as an integer, yet we're trying to store a decimal number into the field. All the snap data transformations are relevant, but what really matters is the final value getting inserted into the target database, and it seems there is a mismatch there.
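(Editor's aside: one corroborating detail, shown with a quick Python check. Both the repro value and Amar's annual_revenue value have their decimal point at zero-based index 9, which is exactly the "Pos 9" both 1207 errors report. An integer parser is doing the rejecting in both cases, so given the 3rd Mapper's integer target schema mentioned above, that coercion is a plausible culprit.)

```python
# Illustration only: both failing values put their '.' at zero-based
# index 9, matching "Pos 9" in both 1207 error messages. Whatever the
# upstream mappers do, some integer-typed target or schema is still
# parsing these decimal strings.
for value in ("595546349.04", "105986489.63"):
    print(value, "-> '.' at index", value.index("."))
```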