Forum Discussion

amardeep2021
New Contributor III
4 years ago

Getting "error_code": 1207 while validating before loading Redshift database

Hi All,

I am trying to load a new Redshift table that I have created. During validation, I am getting:
"error_code": 1207
"error_message": "Invalid digit, Value '.', Pos 9, Type: Integer "
"column_value": "595546349.04 "

I have tried altering the column data type of this Redshift table to DOUBLE PRECISION and REAL, but I still get the above error. I also applied parseFloat() to this field before loading into Redshift; that did not work either and gives the same error. Does anyone know how to address this? Please let me know.

Regards,
Amar

17 Replies

  • @amardeep2021

    A Redshift 1207 load error indicates the data contains a character outside the expected range of 0-9, i.e., a non-digit in a value being loaded into a numeric column.

    You will receive this error if you try to write this value to an INTEGER field in the database, which appears to be the case here (perhaps the column type was changed during your experiments to get this working). Even if you change the database field to REAL or DOUBLE PRECISION, you'll still get a 1207 with your value as posted, because it appears to have a trailing space. After removing the extra space character, I'm able to insert the value into a REAL or DOUBLE PRECISION field.
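    For illustration, here is a minimal sketch of the mismatch the error describes; the table and column names below are hypothetical, and only the column's data type matters.

    -- Hypothetical table/column names, for illustration only.
    -- An INTEGER column rejects '595546349.04' with a 1207 error at the '.';
    -- a floating-point column type accepts the decimal.
    CREATE TABLE revenue_stage (
        amount DOUBLE PRECISION   -- not INTEGER
    );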

  • amardeep2021
    New Contributor III

    Thanks for the response, Matthew. Are you suggesting I use trim() before loading the data, with the table column data type as DOUBLE PRECISION or REAL? Kindly confirm.

    Regards,
    Amar.

  • amardeep2021
    New Contributor III

    Hi Matthew,

    Thanks for the response. I tried with trim() and am still getting the same issue. In the input Mapper snap preview for this Redshift BulkLoad, I see the value "595546349.04" without spaces. But when I check the error view of the Redshift BulkLoad snap, I see "column_value": "595546349.04 ".
    I don't understand how the space gets appended at the end. Do you know any other way to address this issue? Please let me know.

    Regards,
    Amar.

    • mbowen
      Employee

      Hi @amardeep2021

      You still have this issue, but I now can't reproduce the problem against a REAL or DOUBLE field! I may have misspoken, and I have since torn down the pipeline that I was sure was getting a 1207 validation error when inserting a number with a trailing space into a REAL field. Let's back up.

      I've attached a pipeline with just two snaps: a JSON Generator and a Redshift BulkLoad.

      Here's the payload for the generator; notice the trailing space in the number value.

      [
          {
              "id": 1,
              "bigdecimal": "595546349.04 "
          }
      ]
      

      Let's create the Redshift table to bulk load into. Note that the number column's data type is integer, so we expect problems.

      CREATE TABLE t1207 (
          id         int,
          bigdecimal integer
      );
      

      Sure enough, the BulkLoad snap reports this error, which is what we expect.

      [
        {
          "error": "Invalid digit, Value '.', Pos 9, Type: Integer",
          "status_code": "error",
          "line_number": 1,
          "column_name": "bigdecimal",
          "error_code": 1207,
          "column_value": "595546349.04",
          ...
        }
      ]
      

      The type in the error message, Integer, indicates the type of the field we're trying to insert into. OK, let's drop the table and re-create it with a real data type.
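      The drop step between each of these re-creates isn't shown; it's just the standard statement:

      DROP TABLE t1207;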

      CREATE TABLE t1207 (
          id         int,
          bigdecimal real
      );
      

      Re-running the pipeline succeeds without having to trim the number value. That actually makes more sense, since you were still having the problem after trimming, which indicates the problem is something else.

      [
        {
          "id": 1,
          "bigdecimal": 5.9554637E+8
        }
      ]
      

      However, we have lost some precision, and this is a big number. So let's drop the table and re-create it as double precision, which is probably what we want.

      CREATE TABLE t1207 (
          id         int,
          bigdecimal double precision
      );
      

      Pipeline succeeds and full precision is preserved.

      [
        {
          "id": 1,
          "bigdecimal": 595546349.04
        }
      ]
      

      I tried switching the JDBC driver from Postgres to Redshift and got the same result with either. I thought maybe the Redshift driver was handling the space trimming, but apparently the trailing space isn't a problem with either driver.

      As I mentioned, BulkLoad first writes data to a CSV file, zips it, and then stores this in S3 for Redshift to source from. My Redshift account and BulkLoad snap don’t have any special properties defined.
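      For context, that staged file is presumably loaded with a Redshift COPY. A rough sketch of such a COPY is below; the S3 path and IAM role are placeholders, and the exact options the snap passes may differ.

      -- Sketch only: the S3 path and IAM role are placeholders, and the
      -- options SnapLogic actually passes may differ.
      COPY t1207
      FROM 's3://your-staging-bucket/path/to/file.csv.gz'
      IAM_ROLE 'arn:aws:iam::123456789012:role/your-redshift-role'
      CSV
      GZIP;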

      Maybe you can share a bit more about your configuration, settings, or anything else that looks suspect.

      redshift-bulkload-1207-error_2021_10_27.slp (5.6 KB)

  • amardeep2021
    New Contributor III

    Thanks, Matthew. I tried dropping the table and re-creating it with REAL, DOUBLE PRECISION, and even VARCHAR(100). I am still getting the same error. Here is the full error message from the BulkLoad error view.

    "reason": "Error occurred at line 1935, for column name annual_revenue , column value 105986489.63 , row value 105986489.63.0011B000028GUpJ.Solita Oy.324.USD.12/5/2019.10/30/2019.6/1/2020 4:48:00 AM.11/14/2019.15.Tier3.5/7/2020.10/30/2019.10/30/2019.1/7/2020 2:24:00 PM.Time Tracking.12/10/2018.11/30/2018.4/9/2020.825…28.6.9/1/2021.Q3-2020.Closed…Yes.Professional Services.Outbound Prospecting (CSD).TimeTracking.0061B00001bB6VEQA0.Solita Platform.Ville Kivinummi.Nordics ME Team.4/9/2020.11/14/2019.5/4/2020 4:48:00 AM…True.NEZA.Nordics.TimeTrackingTier3RoleLookup.TimeTrackingTier3RoleMatch.1.Emerging.0.Tier3.7-Post Prod Support.674.9- Closed/Won.TimeTrackingTier3StageLookup.TimeTrackingTier3StageMatch.1.Customer Base Nordics.Closed.2/27/2020.5/7/2020.LDPTimeTracking , with error code 1207 and error_message Invalid digit, Value '.', Pos 9, Type: Integer ."

    "resolution": "Please ensure all required values are set and the data types/values match the columns defined on the table."

    "status_code": "error"

    "line_number": 1935

    "column_name": "annual_revenue "

    "error_code": 1207

    "error_message": "Invalid digit, Value '.', Pos 9, Type: Integer "

    "column_value": "105986489.63 "

    "row_value": "105986489.63.0011B000028GUpJ.Solita Oy.324.USD.12/5/2019.10/30/2019.6/1/2020 4:48:00 AM.11/14/2019.15.Tier3.5/7/2020.10/30/2019.10/30/2019.1/7/2020 2:24:00 PM.Time Tracking.12/10/2018.11/30/2018.4/9/2020.825…28.6.9/1/2021.Q3-2020.Closed…Yes.Professional Services.Outbound Prospecting (CSD).TimeTracking.0061B00001bB6VEQA0.Solita Platform.Ville Kivinummi.Nordics ME Team.4/9/2020.11/14/2019.5/4/2020 4:48:00 AM…True.NEZA.Nordics.TimeTrackingTier3RoleLookup.TimeTrackingTier3RoleMatch.1.Emerging.0.Tier3.7-Post Prod Support.674.9- Closed/Won.TimeTrackingTier3StageLookup.TimeTrackingTier3StageMatch.1.Customer Base Nordics.Closed.2/27/2020.5/7/2020.LDPTimeTracking "

    If you spot anything, please help.

    Regards,
    Amar.

    • mbowen
      Employee

      What is the data type of annual_revenue? The error suggests the column is defined as an integer, not a decimal type. Please verify for me.

  • amardeep2021
    New Contributor III

    The incoming data upstream of the Mappers has a combination of numbers, for example 47577 and 595546349.04. I have 3 Mapper snaps followed by the bulk load. Before these 3 Mappers I have a CSV Parser snap, as I am reading a Google Sheet. In the 1st and 2nd Mappers the input schema is string and the target schema is any; in the 3rd Mapper the input schema is string and the target schema is integer. I have removed all the parseFloat() and trim() expressions from my pipeline and am using double precision in the CREATE TABLE. I am still getting the same error during validation, with the same message shared before.

    Not sure why!

    Regards,
    Amar.

    • mbowen
      Employee

      @amardeep2021

      The error you are receiving is from the Redshift database trying to insert a decimal into an integer field. Can you tell me the data type of the annual_revenue column? It seems to be defined as an integer, but we're trying to store a decimal number into it. All of the snap data transformations, etc., are relevant, but what really matters is the final value getting inserted into the target database and the type of the column it lands in, and it seems there is a mismatch here.

      • mbowen
        Employee

        Here is some SQL to emit the schema for a table. I'm sure there are other ways to do this in Redshift.

        SELECT "tablename", "column", "type"
          FROM PG_TABLE_DEF
         WHERE schemaname = 'public'
           AND tablename = 't1207'
        
        -> 
        TABLENAME   COLUMN      TYPE
        t1207       id          integer
        t1207       bigdecimal  double precision
        
  • amardeep2021
    New Contributor III

    From what I can see, the source is the Google Sheet and it shows the values like this. Let me try with BIGINT and get back to you soon.

    • mbowen
      Employee

      Is annual_revenue an integer or a decimal? BIGINT will likely raise the same 1207 error when trying to write a decimal value. If it should be an integer, then we should modify the pipeline to write this value as an integer.

  • amardeep2021
    New Contributor III

    Hi Matthew,

    I checked the possible values for this field; a sample is below.
    6,515,871,900,000
    6,515,871,900,000
    6,515,871,900,000
    6,515,871,900,000
    6,515,871,900,000
    6,515,871,900,000
    256,800,000,000
    256,800,000,000
    256,800,000,000
    87,351,502,445.84
    76,476,000,000
    64,290,705,800.14
    0.00
    0.00
    Null
    Null

    Do you have any idea which data type will suit the table creation? Please help.

    Regards,
    Amar.

    • mbowen
      Employee

      @amardeep2021

      DOUBLE PRECISION if we care about the decimals. I see only two values with decimal parts (.84 and .14). If those can be stored as integers, then BIGINT. If we go with BIGINT, we need to handle the decimal values in the pipeline by either rounding or truncating them.

      Oh, I also notice 0.00, and the .00 may be significant. DOUBLE PRECISION would be able to store all of these representations.
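      Concretely, here is a sketch of the DDL for that column; the table name is hypothetical and only the annual_revenue column is shown.

      -- Hypothetical table name; only the relevant column is shown.
      -- DOUBLE PRECISION accepts 6515871900000, 87351502445.84, 0.00 and NULL.
      CREATE TABLE opportunities (
          annual_revenue DOUBLE PRECISION
      );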

  • amardeep2021
    New Contributor III

    Hi Matthew,

    I already tried double precision and it gave the error again. The field has integers, decimals, 0.00, and Null. Will double precision work when creating the table? Kindly help.

    Regards,
    Amardeep Shanmugam.

    • mbowen
      Employee

      Hi @amardeep2021

      I'm able to insert all of these values into a DOUBLE PRECISION field. When you say you tried double precision and it gave you an error, did you alter the data type of the database column to DOUBLE PRECISION? That's what I thought you meant earlier, but now I'm not sure we are referring to the same thing. The 1207 error you showed earlier was pretty specific about the reason for failure, reporting the value and the data type of the field it was trying to insert into.

      CREATE TABLE t1207 (
          id         int,
          bigdecimal double precision
      );
      

      JSON payload.

      [
          {
              "id": 1,
              "bigdecimal": "256800000000"
          },
          {
              "id": 2,
              "bigdecimal": 0.00
          },
          {
              "id": 3,
              "bigdecimal": "64290705800.14"
          },
          {
              "id": 4,
              "bigdecimal": "6515871900000"
          },
          {
              "id": 5,
              "bigdecimal": "76476000000"
          },
          {
              "id": 6,
              "bigdecimal": "87351502445.84"
          },
          {
              "id": 7,
              "bigdecimal": null 
          }
      ]
      

      Be aware that you can also create a support ticket for help resolving your issue.

      That said, I have no problem continuing to try to help here. I think you’re close.