Convert Base64 to HEX

I am working on a Snaplogic pipeline to copy data from Oracle to Snowflake. A few tables at the source have BLOB columns which need to be copied. Snaplogic sees them as “Base64 encoded data” but Snowflake needs the data in HEX format. I would like to know if it’s possible to convert the data as required and load it into the Snowflake tables. I’ve attached a sample pipeline.
TEST_2021_06_18.slp (7.1 KB)

Hi @ADORAISWAMY ,

There is an irregular mapping in your sample pipeline:

This way you are mapping a target path. If you are trying to encode the CONTENT in Base64 format, you should set the mapping like this:


Hi Bojan,
Thank you for responding. Since I am converting from Base64 to HEX, I need to use the “base64.decode” function. But that doesn’t seem to work. I have attached the error.

Appreciate your help.

Bojan,

I am not sure what Snaplogic is trying to tell me here. On one hand, the output view from the Oracle SELECT shows the data as “Base64 encoded data”…

(Screenshot: Oracle SELECT preview output, 2021-06-21)

But again, when I try to decode the data in the Mapper snap, it says the data is not valid Base64 data, which is contradictory…

Can you please advise a course of action? Thanks.

I’m afraid the SnapLogic UI is rather misleading here. When you see “Base64 encoded data” for a field’s content in the preview output, it really means that the field contains binary data (an array of bytes). If it were really Base64 encoded data, you would just see a string containing the Base64 encoding of the binary data. I think this is because the UI is reflecting the representation of the preview data it’s getting from the control plane, rather than the actual raw data being passed between snaps.

So, the $CONTENT field from Oracle is the binary data itself – a byte array. How were you planning on hex-encoding that? I don’t believe that our expression language provides a hex encoding function similar to the Base64.encode() function it provides.
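
That said, if the end goal is just to land those bytes in a Snowflake BINARY column, the conversion doesn’t necessarily have to happen in the expression language at all: Snowflake can decode a Base64 string itself. A rough sketch (the literal below is just a stand-in for an encoded value):

```sql
-- Sketch only: Snowflake's binary conversions accept an explicit format,
-- so a Base64 string can be decoded (and re-rendered as HEX) server-side.
SELECT TO_BINARY('SGVsbG8=', 'BASE64')                 AS decoded_bytes,   -- the raw bytes
       TO_CHAR(TO_BINARY('SGVsbG8=', 'BASE64'), 'HEX') AS hex_rendering;   -- same bytes shown as HEX
```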


Thanks for clearing that up, Peter. From your experience, how does Snaplogic handle copying binary data into Snowflake from any kind of source RDBMS - Oracle BLOB specifically? If Snaplogic can’t handle binary data, then I have to find a way to run Snowflake commands from Snaplogic, or ditch Snaplogic completely and go another route. Snowflake has commands to encode/decode data, but I am not sure how to run those commands from the Snaplogic GUI. There isn’t enough documentation on the internet on either Snaplogic or Snowflake for me to make a clear determination.

If you can suggest some workarounds/pointers, that will be much appreciated.

Anand

(minor correction: my name is Patrick, not Peter)

I’m afraid I’m completely unfamiliar with our Snowflake snaps, so I’ll ask one of our devs who’s familiar with Snowflake to respond.

BTW, which Snowflake snap(s) are you trying to use? Insert? Bulk Load? other?

Sorry for misstating your name, Patrick. I am using the “SNOWFLAKE - BULK UPSERT” snap.

No worries. The doc for that snap suggests you can use the File format option property for this. I think you’d set it to BINARY_FORMAT=BASE64. You would need to ensure your binary content is Base64 encoded using Base64.encode($CONTENT), etc.


@ADORAISWAMY: the File format option in our snap can help you here. If the incoming data is in HEX format, then use the file format option `BINARY_FORMAT=HEX`.

BINARY_FORMAT = XXX (where XXX = HEX | BASE64 | UTF8)
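
If it helps to picture what that option does: the snap stages the incoming documents and presumably passes the file format option straight through to the COPY INTO it runs against the staged file, so the effect is roughly the following (the stage path and table name are placeholders):

```sql
-- Illustrative only: stage path and table name are made up.
COPY INTO workflow.target_table
FROM @~/INPUTVIEW/staged_data.tmp.gz
FILE_FORMAT = (TYPE = CSV BINARY_FORMAT = BASE64);
```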



@ptaylor @svatada

That did the trick. I see the data in Snowflake, and a simple compare shows the data is the same in source and target. Thank you both for the help. We can consider the issue closed. Appreciated.

Hi @svatada @ptaylor

I have another issue. Please let me know if you can help. When trying to load the BLOB data to Snowflake (through Snaplogic), some of the data is larger than the max limit of 16 MB. The column for the BLOB data is created as “VARBINARY” on the Snowflake side. Do you know how we can store data > 16 MB? Is there an option to compress the data before loading it into Snowflake?

```
Error creating or writing input data to the temp table "WORKFLOW".MERGE_TEMP_TABLE_cf24af3e_3eaf_440e_a34e_55b9d8b6c635
Resolution:
Please file a defect against the snap
Reason:
Max LOB size (16777216) exceeded, actual size of parsed column is 69322752
  File '@~/INPUTVIEW/ae9f011a-ae4e-4895-aed1-5f0693e4e974/snowflakebuklload_data_af7638ff-1e5e-4147-84da-dcac60fe63b3.tmp.gz', line 1, character 69322762
  Row 1, column "MERGE_TEMP_TABLE_CF24AF3E_3EAF_440E_A34E_55B9D8B6C635"["$2":2]
  If you would like to continue loading when an error is encountered, use other values such as 'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option. For more information on loading options, please run 'info loading_data' in a SQL client.
Snowflake - Bulk Upsert[5dc5175f62872f2f905a07fd_8fc8cf38-3a37-4fc1-b81e-d2dc72e6d006 -- 18e89a59-156f-459f-9cac-8b8bd30effac]
com.snaplogic.snap.api.SnapDataException: Error creating or writing input data to the temp table "WORKFLOW".MERGE_TEMP_TABLE_cf24af3e_3eaf_440e_a34e_55b9d8b6c635
```

Thanks for your help.

Sorry, I cannot; I know next to nothing about Snowflake. I’d try some Google searches and then Snowflake community forums.

Since you are already writing to a BINARY column, you should be able to first compress the data, then base64 encode it using Base64.encode(GZip.compress($CONTENT))
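
One caveat worth noting: Base64 output is roughly 4/3 the size of its input, so it may be worth checking up front which rows would still be over the limit even after encoding, before accounting for whatever gzip saves. A rough Oracle-side sketch (the table name is assumed; `content` being your BLOB column):

```sql
-- Rough check (table name is a placeholder): approximate Base64-encoded size per row,
-- compared against the 16777216-byte limit reported in the load error.
SELECT dbms_lob.getlength(content)               AS raw_bytes,
       CEIL(dbms_lob.getlength(content) / 3) * 4 AS approx_base64_bytes
FROM   my_blob_table
WHERE  CEIL(dbms_lob.getlength(content) / 3) * 4 > 16777216;
```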

Just FYI - based on Snowflake documentation, the BINARY/VARBINARY data type maximum length is 8MB, not 16MB. Hopefully the compression helps!

@koryknick

Hi Kory,

Thank you for responding. Somehow the compression doesn’t seem to work or doesn’t compress enough. I still get this error from Snaplogic/Snowflake:

```
Reason:
Snap errors: {ruuid=2fd41aae-8fe9-4d66-88ed-7ea0892d0fc3, label=Snowflake - Bulk Upsert, failure=Error creating or writing input data to the temp table "BWDOCS".MERGE_TEMP_TABLE_8b8d9379_6356_4a04_bd1f_8352839a952d, reason=Max LOB size (16777216) exceeded, actual size of parsed column is 111438132
  File '@~/INPUTVIEW/f39bd9bb-a877-4583-b9a0-5ea6cb9aa5f7/snowflakebuklload_data_31dd64d1-8f88-45f2-863e-68563a3eb9cb.tmp.gz', line 1, character 111438138
  Row 1, column "MERGE_TEMP_TABLE_8B8D9379_6356_4A04_BD1F_8352839A952D"["$2":2]
```

Note the “Max LOB size” displayed in the error above. That is how I got the impression that 16 MB is the max size. Maybe Snaplogic or Snowflake is displaying incorrect information.

Here is the mapper setting as you suggested:

Also, I tested Oracle’s internal compression utility (utl_compress) on a few rows and the compression ratio isn’t that impressive. Maybe because the binary data is already well compressed, or for some other reason. This is my test result:

```sql
select dbms_lob.getlength(content) orig_blob_size,
       dbms_lob.getlength(utl_compress.lz_compress(content)) compress_blob_size
from   <<table_name>>
where  dbms_lob.getlength(content) > (10*1024*1024)
and    rownum < 11
```

| ORIG_BLOB_SIZE | COMPRESS_BLOB_SIZE |
| --- | --- |
| 83685170 | 83554366 |
| 10541235 | 10379628 |
| 17928157 | 17849033 |
| 220556537 | 220265485 |
| 15064844 | 10929200 |
| 96704516 | 80316259 |
| 15255342 | 15241613 |
| 15322232 | 15302694 |
| 17142022 | 17096265 |
| 20777303 | 20667900 |

As you can see, the compressed size is almost the same as the original size.

Thanks
Anand

I should probably have asked if you know what type of data is stored in that BLOB structure. When you try to compress data that is already compressed (e.g. image/audio/video), the compression won’t gain much, and the result sometimes ends up a bit bigger. So in this case, it appears that the data is likely already compressed.

You may need to find another storage model for this data outside Snowflake if it doesn’t fit in the max size field. As @ptaylor recommended, you should probably consult the Snowflake community for suggestions there.

Thanks Kory. We are exploring various other options with Snowflake cust support.