How to split the string on the basis of length of bytes

New Contributor III

How can I split a UTF-8 string on the basis of byte length rather than character count?
For example, if the string is longer than 4000 bytes, split it into multiple columns, each no more than 4000 bytes.
So if string1 is about 10,000 bytes, the first 4000 bytes go in column1, the next 4000 bytes in column2, and so on.
If the string fits within 4000 bytes, it stays in a single column.


Former Employee

The iconv.encode() function can be used to encode a string into a byte array. For example, the following expression will convert the ‘msg’ property in a document into a UTF-8-encoded byte array:

iconv.encode($msg, 'UTF-8')
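
To illustrate why the encoding step matters, here is a minimal Python sketch. Python is only standing in for the expression language here (it is not SnapLogic syntax), and the sample string is made up. It shows that the character count and the UTF-8 byte count differ as soon as the text contains non-ASCII characters:

```python
# Illustrative Python analogue of encoding a string to UTF-8 bytes.
# The sample value is hypothetical; "\u00ef" is i-diaeresis, "\u00e9" is e-acute.
msg = "na\u00efve caf\u00e9"

bits = msg.encode("utf-8")  # bytes object, comparable to the byte array above

char_count = len(msg)   # number of characters
byte_count = len(bits)  # number of encoded bytes

print(char_count, byte_count)  # 10 12
```

Each of the two accented characters takes two bytes in UTF-8, which is why a 4000-character limit and a 4000-byte limit are not interchangeable.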

There are a few methods for accessing byte arrays in the expression language; unfortunately, they were accidentally left out of the documentation. The byte array type is modeled after the Uint8Array type in JavaScript and the available methods are:

You’ll only need the subarray() method to break up the byte array created by the encode() function.

Since the expression language doesn’t have loops, the sl.range() function will be needed to generate an array of starting indexes. We’ll then use the map() method to iterate over that array and call subarray() on the original UTF-8-encoded byte array. If the result of the above encode() is in the ‘$bits’ document property, we can generate an array that starts at zero and counts up to the length of the byte array in 4,000-byte increments with this expression:

sl.range(0, $bits.length, 4000)

For example, if the encoded array were 8,192 bytes long, the following array would be generated:

[0, 4000, 8000]
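
For comparison, Python's built-in range() uses the same start/stop/step convention, so the example can be checked with a short sketch. This assumes sl.range() behaves as shown in the output above; the variable names are only for illustration:

```python
# Python analogue of generating the starting index of each chunk.
length = 8192      # byte length from the example above
chunk_size = 4000  # maximum bytes per column

starts = list(range(0, length, chunk_size))

print(starts)  # [0, 4000, 8000]
```

The stop value is exclusive, so the last start index is always smaller than the length and the final chunk is never empty.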

With these pieces in hand, we can create an array of byte-arrays with this expression:

sl.range(0, $bits.length, 4000).map(start => $bits.subarray(start, start + 4000))

Note that subarray() should clamp the resulting array to the length of the array even when the given end index is larger than the size of the $bits array.
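
The clamping behavior can be sketched in Python, where slicing a bytes object past its end is clamped in the same way. This is an analogue rather than SnapLogic code, and the all-zero input is just a stand-in for a real encoded string:

```python
# Python analogue: slice a byte sequence into chunks of at most 4000 bytes.
bits = bytes(8192)  # 8,192 zero bytes standing in for the encoded string
chunk_size = 4000

chunks = [
    bits[start:start + chunk_size]  # the slice end is clamped to len(bits)
    for start in range(0, len(bits), chunk_size)
]
sizes = [len(chunk) for chunk in chunks]

print(sizes)  # [4000, 4000, 192]
```

The last chunk holds only the remaining 192 bytes, and joining the chunks back together reproduces the original byte sequence.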

The result of the previous expression is an array of byte-arrays. To turn that into an object, we’ll need to change our map() callback to create a key/value pair that can be fed into the extend() method, like so:

{}.extend(sl.range(0, $bits.length, 4000).map((start, index) => ['column' + index, $bits.subarray(start, start + 4000)]))
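
The same key/value construction can be sketched in Python (again an analogue, not SnapLogic syntax). One thing worth checking with real data: a fixed byte offset can land in the middle of a multi-byte UTF-8 character, in which case a chunk is not valid text by itself. The second function below is a hypothetical variation, not part of the expression above, that backs the cut point up to the nearest character boundary. Both function names are made up for this example.

```python
def split_bytes(bits: bytes, size: int = 4000) -> dict:
    """Direct analogue of the expression above: fixed-size byte chunks."""
    return {
        "column" + str(index): bits[start:start + size]
        for index, start in enumerate(range(0, len(bits), size))
    }


def split_utf8(text: str, max_bytes: int = 4000) -> dict:
    """Hypothetical variation: columns of at most max_bytes UTF-8 bytes,
    never cutting through a multi-byte character."""
    if max_bytes < 4:
        raise ValueError("max_bytes must be at least 4 (longest UTF-8 sequence)")
    bits = text.encode("utf-8")
    columns = {}
    start = 0
    while start < len(bits):
        end = min(start + max_bytes, len(bits))
        # Continuation bytes have the form 0b10xxxxxx. If the byte at the cut
        # point is one, move the cut back to the first byte of that character.
        while end < len(bits) and (bits[end] & 0xC0) == 0x80:
            end -= 1
        columns["column" + str(len(columns))] = bits[start:end].decode("utf-8")
        start = end
    return columns


# Small made-up sample: 5 characters, 9 UTF-8 bytes, split at 4 bytes.
sample = "a" + "\u00e9" * 4
raw_columns = split_bytes(sample.encode("utf-8"), 4)
text_columns = split_utf8(sample, 4)

print(raw_columns)   # three byte chunks; the first ends mid-character
print(text_columns)  # three text chunks, each valid on its own
```

With plain ASCII input the two functions produce the same boundaries; they only differ when a cut would fall inside a multi-byte character.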

New Contributor III


Thanks for the quick reply. However, there seems to be a problem. When I try to implement the solution, I get the following error:
"String type does not have a method named: subarray. "

Please suggest.

Can you provide the expression you are using?

Sounds like you didn’t call iconv.encode() to turn the string into a byte array.