Split content into an array by max length and a special character

Hello,
I’m having trouble building a specific scenario. I have a text (more than 10k characters) that I need to split into smaller chunks (max 4k characters each). But there is a catch: I can only split it where a special character sequence appears. Here is an example:

"First part of the text… (about 3.5k characters)
%%%
Second part of the text… (about 1k characters)
%%%
Third part of the text… (about 2.5k characters)
%%%
Fourth part of the text… (about 1k characters)
%%%
Fifth part of the text… (about 1k characters)
%%%
Sixth part of the text… (about 1k characters)
"
In this example the parts, separated by the “%%%” special character, group into 3 blocks of roughly 3k to 3.5k characters each. So what I want to get is an array with 3 elements:

  1. First part…
  2. Second part… Third part…
  3. Fourth part… Fifth part… Sixth part…

I tried using a variable and appending strings to it, but every time I ended up with only one value. I also tried initialising an array and appending a string value (one block of text) to it, but I couldn’t get that to work either.

Hello,

I split like this:
image

And end up with this:
image

Is this what you tried that didn’t work out for you?

Hi @Donald_Mitchell, thank you for your reply. Unfortunately I made a mistake in my description (I have already corrected it). The input data should look like this:
“First part of the text… (about 3.5k characters)
%%%
Second part of the text… (about 1k characters)
%%%
Third part of the text… (about 2.5k characters)
%%%
Fourth part of the text… (about 1k characters)
%%%
Fifth part of the text… (about 1k characters)
%%%
Sixth part of the text… (about 1k characters)”

And what I need to get is an array with 3 elements:

  1. First part…
  2. Second part… Third part…
  3. Fourth part… Fifth part… Sixth part…

Ok, in that case you can still split by the “%%%”.
Then, you can use the get() function on that array, getting each part by its position number.

image

image
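
For anyone who later moves this step into code, here is a rough Python equivalent of "split by %%% and pick a part by position". The sample text and variable names are made up for illustration. One thing to watch: Python lists count from 0, whereas, as far as I know, Make's get() counts array positions from 1.

```python
# Hypothetical sample input; the real text would come from the scenario.
text = "First part\n%%%\nSecond part\n%%%\nThird part"

# Split on the delimiter and trim the line breaks around each part.
parts = [p.strip() for p in text.split("%%%")]

# Python indexes from 0, so index 1 is the second part.
second = parts[1]
```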

The splitting part works in my scenario, but combining the pieces into an array is more complicated. The input data I provided is just an example; the input will be different every time the scenario runs, so I need a universal mechanism that checks the length of the text and appends it to the right array element.

I can imagine these steps:

  1. Split the text into an array (as you proposed)
  2. Get the first element, calculate its length, and append it to a temporary variable
  3. Get the second element, calculate its length, and append it if the temporary variable won’t become longer than 4000 characters; otherwise save the temporary variable to the array and start a new temporary variable with that element
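
If this ends up in code, the steps above could be sketched in Python roughly as follows. This is only a minimal sketch under a few assumptions: the function name `pack_segments`, the newline used to join parts, and the choice to keep an oversized single part whole are all mine, not something Make provides.

```python
def pack_segments(text, delimiter="%%%", max_len=4000, joiner="\n"):
    """Greedily merge delimiter-separated parts into chunks of at most max_len characters."""
    chunks = []
    current = ""  # plays the role of the "temporary variable"
    for segment in text.split(delimiter):
        segment = segment.strip()
        if not segment:
            continue
        # What the temporary variable would look like with this part appended.
        candidate = segment if not current else current + joiner + segment
        if len(candidate) <= max_len:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # Assumption: a single part longer than max_len is kept whole here
            # and would need a further split elsewhere.
            current = segment
    if current:
        chunks.append(current)
    return chunks


# Parts sized like the example in the question: 3.5k, 1k, 2.5k, 1k, 1k, 1k.
sample = "%%%".join(["a" * 3500, "b" * 1000, "c" * 2500,
                     "d" * 1000, "e" * 1000, "f" * 1000])
print([len(c) for c in pack_segments(sample)])  # [3500, 3501, 3002]
```

The joiner counts toward the limit, which is why the second chunk comes out at 3501 characters rather than 3500.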

You need to split text into chunks no larger than 4000 characters each?

Yes, no larger than 4000 characters.

Yeah, that’s tough, and probably a job better suited for code.

Here’s one way you could look at it.
Divide the total length of the text {{length(text)}} by max number of characters allowed in the chunk:
{{length(text) / 4000}}

Let’s say it’s 15,686 characters.
That’s 15,686 / 4000 = 3.9215 chunks, or 4 if we use the ceil() function on it.

Use a repeater with:
Initial value: 1
Repeats: {{ceil(length(text) / 4000)}} (4 times in our example)
Step: 1

Text Aggregator with:
Text: {{substring(text; (repeater_i - 1) * 4000; (repeater_i * 4000))}}
Row Separator: Other
Separator: %%%

This will essentially grab chunks of 4,000 characters at a time and separate them with “%%%”.
In this example it’s 4 groups: characters
0 - 4000,
4000 - 8000,
8000 - 12,000,
12,000 - 15,686

The result is text you can then split by “%%%” or whatever delimiter you choose.
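
For comparison, the same fixed-size idea in Python would look something like the sketch below. The function name `fixed_chunks` is just illustrative. The loop index starts at 0 here instead of 1 as in the repeater, and since a list is returned directly there is no need for a separator and a second split.

```python
import math


def fixed_chunks(text, size=4000):
    """Cut text into consecutive slices of at most `size` characters."""
    count = math.ceil(len(text) / size)  # e.g. ceil(15686 / 4000) = 4
    return [text[i * size:(i + 1) * size] for i in range(count)]


sample = "x" * 15686
print([len(c) for c in fixed_chunks(sample)])  # [4000, 4000, 4000, 3686]
```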

The only issue is that sometimes the split will occur in the middle of a word, so you’d have to figure out a way to grab fewer characters and end each chunk on a word boundary. This is where it might be better to use custom code, or a cloud code service such as 0codekit.
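
As a rough idea of what "grab fewer characters" could look like in code, here is a Python sketch that backs off to the last space or line break inside each window. The function name `chunks_on_whitespace` and the choice to leave the whitespace at the end of the left-hand chunk are assumptions for illustration; a word longer than the limit still gets a hard cut.

```python
def chunks_on_whitespace(text, size=4000):
    """Cut text into chunks of at most `size` characters, preferring whitespace boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        # Only back off if the cut would land inside a word.
        if end < len(text) and not text[end].isspace():
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1  # keep the whitespace with the left chunk
        chunks.append(text[start:end])
        start = end
    return chunks


print(chunks_on_whitespace("alpha beta gamma delta epsilon", size=12))
# ['alpha beta ', 'gamma delta ', 'epsilon']
```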

Thanks for the help, @Donald_Mitchell. I will try your proposal, but I will probably need to move that part into code because it will be easier to handle there.