Hello,
I have a problem creating a specific scenario. I have a text (more than 10k characters) that I need to split into smaller chunks (max 4k characters each). But there is a catch: I can only split it at places where a special separator exists. Here is an example:
"First part of the text… (about 3.5k characters)
%%%
Second part of the text… (about 1k characters)
%%%
Third part of the text… (about 2.5k characters)
%%%
Fourth part of the text… (about 1k characters)
%%%
Fifth part of the text… (about 1k characters)
%%%
Sixth part of the text… (about 1k characters)
"
In this example the parts are separated by the “%%%” marker and can be grouped into 3 blocks of roughly 3k to 3.5k characters each. So what I want to get is an array with 3 elements:
- First part…
- Second part… Third part…
- Fourth part… Fifth part… Sixth part…
I tried using a variable and appending strings to it, but every time I ended up with only one value. I also tried initialising an array and appending a string value (one block of text) to it, but I couldn’t make that work either.
Hello,
I split like this:
And end up with this:
Is this what you tried that didn’t work out for you?
Hi @Donald_Mitchell, thank you for your reply. Unfortunately, I made a mistake in my description (I have already corrected it); the input data should look like this:
“First part of the text… (about 3.5k characters)
%%%
Second part of the text… (about 1k characters)
%%%
Third part of the text… (about 2.5k characters)
%%%
Fourth part of the text… (about 1k characters)
%%%
Fifth part of the text… (about 1k characters)
%%%
Sixth part of the text… (about 1k characters)”
And what I need to get is an array with 3 elements:
- First part…
- Second part… Third part…
- Fourth part… Fifth part… Sixth part…
Ok, in that case you can still split by the “%%%”.
Then, you can use the get() function on that array, getting each part by its position number.
The splitting part works in my scenario, but combining the parts into an array is more complicated. The input data I provided is just an example; every time I run the scenario the input will be different, so I need a universal mechanism that checks the length of the text and appends it to the right array element.
I can imagine these steps:
- Split text into array (as you proposed)
- Get the first element, calculate its length, append it to a temporary variable
- Get the second element, calculate its length, append it if the temporary variable won’t become longer than 4000 characters; otherwise save the temporary variable into the array and start a new temporary variable with the current element
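The steps above can be sketched in code. This is only a minimal Python sketch, not a Make module: the function name `pack_parts`, its parameters, and the choice to join parts with a single space are assumptions made for illustration. It splits on “%%%”, then keeps appending parts to a temporary string until the next part would push it over 4000 characters.

```python
def pack_parts(text, delimiter="%%%", max_len=4000, joiner=" "):
    """Greedily group delimiter-separated parts into chunks of at most max_len characters."""
    # Split on the delimiter and drop surrounding whitespace and empty parts.
    parts = [p.strip() for p in text.split(delimiter) if p.strip()]

    chunks = []    # finished chunks (the resulting array)
    current = ""   # the "temporary variable" from the steps above

    for part in parts:
        # What the temporary variable would look like with this part appended.
        candidate = part if not current else current + joiner + part
        if len(candidate) <= max_len:
            current = candidate
        else:
            # The part does not fit: save the temporary variable and start a new one.
            if current:
                chunks.append(current)
            # Assumption: a single part longer than max_len is kept whole.
            current = part

    if current:
        chunks.append(current)
    return chunks
```

With the six-part example (3.5k, 1k, 2.5k, 1k, 1k, 1k) this produces three chunks: the first part alone, parts two and three together, and parts four to six together. A single part that is longer than the limit is kept whole in this sketch, so it would need separate handling.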
You need to split text into chunks no larger than 4000 characters each?
Yes, no larger than 4000 characters.
Yea that’s tough, a job probably better suited for code.
Here’s one way you could look at it.
Divide the total length of the text, {{length(text)}},
by the maximum number of characters allowed in a chunk:
{{length(text) / 4000}}
Let’s say the text is 15,686 characters long.
That’s 15,686 / 4000 = 3.9215 chunks, or 4 if we use the ceil() function on it.
Use a repeater with:
Initial value: 1
Repeats: {{ceil(length(text) / 4000)}}
(4 times in our example)
Step: 1
Text Aggregator with:
Text: {{substring(text; (repeater_i - 1) * 4000; (repeater_i * 4000))}}
Row Separator: Other
Separator: %%%
This will essentially grab chunks of 4,000 characters at a time and separate them with “%%%”.
In this example it’s 4 groups: characters
0 - 4000,
4000 - 8000,
8000 - 12,000,
12,000 - 15,686
The result is text you can then split by “%%%” or whatever delimiter you choose.
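For comparison, here is roughly what the repeater and substring setup computes, written as a small Python sketch. The function name `fixed_chunks` is made up for illustration, and the loop variable `i` plays the role of `repeater_i`. Python slices exclude the end index, so the groups do not overlap. Note that this slices purely by position, so it ignores any “%%%” separators already present in the text.

```python
import math


def fixed_chunks(text, size=4000):
    """Cut text into consecutive slices of at most `size` characters."""
    # Same arithmetic as the repeater: ceil(length / size) repetitions.
    count = math.ceil(len(text) / size)
    # i runs from 1 to count, like repeater_i with initial value 1 and step 1.
    return [text[(i - 1) * size : i * size] for i in range(1, count + 1)]
```

For a 15,686-character text this gives 4 slices of 4000, 4000, 4000 and 3686 characters.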
The only issue is that sometimes the split will occur in the middle of a word, so you’d have to figure out a way to grab fewer characters. This is where it might be better to use code, or code in the cloud with a service such as 0codekit.
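If you do move this into code, one possible way to "grab fewer characters" is to back off to the last whitespace inside each 4000-character window. The sketch below is plain Python under that assumption; the name `chunks_on_whitespace` is hypothetical, and only spaces and newlines are treated as word boundaries.

```python
def chunks_on_whitespace(text, size=4000):
    """Cut text into chunks of at most `size` characters, preferring to cut after whitespace."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        if end < len(text):
            # Find the last space or newline inside the current window.
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                # Cut right after that whitespace so the next chunk starts on a word.
                end = cut + 1
            # Otherwise there is no usable whitespace: fall back to a hard cut at `size`.
        chunks.append(text[start:end])
        start = end
    return chunks
```

Joining the returned chunks gives back the original text unchanged, since the whitespace stays at the end of each chunk. A chunk may be somewhat shorter than 4000 characters, because the cut moves backwards to the nearest boundary.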
Thanks @Donald_Mitchell for the help. I will try your proposal, but I will probably need to move that part into code because it will be easier to handle there.