How to get all text and images into one bundle for further processing

Situation:
The blueprint (attached) is only one part of a bigger flow.

  1. I’m getting PDFs from a form upload (up to 5 PDFs)
  2. All PDFs are scanned with the OCR API from mistral.ai, and I get back one bundle per document
  3. All content (markdown) from all bundles needs to be merged into one big text for further processing
  4. All images from all bundles will be converted and prepared for further processing

Goal: steps 3 and 4 should create one bundle with one big text and a list of all images for further processing.

What should be achieved:

  • No routing within this part of the flow
  • Content from the PDFs is available as one big text for the following steps
  • Images from the PDF content are available as a list (they are generated by Mistral’s OCR)
  • The one big text and all images are available as one bundle for all the steps afterwards

Why “no routing”:
After this flow, there will be 3 routes based on specific conditions (not part of the blueprint) for storing data in different Salesforce objects.
I want to avoid a lot of tracks. I will stay with 3 tracks afterwards, which will be split into 6 later.

Needed values from the OCR output:

  • body(collection) > pages(array) > 1 (collection) > markdown
  • body(collection) > pages(array) > 1 (collection) > images (if images were detected)
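
To make the target shape concrete outside of the scenario builder, here is a minimal Python sketch of the merge. It assumes each OCR bundle follows the paths listed above (body > pages > markdown / images). The function name `merge_ocr_bundles`, the output keys `text` and `images`, and the sample data are made up for illustration and are not part of the blueprint or of the Mistral API.

```python
from typing import Any


def merge_ocr_bundles(bundles: list[dict[str, Any]], separator: str = "\n\n") -> dict[str, Any]:
    """Merge several OCR result bundles into one text plus one flat image list."""
    texts: list[str] = []
    images: list[Any] = []
    for bundle in bundles:
        # Assumed layout, following the paths above: body > pages > markdown / images.
        pages = bundle.get("body", {}).get("pages", [])
        for page in pages:
            markdown = (page.get("markdown") or "").strip()
            if markdown:
                texts.append(markdown)
            # "images" may be missing or empty when no images were detected.
            images.extend(page.get("images") or [])
    return {"text": separator.join(texts), "images": images}


# Hypothetical sample data, one bundle per PDF.
sample_bundles = [
    {"body": {"pages": [{"markdown": "# Doc 1", "images": [{"id": "img-0.jpeg"}]}]}},
    {"body": {"pages": [{"markdown": "Doc 2, page 1"},
                        {"markdown": "Doc 2, page 2", "images": []}]}},
]

merged = merge_ocr_bundles(sample_bundles)
print(merged["text"])
print(len(merged["images"]))
```

The image entries are passed through untouched, so whatever fields the OCR returns per image stay available for the conversion step.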

How can I streamline it?
I’m also open to suggestions and ideas for optimizing the flow within the given setup.

Thanks in advance
Frank
Get-all-content-and-images-from-multiple-pdfs.json (25.6 KB)
example-output-from-mistral-2.json (44.0 KB)
example-output-from-mistral-1.json (5.0 KB)

Hey there,

here. You can use this method to go back to one route.

Thanks for this “magic” moment. :wink:

Ok, that works. But it creates another issue.
I couldn’t access my variables from the part before the aggregation because of how the source selection in the text aggregator module is set up.

I changed the flow a little and it worked. :folded_hands:

Best
Frank