How can I aggregate output bundles from multiple operations?

Hi everyone,

Here is my current scenario:

The first module, the Dropbox: Watch Files trigger, can find multiple files. For example, when two files are found, the scenario runs to the end for the first file (this is recorded as Operation 1 in the 2nd, 3rd, and 4th modules), and then runs to the end for the second file (this is recorded as Operation 2 in the 2nd, 3rd, and 4th modules). In this particular case, each of those operations outputs 3 bundles. So, the first file accounts for Operations 1 to 3 in the last Airtable module, and the second file accounts for Operations 4 to 6 in the Airtable module.

Basically, this scenario does the following:

  1. Analyze image files to extract texts included in the image files.
  2. Save the texts in an Airtable table.

But, it’s possible that the same texts are included in multiple image files, and I would like to remove any duplicated texts before they are saved to Airtable.

I tried using aggregators (text, array…) between the OpenAI and Text parser modules, and also between the Text parser and Airtable modules, so that I could remove any duplicates before they are saved to Airtable.

But, as I wrote above, the texts extracted from the first image are saved even before the texts are extracted from the second file. So, the aggregators I tried didn’t work as I expected.

The number of image files can be a lot more than two. How can I put all texts extracted from all images into the same array so that I can remove any duplicates?

I also tried using an Array aggregator right after the Dropbox: Watch Files trigger to count the files, followed by a Repeater module to repeat the subsequent modules (except for the last Airtable module) that many times.

But it’s getting too complicated for me (a beginner). In that case, I need to provide a different path to the Dropbox: Download a File module for every repeat, and I don’t know how.

It would be great if someone could provide hints for me to move forward to accomplish my goal.

Thank you.

Kaz

Hello,

If you want to detect duplicates after the Text parser, just before Airtable, and you only need to store the text, you can add a Text aggregator with the first Dropbox module as the “source module”, a comma (for instance) as the row separator, and the output of the Text parser in the Text field. Setting the trigger as the source module makes the aggregator collect the texts from all files into a single bundle, rather than one bundle per file.
It will generate one string with a comma-separated list of texts. Then add an Iterator, and in its Array field use two functions: deduplicate(split(text from the Text aggregator;,))
The functions will build an array from that string and remove any duplicates.
Then the Iterator will generate new bundles from the remaining texts.
In Airtable, map the “value” field of the Iterator.
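
In case it helps to see the data flow outside of Make, here is a rough Python sketch of the same three steps (aggregate, split and deduplicate, iterate). It is only an illustration, not Make code: the function names and the sample texts are made up, and I am assuming the deduplication keeps the first occurrence of each text. One thing it makes visible: if the extracted texts can contain commas themselves, pick a row separator that cannot appear in them, otherwise the split will cut a text in two.

```python
# Hypothetical stand-ins for the Make modules; names are illustrative only.

def aggregate_texts(texts, separator=","):
    """Mimics the Text aggregator: joins all bundles into one string."""
    return separator.join(texts)


def split_and_deduplicate(joined, separator=","):
    """Mimics deduplicate(split(text; separator)).

    Assumption: the first occurrence of each text is kept, in order.
    """
    items = joined.split(separator)
    # dict.fromkeys keeps insertion order and drops repeated keys
    return list(dict.fromkeys(items))


# Sample texts (invented) extracted from two image files, 3 bundles each
file_1_texts = ["apple", "banana", "cherry"]
file_2_texts = ["banana", "date", "apple"]

# Source module = the trigger, so texts from all files end up in one string
joined = aggregate_texts(file_1_texts + file_2_texts)

unique_texts = split_and_deduplicate(joined)

# Mimics the Iterator: one bundle per remaining text, mapped to Airtable
for value in unique_texts:
    print(value)
```

With these sample inputs, six texts go in and only the four distinct ones reach the last step, so Airtable would be called four times instead of six.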

I hope I’m clear, and I hope it’s what you wanted.

Let me know if you need an example.

Benjamin

Thank you very much, Benjamin. It took some time, but I was able to solve the problem I had. I appreciate it!
