Help using aggregator

Hello. I need help figuring out how to use the aggregator.

There is a folder on Google Drive that might get more than 1 audio file uploaded into it.
I need to transcribe these files and send all of the text to a single Google Doc.
That Google Doc would then be sent to ChatGPT for further processing, and I might need to add more documents before sending it, but I haven’t set this up yet because I first need to figure out the aggregator step.

Right now, as shown in the image, the scenario takes the audio files and the aggregator waits for all of them to be transcribed. My issue is that instead of creating just 1 Google Doc with all the text, it creates one document per file. I also tried the grouping option in the aggregator, but got the same result.

Settings on aggregator:

Google Doc (I also tried with the Array instead of Key)

Thanks.

Hello @DanielCS,

How is your Google Drive Watch Files in a Folder set up?
The problem is the module runs on a schedule (or on-demand as needed), only gets new files since the last time it ran, and only gets up to the max you specify per execution.
That means if you have 15 files waiting to process, but your max is 10, and the schedule is every 15 minutes, then your scenario will only process the first 10, then 15 minutes later, process the other 5.
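To make the batching concrete, here is a rough Python model of it. This is not Make code; the function name `runs` and the file names are made up for illustration, and it only mirrors the 15 files / max 10 example above.

```python
from typing import Iterator, List


def runs(pending: List[str], max_per_run: int) -> Iterator[List[str]]:
    """Yield the batch of files that each scheduled run would pick up."""
    for start in range(0, len(pending), max_per_run):
        yield pending[start:start + max_per_run]


# 15 hypothetical files waiting in the folder, limit of 10 per execution.
files = [f"audio_{i}.mp3" for i in range(1, 16)]
batches = list(runs(files, max_per_run=10))

# Two executions are needed: the first takes 10 files, the second takes 5.
print([len(batch) for batch in batches])
```

Anything downstream of the trigger, including an aggregator, only ever sees the files from its own execution, so the two batches would end up in separate outputs.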

I feel like the scenario (according to the graphics you posted) is set up correctly.
If anything, you may need to set your max higher?

Hello @Donald_Mitchell
The Google Drive module is set up with a max limit of 4 and runs on demand. Currently I’m testing it with 2 files, and when it is finally done I will probably increase the limit on the module.
That part of the scenario is running as intended: it takes the 2 files and gets the transcription for each of them.
The issue is (*was) that when sending the result to the Google Docs module after the aggregator, it was creating 2 documents instead of just 1. I need 1 document so I can send all the text to ChatGPT: the audio files are related, so ChatGPT needs all the data together.

*I said "was" because I have now learned that in the Google Docs module (or ChatGPT) I can put {{48.array[1].text}} {{48.array[2].text}}, using [ ] to select each array item, and get 1 document this way. But this creates another issue to figure out: since I don’t know how many audio files there will be at the start, I don’t know how many array items I would have to select.
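The unknown-length problem is easier to see outside Make. Below is a small Python sketch, assuming the aggregator output is a list of items that each carry a `text` field (as in `48.array[1].text`); the function name `combine_texts` and the sample strings are hypothetical. In Make itself, the built-in `map` and `join` array functions look like the closest equivalent, but I have not verified that against this scenario.

```python
from typing import Dict, List


def combine_texts(array: List[Dict[str, str]], separator: str = "\n\n") -> str:
    """Join the 'text' field of every item, however many items there are."""
    return separator.join(item["text"] for item in array)


# Hypothetical aggregator output; the length is not known in advance.
aggregated = [
    {"text": "first transcript"},
    {"text": "second transcript"},
    {"text": "third transcript"},
]

# One string for one document, with no array[1], array[2] indexing needed.
document_body = combine_texts(aggregated)
print(document_body)
```

The point is that the join runs over the whole array, so adding or removing audio files does not require touching the mapping.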

What if you set your Array Aggregator’s source to the module right before it, #43?
That’s the module whose output you need to aggregate, so maybe that will help.

I am having a similar issue, @DanielCS. I am trying to pull transcripts from a Google Drive folder from meetings held over the previous week. The filter works and is sourcing the right documents (9 of them). Then I’d like the aggregator to get the content from all of these docs and combine it into a single Google Doc and add it to a folder. Instead it creates 9 separate Google Docs, each with a different transcript.

Any suggestions?

Not sure if I’m misunderstanding the aggregator or expecting something different from what it does.
If I use the group setting, the output is 2 different keys (as if it were 2 bundles).
If I don’t use the group setting, I get one array with a separate value for each document. I can use this one to create only 1 document, but I can’t figure out yet how to turn it into 1 variable so I don’t have to use array[1],[2][3]… since I don’t know how many documents there could be.

@Benj_Eisen
Where you have Content, 6. Array [ ] : Text Content, you can put inside the brackets the index of the array item you want to use from the previous module. You said you had 9 documents, so I’m guessing you will have 1 array with 9 different values called Text Content. In the Doc module you can then put Array [ 1 ] : Text Content Array [ 2 ] : Text Content … and so on up to 9, to create just 1 document. The issue would be if you have more documents than the array items you set up in the module. Also, I’m not sure this is the correct way to do it, or whether there is another way that avoids putting each array item into the module settings.

No need to use the Group By setting in this case.

Have you tried changing the Array Aggregator’s Source Module? It’s the first option in the Array Aggregator’s settings. Instead of Google Drive - Watch Files in a Folder [46], change that to the Assembly AI Get a Transcript #43.
To be honest, a Text Aggregator might be more appropriate here too.
So, try using a Text Aggregator instead of Array Aggregator and set the source to Assembly AI Get a Transcript instead of Google Drive - Watch Files in a Folder.

I tried the text aggregator sourcing straight from Google Docs Get Content of a Document [4]. It’s doing the same thing where it’s sending the content from each of the 7 docs one-by-one to the Create a Document module, instead of aggregating all the text into a single document.

When I click Explain Flow, it shows the order going in sequence to Create a Document [7] rather than looping back to pull more text from the next doc it found in [4].

@DanielCS the problem with manually inputting each array is that the number of documents changes, as you suggested. The first filter pulls transcripts from the last 7 days, so it’s always going to be a different number of documents/arrays.

Is there a different module to use that would loop back and keep pulling text from the 7 documents before going to the final step of creating the new doc? Array Aggregator and Text Aggregator both seem to do the same thing, unless I’m configuring them wrong.

UPDATE: SOLVED

Using the Text Aggregator [9], I set the source module to Search for Files/Folders [1] and the text to the output of Get Content of a Document [4]. This runs a loop pulling all documents from the past 7 days, then aggregates them into a single text string, rather than a bundle or array.
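For anyone landing here later, this is my mental model of why the source setting made the difference, written as a Python sketch. It is not a statement of how Make is implemented, and `search_files`, `get_content`, and the document IDs are stand-ins I made up.

```python
from typing import List


def search_files() -> List[str]:
    """Stand-in for the search module: emits one bundle per document found."""
    return ["doc_a", "doc_b", "doc_c"]


def get_content(doc_id: str) -> str:
    """Stand-in for the get-content module: one bundle out per bundle in."""
    return f"text of {doc_id}"


def aggregate_from_get_content() -> List[str]:
    """Source = get-content: each aggregation covers a single bundle."""
    created_docs = []
    for doc_id in search_files():
        bundles = [get_content(doc_id)]  # only one bundle per source run
        created_docs.append("\n".join(bundles))
    return created_docs


def aggregate_from_search() -> List[str]:
    """Source = search: the aggregation covers every bundle it emitted."""
    bundles = [get_content(doc_id) for doc_id in search_files()]
    return ["\n".join(bundles)]


print(len(aggregate_from_get_content()))  # one output per document
print(len(aggregate_from_search()))       # a single combined output
```

In this model the aggregator groups whatever came out of one run of its source module, so pointing it at the module that fans out into many bundles is what produces one combined document.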



Thanks for the help!!

Sorry for the late reply. The account I was testing with ran out of data, haha, and I just got my office to set up a paid account.

Thanks, the Text Aggregator does merge the text I need into 1 document. My issue now is that I can’t name the document after the audio files, since those values are not passed on after the aggregator. I guess I can put “watch folder” again after the aggregator to get the name. Will figure this out.

Thanks both, have a great weekend!

Hi @DanielCS

To get the name after the aggregator, add a Set variable module right after the watch module and store the file name in it. Then, after the aggregator, use a Get variable module to read that file name back.
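As a rough illustration of the idea (a Python model, not Make configuration), the sketch below stores a value before the aggregation step and reads it back afterwards. The `set_variable` and `get_variable` helpers, the variable name `file_name`, and the file names are all hypothetical. Note that in this model each new file overwrites the stored name, so the last one wins; whether Make behaves the same way with several bundles is an assumption here.

```python
from typing import Dict, List

# Stands in for scenario-level variables that survive the aggregator.
variables: Dict[str, str] = {}


def set_variable(name: str, value: str) -> None:
    """Store a value before the aggregator runs."""
    variables[name] = value


def get_variable(name: str) -> str:
    """Read the stored value back after the aggregator."""
    return variables[name]


file_names = ["interview_part1.mp3", "interview_part2.mp3"]
transcripts: List[str] = []

for name in file_names:
    set_variable("file_name", name)  # each bundle overwrites the value
    transcripts.append(f"transcript of {name}")

# After aggregation only the merged text is passed on by the aggregator...
merged = "\n\n".join(transcripts)

# ...but the stored variable is still available for naming the document.
doc_title = get_variable("file_name")
print(doc_title)
```

If the document name should reflect all of the files rather than one, the names could be merged the same way as the text.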

P.S.: You cannot use a watch module in the middle of a scenario.

Regards,
Msquare Automation - Gold Partner of Make
@Msquare_Automation
