Merge result of 2 queries based on a common key

Dear Community,

This is my first post and be sure I searched a lot before asking. So please don’t be too rude with me if I have made any mistake here.

Here is my use case: I have two different queries resulting in let’s say 1’300 bundles in the first one and 12’000 in the second one, having different information in them. I have a common key in both queries.

I need to consolidate both queries based on the common key and I don’t know how to do it. I tried to use Array Aggregator but got no clue how to use the key here somewhere. I tried to go through arrays using as well Array Aggregators as sources for my consolidation but did not find a way to create a variable using either map and/or get functions, simply because I think it is not possible to use a field resulting from Array Aggregator as a key value in either map or get function (like key = array.key_value instead plain value of the key itself).

Basically, here is what it looks like:

Query 1 gives 1’300 bundles of 3 fields (key, field1, field2) with their respective values

Query 2 gives 12’000 bundles of 5 fields (key, field3, field4, field5, field6) with their respective values

My desired output would be 1’300 bundles of 7 fields (key, field1 to field6), I think you got the idea. If no match exist in Query 2 then at least I keep the key and field1 and field2 in the extract for those, explaining why I expect 1’300 output bundles (and considering that I have no more than 1 correspondance in Query 2 but this is another story).

How you experts would achieve that with something that is reliable and with a very good performance considering the amount of data?

Thanks a lot for your help.

Best,

Tim

Hello @Timbleking and welcome to the Make Community!

I don’t know if there’s an easier way to handle this, but this is how I’ve done it the last couple of times… speaking generically here because you haven’t stated what you want to do with this data.

  • Aggregate both sets of bundles into 2 different arrays - Array1 for Query 1 and Array2 for Query 2.
  • Extract key from both Array1 and Array2 and merge into 1 single array, deduplicate them, sort them, etc…
  • Iterate the array (using an Iterator module) and for each:
  • -Extract field1 and field2 from Array1
  • -Extract field3, field4, field5, field6 from Array2
  • Aggregate everything using a JSON Aggregator (Create a data structure consisting of key, field1, field2, field3, field4, field5, field6) after the Iterator.

The result should be a big JSON file which you can parse (using Parse JSON module).

I am not sure about handling thousands of bundles, might need to just take an overall look at your project and decide if there’s a more efficient way to handle it.

So to recap, you already have two different modules acting as Iterators that each output multiple bundles. Place an Array Aggregator after each of them which will result in two different Arrays. Follow those with an Iterator module, followed by a JSON Aggregator module.

1 Like

Dear @Donald_Mitchell,

I had already included the Array Aggregator after each Query as I was convinced going the array way was the only potential solution, but I hadn’t integrated the trick of consolidating the keys after.

It seems to work, I now have to corroborate that the result is correct but looks very good at first sight.

Trying your method finally showed that yes, we could use a resulting variable from a previous module as a key value in the map() array function, which I tried before and was not working (my fault, obviously :slight_smile: ).

Please allow me testing completely and I will update the topic accordingly to share with the community.

Cheers,

Tim

I should have been more specific about “extract”. What I mean is use map() and get() to get the field/value you need based on the key.

I confirm this is working as explained.

Actually I did not need to merge both arrays but find data in the second one using the key of the first one.

As I have now more than 10’00 records in the first one and 24’000 in the second one, performance is not incredible but by far better than my first try that was to join both queries directly without going thru arrray transformation.

@Donald_Mitchell,

Thanks again for your help.

BR,

Tim

Yes, I got that at first read, no worries.