How to Dynamically Filter Records to Remove Duplicates

I recently ran into a situation with a client who wanted to send texts to the parents of children in his camp. Problem is, when we queried the data, we returned records for children, some of which had the same parent. Sending to all of the records meant that some parents would receive multiple texts, one for each child they had enrolled in the camp.

Well obviously, this is not an ideal situation, a parent only needs one broadcast text message regardless of how many kids are at the camp.

So, I created an inline dynamic filter system using Data Store and a standard Make filter.

The Data Set
Data Set

As you can see in our sample data set, we have 9 student records, but three of them are duplicates (highlighted colors indicate duplicates).

The desired outcome is to send 6 text messages to 6 parents, not 9 text messages to 6 parents.

How Does the Dynamic Filter Work?

The dynamic filter builds or grows as each record is processed. The filter values use a unique value, in this case the phone number, to build the filter with each bundle that is retrieved from the data source.

The dynamic filter utilizes a single record in a data store. I often have a data store where I save variables from multiple random scenarios. In this case, I have a data store where multiple single records are reused in multiple scenarios.

I create a single text field, called “Variable Value” and manually create the key, like “DynamicFilter1.” Then I always use the the Data Store record that has “DynamicFilter1” as the key for a particular scenario.

Here is the data structure for my variable data store:

There are Three (3) Data Store Modules Used in the Dynamic Filter System

  • Data Store Module #1: Must only be processed 1 time in the scenario.
  • Data Store Module #2: Retrieves the current value of the data store record for each bundle processed
  • Data Store Module #3: Adds the unique record value to the filter value after each bundles is processed. This module must NOT be after the filter.

Data Store Module #1

Data Store module #1 sits at the beginning of our scenario, and must only be processed one (1) time per scenario run.

This scenario simply clears the record value each time the scenario runs:

Data Store Module #2

Data Store module #2 must be positioned after your data source, and be processed once for each bundle record that is retrieved from your data source. The purpose of this module is to get the latest version of your data store record to be used in the filter that appears before the action step (i.e. send the text in our use case).

Retrieve the value of the data store record.

The Dynamic Filter

The filter uses the value from Data Store module #2 to filter previously used values. The filter condition uses the retrieved data store value, and only lets values (phone numbers in our case) pass that are not a part of the data store value.

Data Store module #3

Data store module #3 adds the last bundle to the dynamic filter value, taking the retrieved value from the data store record, and saving the just processed unique record value (phone number) to the data store record. I use a comma to separate each of the values.

Each time a phone number is formatted, it is added to the dynamic filter value.

Let’s look at the output of Data Store module #3:

  1. [BLANK]
  2. ,+12025550155
  3. ,+12025550155,+12025550189
  4. ,+12025550155,+12025550189,+12025550175
  5. ,+12025550155,+12025550189,+12025550175,+12025550168
  6. ,+12025550155,+12025550189,+12025550175,+12025550168,+12025550113
  7. ,+12025550155,+12025550189,+12025550175,+12025550168,+12025550113,+12025550198
  8. ,+12025550155,+12025550189,+12025550175,+12025550168,+12025550113,+12025550198,+12025550168
  9. ,+12025550155,+12025550189,+12025550175,+12025550168,+12025550113,+12025550198,+12025550168,+12025550155

As you can see, with each passing module, the filter value increased by one unique record value (phone number).

After each value was added, the new filter value was retrieved and used in the filter.

When we examine our filter, we can see that three unique record values (phone numbers) did not pass our dynamic filter:

So, in the end our parents only received 1 text, regardless of the number of children they had at the camp.

This isn’t a filter system that I use regularly, but it has come in handy many times.

Do you have a situation where you might use a Dynamic Filter System?

If you get stuck, I’m happy to help you one-on-one!

7 Likes

You’re using the data store to reuse this information in another scenario I assume? I wonder why you would not just build an array and use the deduplicate() function to remove duplicates after formatting the phone numbers consistently. Just trying to understand the datastore part of this solution.

1 Like

No, the Data Store record is used just in this scenario for this purpose.

I don’t much like messing with arrays unless I absolutely have to @alex.newpath…just a persona preference.

But what you suggest sounds like a good solution.

2 Likes

Thanks for sharing your thoughts. I just came across a very similar problem. However, I opted to use a small python script instead of databases to reduce duplicate rows in my csv files. The script was kindly written for me by Chatgpt.

Hi Alex, I am new to Make and struggling a bit with this dedupe process. I am looking for something simpler than above, and saw your comment. I have looked at the Array functions link you included, but I am unsure what module to use for this. The link doesn’t include any documentation on where to implement the array functions.

use the DISTINCT function to dedupe…

just one step needed and no data store.

In this example, I am getting recent posts from a subreddit, then making sure no duplicate authors in case they post twice, then I sort them by score and take the first 10.

1 Like

Hello,

I have same problem. I use a rss feed and i don’t want to use similar subject from multiple author.

How can i do that with distinct ? i need to use data storage first no ?

regards

Heya @hugo2 :wave:

Just stepping in to share a friendly piece of advice.

In the community, it’s generally more effective to start a fresh topic for each new question rather than commenting on older threads. When you initiate a new conversation, you’re more likely to catch the community’s attention and receive the help you need.

I totally understand that you’re joining this discussion because you want to address specific points in this particular topic. No worries! You can always pose your questions in a new topic you create and then link to the other conversations you’re referring to.

Thanks for helping to keep the community organized and making it easier for everyone to share and find the information they need. :pray:

2 Likes