How can I find duplicates in MailerLite (i.e. subscribers with the same phone number) without using millions of operations/credits?

I’m trying to create an automation that finds subscribers in MailerLite who have the same phone number and updates one of their fields, called “main email”, according to other criteria.

The problem is when I started building the automation I realised if I get the list of subscribers and then scan the whole list of subscribers again for each subscriber to find ones with the same phone number, that will require at least the square of the number of subscribers, which for just 10k subscribers would translate to 100M operations.

Is there a better way to do this, so that the number of operations doesn’t grow quadratically?

Hi there,

Export the list of subscribers to a Google Sheet and use VLOOKUP to find duplicates. Create a smaller list of only the duplicates and update them.

Just because something can be done with Make, doesn’t mean it’s the best way to do it.

Or if you want to do it in Make, you can use the Code app to get a sub array with only the items that have a duplicate.
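A minimal sketch of what such a Code step could do, written in Python. It assumes the subscribers arrive as a list of dictionaries with hypothetical `email` and `phone` keys (the real field names depend on how the MailerLite module maps its output), and that stripping whitespace is enough to make phone numbers comparable. It counts every phone number in a single pass and then keeps the items whose number occurs more than once, so the work grows linearly with the list size.

```python
from collections import Counter


def normalize_phone(phone):
    """Strip whitespace so '+34 600 111 222' and '+34600111222' compare equal."""
    return "".join(str(phone or "").split())


def items_with_duplicate_phone(subscribers):
    """Return only the subscribers whose phone number appears more than once."""
    # One pass to count every non-empty phone number.
    counts = Counter(
        normalize_phone(s.get("phone"))
        for s in subscribers
        if normalize_phone(s.get("phone"))
    )
    # Second pass to keep the items that share a number with someone else.
    return [s for s in subscribers if counts[normalize_phone(s.get("phone"))] > 1]


# Hypothetical sample data; the field names are assumptions.
subscribers = [
    {"email": "a@example.com", "phone": "+34 600 111 222"},
    {"email": "b@example.com", "phone": "+34600111222"},
    {"email": "c@example.com", "phone": "+34 600 333 444"},
    {"email": "d@example.com", "phone": ""},
]
duplicates = items_with_duplicate_phone(subscribers)
```

Subscribers without a phone number are never counted, so they are never reported as duplicates of each other.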

Thank you very much for the idea. The challenge is that this is not a one-off task, but more of an action I want to run often (maybe once a day/week, or even every time a new subscriber is added to find duplicates for that specific subscriber).

Having subscribers with multiple email addresses in our database is causing us issues, mainly because we are sending duplicate SMS and people get annoyed, and we want to find an automatic way to mark one of the email addresses as the main address and use only that one to send SMS.

Every solution I’ve thought of in Make involves consuming as many operations as the number of subscribers we have in MailerLite to find the duplicates for each subscriber, which would obviously cost a fortune and would get worse with every new subscriber we get.

For new subscribers you should be able to use the Watch subscribers module as a trigger and then check for other subs with the same email. This shouldn’t consume that many ops.

So it’s only the initial sync that will be heavy, and you can do the processing inside the Code app for only 4-5 operations.

We are looking for duplicates by phone number, not by email address (email address is the primary key for MailerLite, so two subscribers cannot share an email address). As far as I know, MailerLite doesn’t have any API to get the list of subscribers filtered by phone number or any other field (you can only filter by email address or ID).

Just an idea: what if you list all the contacts in MailerLite through a List module or the Make an API Call module, and then use the array aggregator, grouping on phone number? Then filter for the bundles where a single phone number has an array that contains more than one item, because then you know there is a duplicate.

You might run into trouble if you try to fetch all 10k contacts at once, so it might be useful to create a loop.
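To illustrate the grouping idea outside of Make, here is a rough Python sketch of the same logic. The `email` and `phone` keys and the sample contacts are assumptions for illustration only, and phone numbers are assumed to be stored in one consistent format already. Each contact is visited once and dropped into a bucket keyed by phone number; buckets holding more than one email are the duplicates.

```python
from collections import defaultdict


def group_emails_by_phone(contacts):
    """Map each phone number to the list of emails that use it (single pass)."""
    groups = defaultdict(list)
    for contact in contacts:
        phone = contact.get("phone")
        if phone:  # skip contacts without a phone number
            groups[phone].append(contact["email"])
    return dict(groups)


def only_duplicates(groups):
    """Keep only the phone numbers shared by more than one contact."""
    return {phone: emails for phone, emails in groups.items() if len(emails) > 1}


# Hypothetical sample data; the field names are assumptions.
contacts = [
    {"email": "a@example.com", "phone": "600111222"},
    {"email": "b@example.com", "phone": "600111222"},
    {"email": "c@example.com", "phone": "600333444"},
    {"email": "d@example.com", "phone": None},
]
shared = only_duplicates(group_emails_by_phone(contacts))
```

The grouping step plays the role of the aggregator and `only_duplicates` plays the role of the filter that follows it.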

Cheers,
Henk


That’s what I started doing, and I quickly realised the aggregator module consumes one operation per record, which means 100M operations (10k multiplied by 10k), unless I’m doing it wrong and there is a way for the aggregator to consume fewer operations?

I’d be happy with the automation consuming 10-20 operations per subscriber analysed, as that would be linear instead of quadratic, but at the moment I’m far from that goal.

That is not correct. An array aggregator will only consume 1 credit, however many bundles you input.

Sorry, I was thinking about the iterator, not the aggregator. Let me look at the solution you are proposing using the aggregator.

You were completely right! Using the aggregator I have been able to get bundles with all the email addresses associated to a phone number using only 2 operations, so the first step of the puzzle is complete, thank you very much!!

I will now try to create the logic to pick the “good record” that contains the data that I will then use to update the other records. Once this is done, I will check how to build the loop to retrieve all the records and not only the first 3200 retrieved by the MailerLite API.

Great to hear! This is a very good example of saving credits :slight_smile:

The 3200 items is a hard limit, set for any ‘list’ module of any app in Make to avoid memory issues, because this is a lot of data to process. So you will not be able to fetch all 10k contacts at once. Because of that, one solution might be to use the Make an API Call module to get the contacts per list within a repeater, together with the increment module (depending on how MailerLite handles pagination), plus your logic. The scenario would then fetch a page of contacts, process it, fetch the next page, and so on.
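The page-by-page loop could look roughly like the Python sketch below. It assumes page-number pagination with a fixed page size; MailerLite may paginate differently (for example with a cursor), in which case the stop condition and the way the next page is requested would need to change. `fetch_page` is a hypothetical stand-in for the API call step, and the fake data source exists only so the sketch runs without network access.

```python
def fetch_all(fetch_page, page_size=1000, max_pages=100):
    """Collect records page by page until a short or empty page is returned.

    fetch_page(page_number, page_size) is a hypothetical callable standing in
    for the API call; it must return a list of records for that page.
    max_pages is a safety cap so the loop always terminates.
    """
    records = []
    for page_number in range(1, max_pages + 1):
        page = fetch_page(page_number, page_size)
        records.extend(page)
        if len(page) < page_size:  # a short page means there is nothing left
            break
    return records


# Stand-in data source; not a real MailerLite response.
FAKE_CONTACTS = [{"email": f"user{i}@example.com"} for i in range(2500)]


def fake_fetch_page(page_number, page_size):
    """Return one slice of the fake contact list, like a paginated API would."""
    start = (page_number - 1) * page_size
    return FAKE_CONTACTS[start:start + page_size]


all_contacts = fetch_all(fake_fetch_page, page_size=1000)
```

In a real scenario each page could also be processed as soon as it arrives instead of being collected first, which keeps memory use low.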

Depending on your logic, you might hit other limitations, such as the 40-minute maximum runtime for scenarios. But it is worth a try to build and learn from.

As @Stoyan_Vatov mentioned, just because you can build it in Make doesn’t mean you have to. Some things are just not the right fit, and we have to be honest about that. :sweat_smile:
