I’m totally stuck and hoping someone out there has the answer. Here is my process so far …
I am filtering a list of URLs out of supplied HTML, and the source contains each URL multiple times, which makes my resulting array messy in structure and full of duplicates.
I have tried altering the regex to ignore URLs with www and https://; however, later in the planned process an HTTP module returns the error “Supplied URL is not a real URL”.
I have also tried using the DISTINCT and DEDUPLICATE functions with no luck. Does anyone have any ideas?
Welcome to the Make community!
Instead of the “Match Pattern” module, first try the “Match Elements” module, which can match URLs.
This will make it much easier to filter out duplicates afterwards.
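To illustrate the idea outside of Make, here is a minimal Python sketch of deduplicating URLs while keeping each one intact, so a later HTTP request still receives a full, valid URL. The function names `normalize` and `dedupe_urls` are made up for this example, and the rule for what counts as "the same URL" (ignore the scheme, a leading `www.`, and a trailing slash) is an assumption you may want to adjust.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Build a comparison key: lowercased host without 'www.', plus the path."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Treat "example.co.za/" and "example.co.za" as the same address.
    path = parts.path.rstrip("/")
    return host + path


def dedupe_urls(urls):
    """Return the first full URL seen for each normalized key, in order."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize(url)
        if key and key not in seen:
            seen.add(key)
            unique.append(url)  # keep the original URL, scheme included
    return unique
```

The key is only used for comparison; the original URL is what gets returned, which avoids the “Supplied URL is not a real URL” problem that comes from stripping `https://` in the regex itself. Note that this sketch ignores query strings when comparing.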
Hope this helps! Let me know if there are any further questions or issues.
— @samliew
P.S.: Investing some effort into the Make Academy will save you lots of time and frustration using Make.
Hi Samliew, thank you so much for your suggestion. Unfortunately I am now only getting weird URLs that I wasn’t getting before …
How do I specify characters that need to be present, e.g. “.co.za”? I tried that syntax with no luck. The HTML I have been given seems to have come from Google Maps … does that make a difference? Since the regex found the URLs I am looking for by only matching URLs containing “.co.za”, these should still be present in the source data.
P.S.: I took your advice and have signed up for the Academy and am currently about 2 hrs in. Thank you so much for all of your assistance!
Now you can use an Array Aggregator, with a filter in between that only accepts the URLs you want (using the “contains” operator, e.g. “.co.za”).
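As a rough picture of what the filter plus aggregator does, here is a small Python sketch. It is not how Make implements it; `keep_matching` and its `required` parameter are hypothetical names for this example. It also checks the host name rather than the whole URL, which is a slightly stricter variant of “contains”, on the assumption that some links in the source may mention “.co.za” only inside a query string.

```python
from urllib.parse import urlsplit


def keep_matching(urls, required=".co.za"):
    """Collect URLs whose host ends with `required`, skipping exact repeats."""
    kept = []
    for url in urls:
        host = urlsplit(url).netloc.lower()
        # The "filter" step: only let matching URLs through.
        if host.endswith(required) and url not in kept:
            # The "aggregator" step: gather what passed into one array.
            kept.append(url)
    return kept
```

For example, given a `.co.za` shop URL twice plus a Google Maps link, only the single shop URL would end up in the resulting array.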
— @samliew
P.S.: Did you know that the concepts behind about 70% of the questions asked on this forum are already covered in the Make Academy? Investing some effort into it will save you lots of time and frustration using Make later!