Get the 'Scraptio' module to pull back a list of links on a website that contain the word "about"

You already know the URL of the domain you were scraping (from the first module), so prepend that to each link the module returns.

samliewrequest private consultation

Join the Make Fans Discord server to chat with other makers!


Yeah, I suppose. I was just worried that if the parsed text is something like “/docs/about-us”, simply adding the original link onto the front of it might not give the right URL?

What do you mean?

If you added https://www.eurogamer.net to /docs/about-us

you get

https://www.eurogamer.net/docs/about-us

If that link is wrong then it was wrong on the original website in the first place.
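For anyone who wants to sanity-check the join outside Make, here is a minimal Python sketch using the standard library's `urljoin`. It is only an illustration of the idea, not something the Make scenario runs; the variable names are made up for the example.

```python
from urllib.parse import urljoin

# Domain from the first module and a root-relative link as found in the page source.
base = "https://www.eurogamer.net"
relative_link = "/docs/about-us"

# urljoin resolves the relative link against the base URL.
joined = urljoin(base, relative_link)
print(joined)
```

The result matches the URL shown above, so a root-relative link like this one resolves cleanly against the domain.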


Edit: It seems to now work.

Weirdly though, sometimes it puts the FULL url into the sheet - and other times it just puts the text! (I’m always after the full URL)

This is the pattern I’ve put in the text parser: <a[^<">]?href="(?[^<">]+?)"[^<>]>(?[^<">]?privacy[^<">]?)</a>

Here’s what the parser outputs look like:

Yes, it all depends on what was in the source code of the link tag.
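To illustrate why the output varies, here is a rough Python sketch. The HTML snippet, the group names `url` and `text`, and the pattern itself are all assumptions made for the example; this is not the exact pattern from the text parser above. It just shows that the captured `href` is whatever the page author wrote, which may be a full URL or only a path.

```python
import re

# Hypothetical page source: one absolute link, one relative link, one unrelated link.
html = (
    '<a href="https://example.com/privacy">Privacy policy</a>'
    '<a class="footer" href="/legal/privacy-notice">privacy notice</a>'
    '<a href="/contact">Contact</a>'
)

# Capture the href value and the link text when the text mentions "privacy".
LINK_RE = re.compile(
    r'<a\b[^<>]*?href="(?P<url>[^"<>]+)"[^<>]*>(?P<text>[^<>]*privacy[^<>]*)</a>',
    re.IGNORECASE,
)

links = [(m.group("url"), m.group("text")) for m in LINK_RE.finditer(html)]
for url, text in links:
    print(url, "|", text)
```

The first match comes back as a full URL and the second as a bare path, because that is how each `href` was written in the source.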


Reckon there’s much I can do to just get the full URL in all cases? I want to then pass the links into another automation. If I prepend the ‘original’ URL to the parser output in the Google Sheet, sometimes I’ll get a URL with the https:// part doubled up. If that makes sense.

I’ve tried it here:

… and here’s the output:

You can see that in some cases it’s doubling the URL within the same row…

Add only when it isn’t present.

Something like this

Screenshot_2024-06-10_200646


Oooo, where do I add that string into?

Your URL (A) field.


Not having much joy :worried:


…output:

Oh you gotta select the variables from the variables panel.

See how some of the special variables have a background?

Or you can manually type it out like that

{{if(indexOf(2.url; "http") = 0; 2.url; 5.value + 2.url)}}
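If it helps to see the same logic outside Make, here is a small Python sketch of what that formula does, assuming `2.url` is the link from the text parser and `5.value` is the original domain. The function and parameter names are made up for illustration.

```python
def full_url(link: str, base: str) -> str:
    """Return the link unchanged if it already starts with "http", else prepend the base."""
    if link.startswith("http"):
        # Already a full URL, so adding the base would double it up.
        return link
    # Relative link: put the original domain in front.
    return base + link


base = "https://www.eurogamer.net"
print(full_url("/docs/about-us", base))
print(full_url("https://www.eurogamer.net/docs/about-us", base))
```

Both calls end up with the same full URL, so the sheet never gets the https:// part twice.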


Hi Sam, you’ve been a great help. I really appreciate your time and patience. I’m new to this, and I’m finding the whole filtering/regex stuff a bit over my head. You’ve helped me move things forward, so thanks for that. Cheers
