Advice needed on URL Formatting with ScrapeNinja Module

Hello,

I’m currently utilizing the ScrapeNinja module for a project, but I’ve encountered a technical hurdle. The issue arises when inputting website data: if the URL format is incorrect (e.g., “www.make.com” instead of the required “https://make.com”), the module doesn’t function as expected.

The URL originates in the calendar form (no way of forcing https:// there), then it's extracted to Airtable (again, no way to force https:// here), and then it's mapped to ScrapeNinja from Airtable.

I’m looking for insights or methods to ensure that all URLs are automatically formatted to begin with “https://”.

If anyone has experience with this or knows of a module or a function that can handle this transformation, your guidance would be immensely appreciated.

Additionally, if there are best practices or tips for working with URL formats in the context of web scraping modules like ScrapeNinja, I’d be eager to learn more :slight_smile:


Hi @AiOptBiz ,

There is something strange with your URL:


It seems that it doesn’t exist.

PBI


@Philippe_Billet yeah, that was a made-up site and can be ignored.
The same thing occurs with any other URL without https://

This is what it looks like with https://

You can remove all traces of https:// and http://, and add it back at the start.

This will ensure the URL always begins with https://
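
For anyone who wants to see the idea outside of Make, here is a minimal Python sketch of the same strip-then-prepend approach. The function name `force_https` is made up for illustration, and it is not ScrapeNinja or Make code. One assumption differs slightly from the description above: it only strips prefixes at the start of the string, so an `http://` that appears later (for example inside a query parameter) is left alone.

```python
import re

# Matches one or more leading "http://" or "https://" prefixes, ignoring case.
_SCHEME_PREFIX = re.compile(r"^(?:https?://)+", re.IGNORECASE)


def force_https(url: str) -> str:
    """Hypothetical helper: drop any leading http(s):// and prepend https://."""
    bare = _SCHEME_PREFIX.sub("", url.strip())
    return "https://" + bare


print(force_https("www.make.com"))       # https://www.make.com
print(force_https("http://make.com"))    # https://make.com
print(force_https("https://make.com"))   # https://make.com
```

In a scenario, the same logic would probably be done with a text replace function on the Airtable field before it is mapped into ScrapeNinja.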


@samliew thanks a bunch! That did the trick.