How to get information from a platform that has no API and no webhooks

Hello,

I would like to create a trigger that starts the automation every time a post is published on a platform, and extracts the title, image, and text of the post. However, this is a custom-built platform, and it currently has no API or webhooks. Another issue is that the platform itself is password protected, so I don’t think scraping would work either.

Is there any other way I could get the information from this platform every time something is posted? Or do I need to contact the developers to build a new feature for this? If so, should I ask them to implement a webhook on their side, and how long would you estimate this would take?

Thanks in advance!

You can try using the HTTP modules to “pretend” that you are a real person logging in to the website, and then “scrape” the information you need.

In this example, I am able to log in to a website with no API and get the data I need from their web page.

Different websites require different modules, so you cannot copy my example above.
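To make the idea concrete outside of Make, here is a minimal Python sketch of the same flow: submit the login form, keep the session cookie, then request the protected page with browser-like headers. Everything site-specific is an assumption for illustration: the URLs, the `username`/`password` form field names, and the header values are hypothetical placeholders, and a real site may also need CSRF tokens or extra steps.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Hypothetical placeholders: copy the real URLs and form field names
# from your browser's network inspector while logging in manually.
LOGIN_URL = "https://example.com/login"
FEED_URL = "https://example.com/posts"

# Headers that make the request look like it comes from a regular browser.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml",
}


def build_opener_with_cookies():
    """Return an opener that remembers cookies between requests, plus its jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(username, password):
    """Build the POST request that submits the (assumed) login form fields."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(
        LOGIN_URL, data=form.encode("utf-8"), headers=BROWSER_HEADERS, method="POST"
    )


def fetch_protected_page(username, password):
    """Log in, then fetch the page; the session cookie is sent automatically."""
    opener, _jar = build_opener_with_cookies()
    opener.open(build_login_request(username, password), timeout=30).close()
    page_request = urllib.request.Request(FEED_URL, headers=BROWSER_HEADERS)
    with opener.open(page_request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")
```

In a Make scenario the equivalent would roughly be one HTTP request for the login and a second one for the page, with the cookies from the first passed into the second.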


So you basically need to “visit” the site yourself to get the content. This is called Web Scraping.

Web Scraping

For web scraping, you can use a service such as ScrapeNinja to get content from the page.

ScrapeNinja allows you to use jQuery-like selectors to extract content from elements by using an extractor function. ScrapeNinja can also run the page in a real web browser, loading all the content and running the page load scripts, so it closely simulates what you see, as opposed to just the raw page HTML fetched by the HTTP module.
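To illustrate what "extracting content from elements" means, here is a small Python sketch that pulls a title, image, and text out of raw HTML. This is not ScrapeNinja's extractor API; it uses Python's standard-library HTML parser instead, and it assumes (hypothetically) that the post title is in the first `<h1>`, the image is the first `<img>`, and the text is in `<p>` tags. Your platform's markup will likely differ.

```python
from html.parser import HTMLParser


class PostExtractor(HTMLParser):
    """Collect the first <h1> text, the first <img> src, and all <p> text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.image = None
        self.paragraphs = []
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self.title:
            self._current = "h1"
        elif tag == "p":
            self._current = "p"
            self.paragraphs.append("")
        elif tag == "img" and self.image is None:
            self.image = dict(attrs).get("src")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "h1":
            self.title += data
        elif self._current == "p":
            self.paragraphs[-1] += data


def extract_post(html):
    """Return a dict with the title, image URL, and text found in the HTML."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    text = "\n".join(p.strip() for p in parser.paragraphs if p.strip())
    return {"title": parser.title.strip(), "image": parser.image, "text": text}
```

Calling `extract_post(page_html)` on the fetched page returns the three fields as a dictionary, which is roughly the shape of data you would map into later modules.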

If you want an example, take a look at Grab data from page and url - #5 by samliew

AI-powered “easier” method

You can also use AI-powered web scraping tools like Dumpling AI.

This is probably the easiest and quickest way to set up, because all you need to do is describe the content that you want, instead of inspecting the element to create selectors or having to come up with regular expression patterns.

The plus side of this is that such services combine BOTH fetching and extracting of the data in a single module (saving operations), and do away with the lengthy setup of the other methods.

For more information on the different methods of web scraping, see Overview of Different Web Scraping Techniques in Make 🌐

Hope this helps! Let me know if there are any further questions or issues.

@samliew

P.S.: Investing some effort into the Make Academy will save you lots of time and frustration using Make.


Ah that is a great one @samliew, thanks so much! Is there a template that I could use for this? Otherwise I will just build something similar myself 🙂

I do need this automation to be “triggered” every time something new is published. Is there a workaround for that too? Or would you advise just adding a scheduler that checks for new posts every hour or so? If so, how can you actually know IF there is anything new? Maybe you can check the post time of the last post…

No, each site has a different login process, so you can’t use my site’s stuff for yours.

You’ll need to inspect the browser request and try to emulate all the headers and cookies to pretend you are a real browser.
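As for the hourly check raised above, one common way to tell whether anything is new is to remember which posts you have already processed and compare on every run. The Python sketch below shows that logic only; the `id` and `published_at` fields are hypothetical, and it assumes the timestamps are ISO 8601 strings in the same time zone so that they sort correctly as text.

```python
def find_new_posts(posts, seen_ids):
    """Return posts not seen before, oldest first, and record their ids.

    posts: list of dicts with (assumed) "id" and "published_at" keys.
    seen_ids: set of ids processed on earlier runs; updated in place.
    """
    new_posts = [post for post in posts if post["id"] not in seen_ids]
    new_posts.sort(key=lambda post: post["published_at"])
    seen_ids.update(post["id"] for post in new_posts)
    return new_posts
```

On each scheduled run you would scrape the list of posts, pass it through this check, and continue the scenario only for what comes back. In Make, the set of seen IDs could be kept in a data store between runs.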


I’ll try that, thanks a lot @samliew!