What is your goal?
Affiliate posting on social
What is the problem & what have you tried?
Getting knowledge
Hi
The short answer is: you can use Apify actors (via the Apify modules) and then manipulate the data as needed for your final outcome.
Cheers!
Welcome to the Make community!
So you basically need to "visit" the site to get the content. This is called web scraping. It can seem fairly simple, but it gets complex very quickly if you encounter the issues described below.
Are you getting no output from the HTTP "Make a request" module? This is because the website has employed anti-scraping measures, has detected that the visit was not made by a human, and has silently blocked the request by returning no content. Hence, you cannot use normal scraping integrations like the HTTP "Make a request" module to fetch pages from websites like these. This is NOT a Make platform, HTTP, Text Parser, or Regular Expression issue/bug.
Example: Scraping Bee Integration Runtime Error 400
Are you getting NO output from the Text Parser "HTML to Text" module? This is because there is NO text content in the HTML! The entire page content you are scraping is likely held in a script tag, and is dynamically generated and placed onto the page using JavaScript when run in the user's web browser (e.g. when the page loads, or when an action is taken, like scrolling).
Make is a server-side runtime environment, so when you use the HTTP modules it only fetches the initial page code, and all script tags are ignored by the Text Parser "HTML to Text" module because they are not HTML layout elements. Furthermore, the HTTP "Make a request" module does not run any of those scripts, so no content is loaded on the page. You'll probably get a default message that tells you to enable JavaScript.
Are you getting the same output as the input when using the Text Parser "Match Pattern" module? Your regular expression pattern may simply be incorrect. One reason is that every page is different, so a pattern usually only works for the specific page it was built for. You also need to ensure that your pattern is built correctly to handle the raw output from the website. One way of building and testing a regular expression pattern is with a popular tool that I use, regex101.com.
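To illustrate why the text output comes back nearly empty, here is a minimal Python sketch. It is not how Make's Text Parser is implemented; the `VisibleTextExtractor` class, the `html_to_text` helper, and the sample page are all made up for this example. It only shows the general behaviour of an HTML-to-text conversion that skips script tags:

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while we are inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style tags.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Hypothetical helper: returns the visible text of an HTML string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical raw page as a server-side fetch would see it: the real
# content lives in a script tag and is never rendered into the markup.
RAW_PAGE = """
<html><body>
<noscript>Please enable JavaScript to view this page.</noscript>
<div id="app"></div>
<script>window.__DATA__ = {"title": "Best headphones", "price": "59.99"};</script>
</body></html>
"""

print(html_to_text(RAW_PAGE))
```

The only text that survives is the "enable JavaScript" notice; the title and price inside the script tag never reach the output, because no script was run to place them on the page.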
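As a rough illustration of a pattern that is too strict for the raw page output, here is a small Python sketch. The HTML snippet, both patterns, and the `extract_price` helper are hypothetical and only serve the example; Make's "Match Pattern" module has its own settings and regex flavour, so test your real pattern against your real input:

```python
import re

# Hypothetical raw HTML: note the extra attribute, the line break inside
# the tag, and the spaces around the value.
RAW_HTML = '<span class="price"\n   data-currency="USD"> 59.99 </span>'

# Too strict: assumes no extra attributes and no surrounding whitespace.
STRICT = re.compile(r'<span class="price">(\d+\.\d+)</span>')

# More tolerant: allows other attributes and optional whitespace.
TOLERANT = re.compile(
    r'<span\b[^>]*\bclass="price"[^>]*>\s*(\d+(?:\.\d+)?)\s*</span>'
)


def extract_price(pattern, html):
    """Returns the first captured price, or None when nothing matches."""
    match = pattern.search(html)
    return match.group(1) if match else None


print(extract_price(STRICT, RAW_HTML))
print(extract_price(TOLERANT, RAW_HTML))
```

The strict pattern finds nothing, while the tolerant one captures the value. Even the tolerant pattern is tied to this one page layout, which is why selector-based extraction tends to be easier to maintain.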
For web scraping, a service you can use is ScrapeNinja to get content from the page.
ScrapeNinja allows you to use jQuery-like selectors to extract content from elements by using an extractor function. This is way easier than coming up with a valid and robust[1] regular expression pattern!
ScrapeNinja also can run the page in a real web-browser, loading all the content and running the page load scripts so it closely simulates what you see, as opposed to just the raw page HTML. It can even perform user actions like clicking on elements on the page!
Example: Grab data from page and url
Use this to test the scraping parameters on web pages:
Use these to build and test the "extractor function":
If you need help with the above tools, please start a new topic.
You can also use AI-powered web scraping tools like Dumpling AI.
This is probably the easiest and quickest way to set up, because all you need to do is describe the content that you want via a prompt.
The plus side is that such services combine BOTH fetching and extracting the data in a single module (saving operations), and do away with the lengthy setup and maintenance of the other methods described in the previous sections.
For more information on the different methods of web scraping, see my full community blog post here: Overview of Different Web Scraping Techniques in Make
Hope this helps! If you are still having trouble, please provide more details.
- @samliew
P.S.: investing some effort into the tutorials in the Make Academy will save you lots of time and frustration using Make!
[1] A robust regular expression is one that is reliable and efficient, handles various potential inputs and edge cases, and is able to fail gracefully.