Scrape Ninja (Real Browser) Not Scraping All Data vs. Dumpling AI

Hi everyone,

I’m encountering an issue with the Scrape Ninja module in Make.com. When I use Scrape Ninja (Real browser) on some websites, it doesn’t capture all the data available on the page. In contrast, when I switch to Dumpling AI, it successfully scrapes all the data.

I want to use Scrape Ninja, so how can I fix this?

Welcome to the Make community!

You need to use the Extractor function of ScrapeNinja to properly extract data from the web page. This is advanced-level, and if you do not know JavaScript, you should continue using Dumpling AI instead.

Otherwise, you can test your extractor function using the SN Scraper Sandbox tool: ScrapeNinja Live Sandbox

So you basically need to “visit” the site yourself to get the content. This is called Web Scraping.

Incomplete Scraping

Are you getting NO output from the Text Parser “HTML to Text” module? This is because there is NO text content in the HTML! The entire page content you are scraping is held in a script tag, and it is generated and placed onto the page by JavaScript that runs client-side in the user’s web browser. Make is a server-side runtime environment, so with the HTTP modules you get just the script tags, and the Text Parser “HTML to Text” module ignores them because script tags are NOT HTML layout elements.

The Make HTTP “Make a request” module does NOT run any of those JavaScript scripts, so there is no content on the page other than a default message that tells you to enable JavaScript.
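To make this concrete, here is a minimal Python sketch of what an HTML-to-text step roughly does with such a page. The sample HTML, the `window.__DATA__` variable, and the `visible_text` helper are all made up for illustration; this is not how the Make Text Parser is implemented, only the general idea of skipping script content.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text outside <script>/<style>, like a simple HTML-to-text step."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped tag.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical response from a plain HTTP request (no JavaScript executed).
SAMPLE_HTML = """
<html>
  <body>
    <noscript>You need to enable JavaScript to run this app.</noscript>
    <div id="root"></div>
    <script>window.__DATA__ = {"headline": "Hello", "body": "Full article text"};</script>
  </body>
</html>
"""

print(visible_text(SAMPLE_HTML))
# prints: You need to enable JavaScript to run this app.
```

The article text is present in the response, but only inside the script tag, so nothing useful survives the text conversion.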

This is NOT a Make platform, or Text Parser, or Regular Expression issue/bug.

You CANNOT use normal scraping integrations like ScrapingBee or HTTP “Make a request” module to fetch this page’s structure.

You will need to use ScrapeNinja’s “Scrape (Real browser)” module to emulate a real person visiting the site using a web browser, as client-side JavaScript needs to run to parse the JSON data in the script tags, and generate the page structure and content.
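As a rough illustration of what “JSON data in the script tags” means, the Python sketch below reads a JSON payload out of a script tag in a made-up page. The `page-data` id, the field names, and the `embedded_json` helper are assumptions for this example only. Real sites lay out this data differently and may change it at any time, which is why letting a real browser run the page scripts is the approach suggested above.

```python
import json
from html.parser import HTMLParser


class JsonScriptParser(HTMLParser):
    """Collects the body of every <script type="application/json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self.payloads.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_script = False

    def handle_data(self, data):
        if self._in_json_script:
            self.payloads[-1] += data


def embedded_json(html):
    parser = JsonScriptParser()
    parser.feed(html)
    parser.close()
    return [json.loads(p) for p in parser.payloads if p.strip()]


# Hypothetical page: the element tree is empty, the content sits in JSON.
SAMPLE_HTML = """
<body>
  <div id="root"></div>
  <script type="application/json" id="page-data">
    {"headline": "Hello", "paragraphs": ["First paragraph.", "Second paragraph."]}
  </script>
</body>
"""

data = embedded_json(SAMPLE_HTML)
print(data[0]["headline"])
print(data[0]["paragraphs"])
```

In a browser, the page’s own JavaScript would read this data and build the visible elements from it. Without that step, the `root` element stays empty.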

For more information and demo using ScrapeNinja, see Scraping Bee Integration Runtime Error 400

Web Scraping

For web scraping, one service you can use to get content from the page is ScrapeNinja.

ScrapeNinja allows you to use jQuery-like selectors to extract content from elements by using an extractor function. ScrapeNinja can also run the page in a real web browser, loading all the content and running the page load scripts, so it closely simulates what you see, as opposed to just the raw page HTML fetched by the HTTP module.

If you want an example, take a look at Grab data from page and url - #5 by samliew

AI-powered “easier” method

You can also use AI-powered web scraping tools like Dumpling AI.

This is probably the easiest and quickest way to set up, because all you need to do is describe the content that you want, instead of inspecting the element to create selectors or having to come up with regular expression patterns.

The plus side of this is that such services combine BOTH fetching and extracting of the data in a single module (saving operations), and do away with the lengthy setup of the other methods.

More information, other methods

For more information on the different methods of web scraping, see Overview of Different Web Scraping Techniques in Make 🌐

Hope this helps! Let me know if there are any further questions or issues.

@samliew

P.S.: Investing some effort into the Make Academy will save you lots of time and frustration using Make.


Thank you for your earlier assistance! I’m reaching out about a technical hurdle I’ve encountered while using ScrapeNinja’s real browser mode.
I tried Dumpling AI. It looks great for scraping tricky websites, but it’s too expensive for me right now.

Issue:

When I attempt to scrape an article using ScrapeNinja (Real browser), the tool returns only partial HTML content (example: https://thehill.com/homenews/administration/5181338-ssa-bans-general-news-websites/). While the basic structure loads, critical data (e.g., article body, dynamically loaded elements) is missing.

How can I get the full content?

That’s because the site has security measures in place that prevent it from being scraped reliably.

My advice: if you can’t get ScrapeNinja to work, try another web scraping service or get someone to figure it out for you.

See the links in my previous post provided above or on my profile for other options.

You can also use the Hire a Pro category to request private 1-to-1 assistance via video call, screenshare, private messaging, etc. This may help you get your issue resolved faster, especially if it is urgent or contains sensitive information. It is important to post your request in the Hire a Pro category, as forum members are not allowed to advertise their services in other categories like this one (even if it’s free/unpaid). Posting in the Hire a Pro category will allow other members to assist you over other forms of communication.

Hope this helps!

@samliew


Thank you! If you know of other tools, please tell me. Thanks!

I found a service called firecrawl.dev, and it’s working perfectly.
