I have been reading up on web scraping for the last few days and wanted to ask if it is possible to scrape information from sites such as Aliexpress.
Yesterday I saw a tutorial somewhere here, tried it, and it worked. Everything was based on a plain HTML page.
Then I tried a product page from AliExpress, and it gave me a lot of code/scripts that I currently have no access to.
After that I found a web scraper (ParseHub). I haven’t tried it yet, but it seems promising for a start.
Nevertheless…
Is there a way to get the data that is on the AliExpress page with Make, or do I have to use an additional program?
Can someone tell me the best way to do that so I can figure it out?
Hello @Cetryn,
I’ve used ParseHub to scrape different web pages of my customers.
ParseHub gives you an easy-to-use, point-and-click approach: you just select elements on the page. You can also grab multiple pages based on a top-level listing and its pagination. They have an easy-to-use API integration as well.
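If you later want to pull the scraped results into Make or a script instead of downloading them by hand, here is a minimal sketch of calling ParseHub's REST API. The project token and API key are placeholders from your own ParseHub account, and the exact endpoint path is based on my reading of their v2 API docs, so double-check it there before relying on it:

```typescript
// Minimal sketch: fetch the data of the most recent completed run of a ParseHub project.
// PARSEHUB_API_KEY and PROJECT_TOKEN are placeholders from your own ParseHub account.
// The /api/v2/projects/{token}/last_ready_run/data endpoint is taken from ParseHub's
// v2 API docs; verify it (and whether the response comes back gzip-compressed) first.
const API_KEY = process.env.PARSEHUB_API_KEY ?? "";
const PROJECT_TOKEN = "your_project_token"; // hypothetical placeholder

async function fetchLastRunData(): Promise<unknown> {
  const url =
    `https://www.parsehub.com/api/v2/projects/${PROJECT_TOKEN}/last_ready_run/data` +
    `?api_key=${encodeURIComponent(API_KEY)}`;

  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`ParseHub request failed: ${response.status} ${response.statusText}`);
  }
  return response.json(); // JSON object containing the fields you selected in ParseHub
}

fetchLastRunData()
  .then((data) => console.log(data))
  .catch((err) => console.error(err));
```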
Note: all suggestions above are based on my own experience; it’s possible that you’ll get different results.
So you basically need to “visit” the site yourself to get the content. This is called Web Scraping.
Web Scraping
For web scraping, one service you can use to get content from a page is ScrapeNinja.
ScrapeNinja allows you to use jQuery-like selectors to extract content from elements by writing an extractor function. ScrapeNinja can also run the page in a real web browser, loading all the content and running the page-load scripts, so it closely simulates what you see, as opposed to just the raw page HTML fetched with the HTTP module.
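To give an idea of what that looks like, here is a minimal sketch of an extractor, assuming ScrapeNinja's cheerio-based extractor convention (a function that receives the page HTML plus a cheerio instance and returns a plain object). The CSS selectors for the title and price are hypothetical; AliExpress markup changes often, so look up the real ones in your browser's dev tools:

```typescript
// Minimal sketch of an extractor to paste into ScrapeNinja's extractor field,
// assuming its cheerio-based convention (page HTML in, plain object out).
// The CSS selectors below are hypothetical examples only; inspect the real
// AliExpress page in dev tools and replace them with the selectors you find there.
function extract(input: string, cheerio: any) {
  const $ = cheerio.load(input);

  return {
    title: $("h1.product-title-text").first().text().trim(), // hypothetical selector
    price: $(".product-price-value").first().text().trim(),  // hypothetical selector
  };
}
```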
You can also use AI-powered web scraping tools like Dumpling AI.
This is probably the easiest and quickest way to set up, because all you need to do is describe the content that you want, instead of inspecting elements to create selectors or having to come up with regular expression patterns.
The plus side is that such services combine BOTH fetching and extracting the data in a single module (saving operations), doing away with the lengthy setup of the other methods.
I did some research on ParseHub, but the results were somewhat incorrect, or I was blocked after too many scrapes.
I had some success with ScrapeNinja, but I don’t know how to extract specific data. I think I need to learn some code. I tried the extractors and didn’t get the right results.
Do I have to write the extractors myself for my needs? Right now I just get a huge list of something.
As far as I understand it, I have to find the elements of the website (in the browser console) and then write an extractor that pulls the data from the page. Is that correct?
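For example, I assume I would first test a selector directly in the browser console on the product page before putting it into an extractor, something like this (the class name is just a guess on my part):

```typescript
// Run in the browser dev-tools console on the open product page.
// ".product-price-value" is only a guessed class name; the real one has to be
// looked up with "Inspect element" on the price before writing the extractor.
const priceElements = document.querySelectorAll(".product-price-value");
priceElements.forEach((el) => console.log(el.textContent?.trim()));
```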
I am writing this to understand the workflow…
I have found a semi-automatic workaround for now, but I would like to replace that workflow later.
Hello @Cetryn,
By default, ParseHub doesn’t support IP rotation on its free plan; IP rotation is only available on paid plans. When a webpage is secured against scraping, it will always be a struggle to get data from it. If you want to try one of ParseHub’s paid plans, also confirm which type of IPs are used: normal datacenter IPs or residential proxies. It’s worth checking this before committing to a plan.
Yes, I have tried other methods in the meantime, but with less success. I need to take a closer look; I’ve only dealt with it roughly so far. Thanks for the link, it looks promising.