Issue with Scrapeninja Extraction

Hello everyone,

I hope you’re doing well. I’ve recently started using Scrapeninja to extract data from an HTML page, but I’m having trouble retrieving the content of a <span> element with a specific class.

Here’s an excerpt from the HTML code:

<span class="e-f-ih" title="181 utilisateurs">181 utilisateurs</span>

I’m trying to extract the text inside the <span> with the “e-f-ih” class. However, my attempts don’t seem to be working as expected.

Here’s the code I’m currently using:

[Screenshot: Capture d'écran 2023-11-30 092111]

Could someone help me understand what might be wrong with my approach? Is there anything specific to consider with Scrapeninja when extracting content from HTML classes?

Here is the answer:

Thanks in advance for your help!

Welcome to the Make community!

1.

Can you please provide the URL of the website you are scraping?

2. Scenario blueprint

Please export the scenario blueprint file to allow others to view the mappings and settings. At the bottom of the scenario editor, you can click on the three dots to find the Export Blueprint menu item.

(Note: Exporting your scenario will not include private information or keys to your connections)

Uploading it here will look like this:

blueprint.json (12.3 KB)

Following these steps will allow others to assist you here. Thanks!


Hi @baptiste.jules and welcome to the Make forum!

I’m not familiar with Scrapeninja, but the following seems to work:

function extract(input, cheerio) {           
    let $ = cheerio.load(input);
    return {
        title: $('.e-f-ih').text().trim()
    };
}

and returns

{
    "title": "181 utilisateurs"
}

Is that what you wanted to extract?


The URL of the site is as follows: https://chrome.google.com/webstore/detail/zeliq-extension/mekpojdmdfchpokdinnplhlbbdbebiph?hl=fr
And here is the exported blueprint:
blueprint.json (23.8 KB)

I want to extract the number of users.

1.

Looks like

https://chrome.google.com/webstore/detail/zeliq-extension/mekpojdmdfchpokdinnplhlbbdbebiph?hl=fr

redirects to
https://chromewebstore.google.com/detail/zeliq-extension/mekpojdmdfchpokdinnplhlbbdbebiph?hl=fr

So you need to use the second URL, or turn this option on:
Screenshot_2023-11-30_171143

2.

Looks like the selector '.e-f-ih' does not match any elements on the page at the second URL.
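
One way to confirm whether a selector matches anything is to return the match count alongside the text. This is only a minimal sketch, assuming the same extract(input, cheerio) signature used earlier in this thread; the `selector`, `matched`, and `text` field names are illustrative, not something Scrapeninja requires:

```javascript
// Sketch only: reports how many elements the selector matched, so that
// "nothing found" can be told apart from "found, but the text is empty".
function extract(input, cheerio) {
    const $ = cheerio.load(input);
    // Class taken from the original excerpt; it may not exist on the redirected page.
    const selector = '.e-f-ih';
    const matches = $(selector);
    return {
        selector: selector,
        matched: matches.length, // 0 means the selector found nothing
        text: matches.length > 0 ? matches.first().text().trim() : null
    };
}
```

If `matched` comes back as 0, the class is probably not present in the HTML that was actually fetched.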


@baptiste.jules
This will extract the number only:

function extract(input, cheerio) {           
    let $ = cheerio.load(input);
    return {
        users: $('.e-f-ih').text().split(" ",1)[0]
    };
}

and returns

{
    "users": "181"
}

Hmm, there might be two versions of the web store.

Anyway, I got it working like this:

function extract(input, cheerio) {
    let $ = cheerio.load(input);
    // match() returns null when nothing matches, so fall back to an empty array
    const [ _match, users ] = $('.dSsD7e .F9iKBc').text().match(/(\d+) \w+$/) || [];
    return { users: users ? Number(users) : 0 };
}
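
If the count can ever include a thousands separator (an assumption on my part; the only value seen in this thread is "181 utilisateurs"), the (\d+) \w+$ pattern would capture just the last group of digits. Here is a rough sketch of a more tolerant helper. The name parseUserCount is hypothetical and not part of Scrapeninja or cheerio:

```javascript
// Hypothetical helper: pulls the first number out of text such as "181 utilisateurs".
// Assumes thousands may be separated by a space (including no-break spaces),
// a dot, or a comma. Returns 0 when the text contains no digits.
function parseUserCount(text) {
    const match = String(text).match(/\d+(?:[\s.,]\d{3})*/);
    if (!match) {
        return 0;
    }
    // Drop every non-digit character (the separators) before converting.
    return Number(match[0].replace(/\D/g, ''));
}
```

Inside the extractor it could be called as parseUserCount($('.dSsD7e .F9iKBc').text()), which also avoids destructuring a failed match.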



Thank you, it works.