Hello, I hope you’re doing well. I need help with scraping data from a website that includes various results such as titles, Notice IDs, departments, and offices. My goal is to extract all this information and save it to Google Sheets or Airtable using an RSS feed.
I’ve attempted to set up a scenario with the HTTP module, but I’m not getting the detailed information I need.
Here are the links:
- The first link shows the results.
- The second link provides the details for each result.
Could someone Help me with this?
Website:
https://sam.gov/search/?page=1&pageSize=25&sort=-modifiedDate&index=opp&sfm[simpleSearch][keywordRadio]=ALL&sfm[status][is_active]=true&sfm[status][is_inactive]=true
https://sam.gov/opp/d6c5cabb7ab14a57b0afae8b154961fb/view
Welcome to the Make community!
For web scraping, some apps you can use are ScrapingBee and ScrapeNinja to get content from the page.
I’ve used ScrapeNinja, and you can use jQuery-like selectors in the extractor function.
ScrapeNinja also can run the page in a real web-browser, loading all the content and running the page load scripts so it closely simulates what you see, as opposed to just the raw page HTML fetched from the HTTP module.
If you want an example, take a look at Grab data from page and url - #5 by samliew
For more information on the different methods of web scraping, see Overview of Different Web Scraping Techniques in Make 🌐
Hope this helps! Let me know if there are any further questions or issues.
— @samliew
P.S.: Did you know, the concepts of about 70% of questions asked on this forum are already covered in the Make Academy. Investing some effort into it will save you lots of time and frustration using Make later!
@samliew
Thank you for the detailed response. However, I’d prefer not to use third-party tools since they require a paid version. Is there another way to accomplish this? I have a ChatGPT-4 subscription.
You can’t use ChatGPT here, as it doesn’t have an API.
You can use HTTP + Groq method if you want free.
See the linked showcase I shared above for details.
Hope this helps! Let me know if there are any further questions or issues.
— @samliew
P.S.: Did you know, the concepts of about 70% of questions asked on this forum are already covered in the Make Academy. Investing some effort into it will save you lots of time and frustration using Make later!
@samliew Why not use ChatGPT for data scraping? What are the main benefits of Groq?
Could you provide a simple demo scenario for the any website
SAM.gov | Search?
The data on this site has a unique column shape, and I want to use RSS feed to handle it.
Because you can’t use ChatGPT here, only OpenAI GPT models via the developer platform.
You didn’t want to pay, hence I assumed you wanted to use a free service like Groq.