News Automation (RSS -> Scraptio -> OpenAI --> Google Sheet): almost there, please help!

BlackCubeLabs · June 2, 2024, 12:39pm

Hi all,

I am working on a news automation, and I am looking for some help.
I have used rss.app but scraptio requires an API key which I don’t have as I am on the free plan, and the paid plans is super expensive. Do you know any alternative?

Also, I would appreciate someone who is willing to spare 10-15 minutes to guide me on the set-up of this automation, I feel it would benefit some extra eyes as I am not 100% sure about the parameters I set.

Thanks a lot!

samliew · June 3, 2024, 5:14am

Welcome to the Make community!

For web scraping alternatives, some apps you can use are ScrapeNinja to get content from the page.

I’ve used ScrapeNinja, and you can use jQuery-like selectors there in the extractor function.

ScrapeNinja also can run the page in a web-browser so it closely emulates what users see, as opposed to just the raw page HTML fetched from the HTTP module.

If you want an example, take a look at Grab data from page and url - #5 by samliew

Alternatively, you can try searching for specific APIs on RapidAPI. It’s quite easy to do the initial set-up.

How to call an API on RapidAPI

Use the HTTP “Make an API Key Auth Request” module.

Create a new keychain connection and insert your RapidAPI API Key.

Key: <YOUR_RAPIDAPI_KEY>
API Key parameter name: X-RapidAPI-Key

You can reuse this RapidAPI keychain for all API calls to RapidAPI, you’ll just need to change the X-RapidAPI-Host value based on the API you are calling.

samliew – request private consultation

Join the unofficial Make Discord server to chat with other makers!

samliew · August 2, 2024, 12:50am

Another more “manual” scraping alternative is to use the HTTP “Make a request” module to fetch the source code of the page, then use AI (OpenAI or Groq) to parse the content of the page and extract the content into JSON/variables.

The most reliable way to parse the content of a web page is to probably use the OpenAI “Transform Text to Structured Data” module, or the free Groq “Create a JSON Chat Completion” module.

Example

Here is a Groq example on how to specify the variables that you want to match from the text content. This setup is also similar to what you’d do with the OpenAI Structured Data module.

Output

(Google’s about page didn’t contain any Google email or phone numbers, so it was left empty)

Hope this helps! Let me know if there are any further questions or issues.

You can also join us in the Make Fans Discord server to chat with other makers. Due to the evolving needs of this community, the Discord invite link can be found elsewhere on this forum. You can either search for it or message me to request an invite.

BlackCubeLabs · August 28, 2024, 3:51pm

Hi,

I have decided to give it a try to RAPIDAPI.
I still get very dirty data. How do i further config to remove what’s unnecessary?

samliew · August 29, 2024, 2:01am

That looks like a whole webpage, so you need to either:

use ScrapeNinja’s Extractor Function feature to get the text from specified elements, or
use stripHTML function together with another AI module to help you extract the required content that you want.

If you want a ScrapeNinja example, take a look at Grab data from page and url - #5 by samliew

For more information on the different methods of web scraping, see Overview of Different Web Scraping Techniques in Make 🌐

Hope this helps! Let me know if there are any further questions or issues.

— @samliew

P.S.: Did you know, the concepts of about 70% of questions asked on this forum are already covered in the Make Academy. Investing some effort into it will save you lots of time and frustration using Make later!

BlackCubeLabs · August 29, 2024, 3:27am

I used stripHTML and it worked. So far so good.
I hope I used it in the right way.

Topic		Replies	Views
RapidAPI Error Code 403 Getting Started api , web-scraping	5	1883	April 29, 2024
How to skip Captcha when logging into a website through HTTP How To web-scraping	4	200	November 12, 2024
RapidAPI connection help How To api	9	531	November 12, 2024
Creating an automation using webscrapers services via API How To api	3	114	December 19, 2024
Changing website integrations Features scenario	4	74	September 27, 2024

News Automation (RSS -> Scraptio -> OpenAI --> Google Sheet): almost there, please help!

How to call an API on RapidAPI

Example

Output

Related topics