Figured this out using ScrapingBee (ScrapeNinja, and its documentation, was not working for me). The essential piece of knowledge I uncovered is that many article authors use the HTML tag `<body>` or `<article>` to contain the "body text" of the article. My JSON extraction was simple: `{"title":"title","body":"body","article":"article"}`. After I get this object, I check both `body` and `article`, select the larger of the two, and then cut it down to 40,000 characters (because I wanted to store it in a single cell in a Google Sheet). Hope this helps someone someday!
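For anyone who wants the "pick the larger one and trim it" step spelled out, here is a rough sketch in Python. It assumes the scraper has already returned an object with `title`, `body`, and `article` keys as text; the names `pick_body_text` and `MAX_CELL_CHARS` are made up for illustration and are not part of any scraping service's API.

```python
# Hypothetical helper: choose the longer of "body" and "article",
# then trim it so it fits in a single spreadsheet cell.

MAX_CELL_CHARS = 40_000  # the cut-off used in this post


def pick_body_text(extracted: dict, limit: int = MAX_CELL_CHARS) -> str:
    """Return the longer of the 'body' and 'article' fields, cut to `limit` characters."""
    # A missing or null field is treated as an empty string.
    body = extracted.get("body") or ""
    article = extracted.get("article") or ""
    # Keep whichever field holds more text.
    longest = max(body, article, key=len)
    # Trim to the cell budget.
    return longest[:limit]


# Example with made-up data shaped like the extraction result.
sample = {
    "title": "Example headline",
    "body": "nav menu " + "article text " * 10,
    "article": "article text " * 5,
}
text_for_sheet = pick_body_text(sample)
```

In a Make scenario the same logic could probably be done with built-in functions instead of code; the sketch is only meant to show the order of the steps.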