How to avoid Captcha when using HTTP module

Daniel_Attard · March 1, 2023, 2:50am

I am using the HTTP module, trying to do a GET request to this URL:

https://canlii.org/en/on/onarb/doc/2023/2023canlii3033/2023canlii3033.html

When you manually copy/paste this URL into a browser, it works fine. The problem arises when I use the HTTP module: the request triggers a Captcha and an HTTP 429 Too Many Requests response, as shown below.

My question is: What are some strategies that I can use to avoid triggering the Captcha so that I can GET the page?

ManishMandot · March 1, 2023, 12:01pm

The best way to avoid this is to add a Sleep module with a 15-20 second delay between requests.
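
Outside of Make, the same wait-and-retry idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not how the Sleep or HTTP modules work internally: `fetch` is a hypothetical callable that returns `(status, headers, body)`, the names `retry_delay` and `get_with_backoff` are made up for this example, and the 15-second base delay simply mirrors the suggestion above. If the server sends a `Retry-After` header with a number of seconds, that value is used; otherwise the delay doubles on each attempt.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 15.0, cap: float = 120.0) -> float:
    """Return how many seconds to wait before the next attempt."""
    # Only the "number of seconds" form of Retry-After is handled here;
    # the HTTP-date form falls through to the exponential delay.
    value = headers.get("Retry-After", "").strip()
    if value.isdigit():
        return min(float(value), cap)
    # No usable header: 15 s, 30 s, 60 s, ... capped at `cap`.
    return min(base * (2 ** attempt), cap)


def get_with_backoff(fetch: Callable[[str], Response], url: str,
                     max_attempts: int = 4,
                     sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call `fetch` until it stops returning 429 or attempts run out."""
    status, body = 0, ""
    for attempt in range(max_attempts):
        status, headers, body = fetch(url)
        if status != 429:
            return status, body
        # Wait before retrying, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    return status, body
```

The `sleep` parameter is injectable so the logic can be tried without actually waiting. A delay like this only helps when the 429 comes from request frequency; it does nothing about a Captcha challenge served on the very first request.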

Daniel_Attard · March 1, 2023, 2:58pm

Thanks for your suggestion, Manish, but after implementing the Sleep module, I am still receiving the same Captcha page and 429 error. Are you able to GET the contents of the page using an HTTP request?
