Resume error handling after break retry attempts

I have an automation scenario in which the Exa web crawler module sometimes errors. I currently mitigate this with a Break error handler, and in about 80% of cases that is sufficient: the Exa module can continue thanks to the Break retry attempts.

But about 20% of runs still end in an error after the configured number of retry attempts (currently 3).

I’d like to mitigate that case as well: after the Break retry attempts are exhausted, the flow should continue, resuming with another web crawler module from Apify.

So:

  • crawler Exa module error
  • break attempt #1
  • break attempt #2
  • break attempt #3
  • resume with crawler Apify module

Is this possible? Thanks in advance for thinking along.
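In code terms, the desired retry-then-fallback logic looks roughly like this (a minimal Python sketch; `exa_crawl`, `apify_crawl`, and `MAX_RETRIES` are hypothetical placeholders standing in for the two Make modules and the Break handler's retry setting, not real APIs):

```python
MAX_RETRIES = 3  # mirrors the Break handler's retry count

def exa_crawl(url):
    # Hypothetical stand-in for the Exa crawler module; simulates the
    # persistent 400 error for a page not yet in Exa's index.
    raise RuntimeError("400: page not in index")

def apify_crawl(url):
    # Hypothetical stand-in for the Apify crawler module (the fallback).
    return f"content of {url} via Apify"

def crawl_with_fallback(url):
    # Try the primary service up to MAX_RETRIES times (the "Break" part) …
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return exa_crawl(url)
        except RuntimeError:
            pass  # a real Break handler would wait between retries
    # … then resume with the fallback service instead of failing the run.
    return apify_crawl(url)
```

The key point of the pattern is that the fallback runs only after all primary attempts have failed, so the data is never silently dropped.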

@stenkate
Wouldn’t using Ignore in the Error Handling Module solve the problem?

It may not be an accurate answer, because we don’t know what the current scenario looks like, what errors are occurring, or whether retrying on errors is needed in the first place.

Thanks for your response, and sorry for my late reply. I prefer Exa’s web page extraction capabilities over any similar service, and the Break error retries already mitigate most of the initial errors. Unfortunately, these retries don’t catch all errors. For that remaining case (where 3x Break retries do not mitigate the error) I’d like to use an additional web scraping service.

If I understand correctly, your proposal of using an Ignore error handler instead of a Break retry would mean that I’d miss the scraped/extracted web data entirely whenever the first Exa attempt fails. I can’t afford to miss that data, hence I’m looking for a more robust solution.

Thanks for the explanation. I finally understand.
You are right, Break Error Handling is indeed the best choice.

I’m not sure whether the error is caused by Exa(?). The easiest solution would be to increase the number of retries, but that’s probably not practical due to the increased number of operations.

Thanks. The error is caused by Exa in this case: my requested webpages are not (yet) part of their index, and the service returns a specific 400 error. My configured 3x Break retry already mitigates that error a lot, but I don’t think increasing the retry count will solve the remaining errors.

That’s why I’m looking for a solution where I can use another web scraping service as a Resume error handler for the cases where the 3x Break retries still end in that specific 400 error.

I’m sorry, but I have never used that scraping service myself, so I cannot suggest an alternative.
I hope other community members can help.

Thanks I understand.

In general I’m hoping to find an answer to the broader question: can a Break and a Resume error handler be used together in one flow? We’ll see!
