Multiple pages of search results

I have an HTTP request that gets the links from one page of googleapis.com API search results (10 links) and puts them into Google Sheets.
How can I get links from all pages of search results?
blueprint.json + 1 min video attached





Hope you can help.
Thanks, Peter

Hi Peter,

The API is paginated, so you first have to find out how pagination works for this API. Then you will have to use a Repeater before the HTTP call so that each call asks the API for the next page, then the next, …, until you reach the last page.

Can you give me the URL of this API's documentation so that I can see what pagination looks like there?

Benjamin


It’s the queries property that manages paging

Using REST to Invoke the API | Programmable Search Engine | Google for Developers


Sure, my question was more about what this API needs and whether it returns a total number of hits, which would make things much simpler.

Apparently, it returns a nextPage field until there are no more pages to load.
@alex.newpath, by chance, have you already worked with this API? Do you know whether it returns the total number of items or the total number of pages?


Unfortunately I have never worked with it – I don’t even know what the Programmable Search Engine is that Google provides but I can guess.

I do see this unfortunate limitation though on the start query parameter that drives paging:

start integer (uint32 format)

The index of the first result to return. The default number of results per page is 10, so &start=11 would start at the top of the second page of results. Note: The JSON API will never return more than 100 results, even if more than 100 documents match the query, so setting the sum of start + num to a number greater than 100 will produce an error. Also note that the maximum value for num is 10.

No matter what, you’re never going to get more than 100 results, even with paging, so I hope that is OK.

And yes, it looks like the search response returns a structure that has totalResults under the queries key.

Custom Search JSON API | Google for Developers
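
For anyone reading along outside Make, here is a minimal Python sketch of the "follow nextPage until it disappears" idea. It is only an illustration under assumptions: fetch_page is a hypothetical function you would write yourself to call the API for a given start value, and the field names (items, link, queries.nextPage[0].startIndex) are taken from my reading of the Google docs, so check them against a real response. The loop also stops at the 100-result limit quoted above.

```python
from typing import Callable, Iterator

MAX_RESULTS = 100  # limit quoted from the API docs above


def iter_links(fetch_page: Callable[[int], dict]) -> Iterator[str]:
    """Yield links page by page until the response has no nextPage entry.

    fetch_page is a hypothetical callable: it receives the `start` index
    and returns the parsed JSON response for that page as a dict.
    """
    start = 1
    while start is not None and start <= MAX_RESULTS:
        page = fetch_page(start)
        for item in page.get("items", []):
            yield item["link"]
        # Assumed response shape: queries.nextPage is a one-element list
        # whose startIndex is the `start` value of the following page.
        next_page = page.get("queries", {}).get("nextPage")
        start = next_page[0]["startIndex"] if next_page else None
```

Passing the fetch function in keeps the paging logic separate from the HTTP call, so it can be tried with canned responses before spending any API quota.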


Thanks Alex and Benjamin,
OK, I used a Router, not a Repeater, before the HTTP call. (Not sure whether a Repeater is more elegant.)
Then I used a different pagination parameter in each HTTP call, e.g. &start=11, then &start=21.


Hey @pwoodford,

It’s a good idea, but you had to duplicate the same modules many times, which makes your scenario harder to understand and maintain.

You could avoid this with a Repeater.

However, with the Repeater, as in your example, let’s assume that we know the number of items (and pages) we want to retrieve.
I say this because the Repeater doesn’t have any “Break” directive, so we’ll have to set the exact number of repeats. There is a way to make it work even if you don’t know the number of pages in advance, but it’s more complex. FYI (spoiler), we are building an Academy training about Pagination :)

So, what you want to do is:

  • Add a Repeater after Google Sheets, and set it like this


    Here I assume you want to load 10 pages, so “Repeats” is set to 10

  • Add your HTTP module, but make the paging query param (start) dynamic


    2.i is my Repeater variable. It starts at 0 and is incremented on every repeat.
    This means that on the first iteration sum(i*10;1) returns 1, on the second it returns 11, and so on, up to 91

  • Add your Iterator and Google Sheet.
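
To make the arithmetic of the steps above concrete, here is a rough Python equivalent of what the Repeater plus the sum(i*10;1) formula produce. This is a sketch, not part of the scenario: the function names are made up for illustration, and YOUR_API_KEY and YOUR_ENGINE_ID are placeholders for your own key and cx values.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.googleapis.com/customsearch/v1"


def start_for_repeat(i: int, page_size: int = 10) -> int:
    """Mirror sum(i*10;1): map the Repeater value i (0, 1, 2, ...) to `start`."""
    return i * page_size + 1


def page_urls(query: str, repeats: int = 10,
              key: str = "YOUR_API_KEY", cx: str = "YOUR_ENGINE_ID") -> list[str]:
    """Build one request URL per repeat, as the HTTP module would call them."""
    urls = []
    for i in range(repeats):  # i runs from 0 to repeats - 1, like the Repeater here
        params = {"key": key, "cx": cx, "q": query, "start": start_for_repeat(i)}
        urls.append(f"{BASE_URL}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    for url in page_urls("make pagination", repeats=3):
        print(url)
```

With 10 repeats the start values run 1, 11, 21, ..., 91, which matches the 100-result limit mentioned earlier in the thread.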

Your scenario could become this

But again, as with your example, if you need more or fewer pages, you will have to modify the Repeater “Repeats” field. There are ways to load only the relevant number of pages, but they are much more complex to build.
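
As a hint of what "only the relevant number of pages" could mean, one possible approach (my assumption, not necessarily the one from the upcoming training) is to read totalResults from a first call and derive the number of repeats from it. A small Python sketch of that calculation, assuming 10 results per page, the 100-result limit, and a totalResults value that arrives as a number or a numeric string:

```python
import math

PAGE_SIZE = 10     # results per page, per the docs quoted in this thread
MAX_RESULTS = 100  # the API never returns more than 100 results


def repeats_needed(total_results) -> int:
    """Return how many pages are worth requesting for a given totalResults."""
    total = min(int(total_results), MAX_RESULTS)
    return math.ceil(total / PAGE_SIZE)


if __name__ == "__main__":
    print(repeats_needed(25))      # 25 results fit in 3 pages
    print(repeats_needed("1230"))  # capped at 10 pages by the 100-result limit
```

The result would then go into the Repeater “Repeats” field instead of a hard-coded 10.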

I hope it helps

Benjamin
