I have an HTTP request that gets the links from one page of googleapis.com API search results (10 links) and puts them into Google Sheets.
How can I get links from all pages of search results?
blueprint.json + 1 min video attached
Hope you can help.
Thanks, Peter
Hi Peter,
The API is paginated, which means you have to find out how pagination works for this API and put a Repeater before the HTTP call, so that each request tells the API which page you want: the next one, then the next, and so on until you reach the last page.
Can you give me the URL of the API documentation so I can see what pagination looks like there?
Benjamin
Sure, my question was more about what this API needs and whether it returns a total hit count, which would make things much simpler.
Apparently, it returns a nextPage field until there are no more pages to load.
@alex.newpath by chance, have you already worked with this API? Do you know if it returns the total number of items or the total number of pages?
Unfortunately I have never worked with it – I don’t even know what the Programmable Search Engine is that Google provides but I can guess.
I do see this unfortunate limitation though on the start query parameter that drives paging:
start (integer, uint32 format)
The index of the first result to return. The default number of results per page is 10, so &start=11 would start at the top of the second page of results. Note: The JSON API will never return more than 100 results, even if more than 100 documents match the query, so setting the sum of start + num to a number greater than 100 will produce an error. Also note that the maximum value for num is 10.
No matter what, you’re never going to get more than 100 results, even with paging, so I hope that is OK.
And yes, it looks like the search response returns a structure that has totalResults under the queries key.
Custom Search JSON API | Google for Developers
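To make the 100-result cap concrete, here is a rough sketch in Python (not a Make module) of how many pages are worth requesting once you know the total. It assumes the total has already been read from the response and converted to an integer; the function names are made up for illustration.

```python
import math

MAX_RESULTS = 100  # cap quoted from the API docs above
PAGE_SIZE = 10     # maximum value for num, per the same docs


def pages_to_fetch(total_results: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages worth requesting, given the reported total."""
    reachable = min(total_results, MAX_RESULTS)  # anything past the cap is unreachable
    return math.ceil(reachable / page_size)


def start_values(total_results: int, page_size: int = PAGE_SIZE) -> list[int]:
    """1-based start offsets, one per page: 1, 11, 21, ..."""
    return [page * page_size + 1
            for page in range(pages_to_fetch(total_results, page_size))]
```

For example, a search reporting 37 results would need 4 requests, while one reporting 5,000 would still only need 10.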
Thanks Alex and Benjamin,
OK, I used a Router, not a Repeater, before the HTTP calls. (Not sure whether a Repeater is more elegant.)
Then I used different pagination parameters, e.g. &start=11, then &start=21, in the HTTP calls.
Hey @pwoodford,
It’s a good idea, but you had to duplicate the same modules many times, which makes your scenario harder to understand and maintain.
You could avoid this with a Repeater.
However, in your example, and with the Repeater, let’s assume that we know the number of items (and pages) we want to retrieve.
I say this because the Repeater doesn’t have any “Break” directive, so we’ll have to set the exact number of repeats. There is still a way to make it work if you don’t know the number of pages in advance, but it’s more complex. FYI (spoiler), we are building an Academy training about pagination.
So, what you want to do is:
- Add a Repeater after Google Sheets. Here I assume you want to load 10 pages, so “Repeats” is 10.
- Add your HTTP module, but make the paging query param (start) dynamic. The 2.i is my Repeater variable. It starts at 0 and is incremented on every repeat. This means that on the first repeat, sum(i*10;1) returns 1, on the second it returns 11, and so on, up to 91.
- Add your Iterator and Google Sheets modules.
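If it helps to see the arithmetic outside Make, here is the same calculation as a small Python sketch. The function name is only for illustration; i plays the role of the Repeater variable, starting at 0.

```python
def start_for_repeat(i: int, page_size: int = 10) -> int:
    # Same arithmetic as sum(i*10;1) in the HTTP module
    return i * page_size + 1


# Ten repeats, with i going from 0 to 9
starts = [start_for_repeat(i) for i in range(10)]
print(starts)  # [1, 11, 21, 31, 41, 51, 61, 71, 81, 91]
```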
Your scenario could become this
But again, same as with your example, if you need more or fewer pages, you will have to modify the Repeater “Repeats” field. There are ways to load only the relevant number of pages, but they are much more complex to build.
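For comparison, this is roughly what "load only the relevant number of pages" looks like in plain code, where a loop can stop early. It is a Python sketch, not a Make scenario. It assumes the response has an items list with a link on each entry and a queries.nextPage entry carrying startIndex while more pages exist; fetch_page is a hypothetical callable you would supply to perform the actual HTTP request.

```python
from typing import Callable, Iterator


def iter_links(fetch_page: Callable[[int], dict],
               max_results: int = 100) -> Iterator[str]:
    """Yield result links page by page until the API reports no next page."""
    start = 1
    while start <= max_results:  # stay within the documented 100-result cap
        page = fetch_page(start)  # hypothetical: returns the parsed JSON response
        for item in page.get("items", []):
            yield item["link"]
        next_page = page.get("queries", {}).get("nextPage")
        if not next_page:
            break  # no nextPage entry, so this was the last page
        next_start = next_page[0]["startIndex"]
        if next_start <= start:
            break  # guard against a response that does not advance
        start = next_start
```

The Repeater cannot stop this way, which is why the fixed "Repeats" value is the simpler option inside a scenario.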
I hope it helps
Benjamin