Hello! New user here.
I built a scenario that searches newly released scientific articles on the arXiv repository. Every day it needs to compare the new results to a list I keep in a data store (the idea is to get a notification for every article released by a person on the list). I need help optimizing it (today it took 5k operations!) and fixing some issues I can’t seem to solve.
The first module gets the RSS feed from arXiv (works fine). This gives a list of articles, each with a number of authors. The second module creates a list of authors for each article (something I can later iterate on), and seems to work fine. The third module iterates over the bundles created by the “set variable”. This lets me run an instance of the remaining modules for each author, rather than for each article, and seems to work OK.
Now for the difficult part. For each author I need to check whether the last name appears in my list of last names in the data store, AND whether the first letter of the first name matches. This is needed because last names are usually given in full, while first names are sometimes only initials.
Currently I am trying to use a function that gets the last word in the author’s name. Previously I tried using get(split(57.Value; space),2)
and it works, but sometimes there are middle names, so I also need to check get(split(57.Value; space),3)
and get(split(57.Value; space),4)
which wastes A LOT of operations. So I am hoping the new function always gets me the last word, but I have used up my paid operations and cannot test it…
The first letter of the first name seems to work.
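To make the check I am after concrete, here is a rough sketch in Python rather than in the scenario's own formulas. It assumes the author name is a plain string with words separated by whitespace; the names `split_author`, `is_watched` and `WATCH_LIST` and the entries in the list are made up for illustration. The point is that the last word is always taken with a single index, however many middle names there are.

```python
def split_author(full_name: str) -> tuple[str, str]:
    """Return (first initial, last word) for a name like 'Maria J. Rossi'."""
    words = full_name.split()
    if not words:
        return "", ""
    # words[-1] is the last word no matter how many middle names precede it.
    return words[0][0].upper(), words[-1]


def is_watched(full_name: str, watch_list: set[tuple[str, str]]) -> bool:
    """True if (first initial, lower-cased last name) is in the watch list."""
    initial, last = split_author(full_name)
    return (initial, last.lower()) in watch_list


# Hypothetical watch list: (first initial, lower-cased last name).
WATCH_LIST = {("M", "rossi"), ("W", "gorecki")}
```

With this, "Maria J. Rossi" and "M. Rossi" both match the first entry, while "Paolo Rossi" does not, because the initial differs.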
After this check, it sends the resulting papers to a Slack channel for me to read.
Issues:
- I can’t waste operations checking for get(split(...),2 and 3 and 4), so I would like to use a single function to get the last name.
- The operations in the data store seem to be repeating (this morning it did ~600) and I don’t know why. I also got the same article repeated multiple times in Slack, so something is wrong (see below).
- Sometimes I get authors with accented letters, which never match exactly because arXiv reports them as, e.g., “Wojciech G'orecki”. I can’t think of a way to fix this, although it is a minor issue.
- I was hoping there would be a faster way to compare strings. I’d like this to run using fewer operations, as I “only” have 20k a month and need this every day.
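For the accented-letters issue, one idea is to normalize both the feed name and the stored name before comparing. Below is a minimal Python sketch of that idea, not something I have running in the scenario. It assumes accents arrive either as a stray mark before the letter (as in “G'orecki”) or as real Unicode accents; `normalize_name` is a made-up helper name.

```python
import re
import unicodedata

# Accent marks that may appear before a letter, optionally preceded by a
# backslash (assumption: this covers forms like "G'orecki").
_TEX_ACCENTS = re.compile(r"""\\?['`"^~]""")


def normalize_name(name: str) -> str:
    """Lower-case a name and drop accent marks so spelling variants compare equal."""
    without_marks = _TEX_ACCENTS.sub("", name)
    # Split accented characters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFKD", without_marks)
    base_letters = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base_letters.lower()
```

Because the same function would be applied to both sides, a name with a genuine apostrophe (such as O'Brien) still matches itself, even though the apostrophe is removed. Letters that do not decompose into a base letter plus accent (for example “ł”) are left unchanged by this sketch.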
The repeated messages on Slack make me think some iterations are wrong and wasting operations…
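What I would expect instead is one notification per article, no matter how many of its authors are on my list. As a sketch of that behaviour in Python (the field names `link` and `authors`, the function name and the sample data are all invented for illustration, and I am not sure the per-author iteration is really what causes the repeats):

```python
def matches_per_article(articles, watched_last_names):
    """Map each article link to the list of its authors that are on the watch list."""
    result = {}
    for article in articles:
        hits = [
            author
            for author in article["authors"]
            # Compare the last word of the name against the watched last names.
            if author.split() and author.split()[-1].lower() in watched_last_names
        ]
        if hits:
            # One entry (one notification) per article, however many authors match.
            result[article["link"]] = hits
    return result


# Hypothetical feed items.
sample_articles = [
    {"link": "https://example.org/paper-1", "authors": ["Maria Rossi", "Anna B. Bianchi"]},
    {"link": "https://example.org/paper-2", "authors": ["John Smith"]},
]
notifications = matches_per_article(sample_articles, {"rossi", "bianchi"})
```

Here `notifications` has a single entry for the first paper listing both matched authors, and nothing for the second.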
Thanks to anyone that can help!
Lorenzo