In regular cases it’s very useful to use regex (regular expressions) if you want to extract or replace data in some text. When your text data always follows a similar pattern, regex is ideal. But how can you use regex?
Setup
To develop your own regex, there are a few very helpful tools and processes that help you get started. These are my own personal recommendations; if you have anything to add, feel free to comment. Some of the things I use and do to make patterns:
Regex101 for pattern development, syntax help and debugging
Stack Overflow for plenty of questions and answers if you get stuck
Starting development
Before creating the regex module in Make, I always develop the pattern in Regex101 first. Once you’ve successfully created your pattern, you can copy it over to the Make module. Steps to take:
Make sure you set the Flavor in regex101 to “ECMAScript (JavaScript)”. This is the flavor Make uses.
When looking for patterns, use the “Quick reference” at the bottom right to search for commonly used patterns.
Start the pattern development. If a pattern gets complex, split it up and start with something simple first.
Once you have succeeded and are copying the pattern over to Make, make sure you group the part of the pattern you want to output with brackets. See the example below for more information.
An example
Let’s say I get some HTML data with a URL (href) and I want to extract the URL. It would look like this:
Now, as stated above, when you copy this over to Make you need to make sure the output you want to retrieve is grouped with brackets. The above pattern will output nothing, since the output I want is not within brackets (even though regex101 does show a match).
So the correct pattern would be:
(?<=href=\")(.+)(?=\")
And now regex101 will also show the match as a group.
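For anyone who wants to try the difference outside regex101, here is a minimal sketch in Python. Python's re module is only a stand-in for the ECMAScript flavor that Make uses (as far as I know this particular pattern behaves the same in both), and the sample HTML and variable names are made up for illustration. The ungrouped pattern is my assumption of what the first attempt looked like: the same pattern without the brackets.

```python
import re

# Hypothetical sample input; a tag with a single quoted attribute.
html = '<a href="https://example.com/page">Link</a>'

# Assumed first attempt: lookarounds only, no capture group.
ungrouped = re.search(r'(?<=href=").+(?=")', html)
# Pattern from the post: same lookarounds, wanted part in brackets.
grouped = re.search(r'(?<=href=")(.+)(?=")', html)

print(ungrouped.group(0))  # the full match is the URL
print(ungrouped.groups())  # but there are no capture groups at all
print(grouped.group(1))    # the URL, now available as group 1
```

The ungrouped search still matches, which mirrors regex101 showing output, but it exposes no groups, which is consistent with Make returning nothing. One thing to watch: .+ is greedy, so on a tag with more quoted attributes it runs to the last quote on the line; [^"]+ is a common way to stop at the first one.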
Thanks @Drivn I’m literally going to bookmark this post for future regex projects. I’ve used some of these tools before, but this is a great overall resource for building regex expressions!
In addition to the Text parser module that Bjorn pointed out, you can often just use the replace() function. There are cases, due to differences in the implementation (like the multiline flag not being allowed), where using the Text parser is basically required. But in all my scenarios I think I have used the Text parser module twice. Everywhere else where I want to use regex I just use good old replace().
Here is a little info on it from the documentation: String functions
Of course, you can use replace() to, well, replace with Regex. Most often, I use it as Bjorn used the Text parser module in his example, to extract data.
To use it this way requires a little change in perspective. Basically, to extract data, you need to MATCH the entire string, capturing (and replacing) only the part(s) you want.
To extract the same data as in Bjorn’s example using replace() you would get something like this:
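As a rough illustration of the idea, here is a sketch in Python, with re.sub standing in for Make's replace(). It is not the exact Make expression: the sample HTML and the pattern are hypothetical, and \1 is Python's spelling of the $1 back-reference. The pattern matches the whole string, and the replacement keeps only the captured group.

```python
import re

# Hypothetical sample input.
html = '<p>Read <a href="https://example.com/page">this</a> first.</p>'

# Match the ENTIRE string; capture only the URL.
# ^.* and .*$ swallow everything around the capture.
pattern = r'^.*href="([^"]+)".*$'

# Replacing the whole match with group 1 leaves just the URL.
url = re.sub(pattern, r'\1', html)
print(url)
```

If the pattern does not match at all, the input comes back unchanged, so getting the full original text back is usually a sign that nothing matched.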
The issue is that the capture, ([^\s\]]+), captures the ENTIRETY of the text field. That is, $1 shows ALL of the passed email’s 1.text field, not just the capture, even though the following text is clearly outside the capture parentheses. Any thoughts?
I don’t think the problem is that “([^\s\]]+)” is capturing everything. Rather, you are matching/capturing nothing, so you are just getting back the original string. My guess is that the single line flag does not work with replace().
Try replacing "/^.?\[fpath:([^\s\]]+)]?.+$/s" with "/^.*?\[fpath:([^\s\]]+)]([^a]|[a])*/".
This will accomplish basically the same thing without requiring the single line flag.
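To see why this helps, here is a small sketch in Python (again only a stand-in for Make's replace(), with a made-up email body). Without a single line flag, the dot does not match newlines, so a pattern that must reach the end of a multi-line text never matches and the text comes back untouched. The ([^a]|[a]) group matches any character, newlines included.

```python
import re

# Hypothetical multi-line email body with the tag on the first line.
text = "see [fpath:/docs/report.pdf] attached\nsecond line\nthird line"

# Without a single line (dotall) flag, "." stops at the first newline,
# so this anchored pattern cannot reach the end of the text.
dot_tail = r'^.*?\[fpath:([^\s\]]+)].*$'
# ([^a]|[a]) matches any character at all, newlines included.
any_tail = r'^.*?\[fpath:([^\s\]]+)]([^a]|[a])*'

no_match = re.sub(dot_tail, r'\1', text)   # nothing matched: text unchanged
extracted = re.sub(any_tail, r'\1', text)  # whole text replaced by group 1

print(no_match == text)
print(extracted)
```

Note that the leading .*? still cannot cross a newline, so this only works when the [fpath:...] tag sits on the first line of the text. [\s\S]* is another common way to write "any character, including newlines".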
We have combined HTTP, RegEx and Hash to keep track of changes on websites that have tables with the status of a series of licensing processes. If the hash of the captured portions of the HTTP response changes, we alert the user interested in that particular record and include the HTTP table with the updated information.
It was a breakthrough for us, since the service provider didn’t have any APIs.
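A minimal sketch of how such a check could fit together, written in Python rather than as a Make scenario. This is not the poster's actual setup: the function name table_fingerprint, the sample pages, the table pattern and the choice of SHA-256 are all assumptions for illustration, and the pages are hard-coded instead of fetched over HTTP.

```python
import hashlib
import re
from typing import Optional


def table_fingerprint(html: str) -> Optional[str]:
    """Return a SHA-256 hex digest of the first <table> in the page, or None."""
    # Capture only the table, so changes elsewhere on the page are ignored.
    match = re.search(r'<table\b.*?</table>', html, flags=re.S | re.I)
    if match is None:
        return None
    return hashlib.sha256(match.group(0).encode("utf-8")).hexdigest()


# Hypothetical snapshots; in practice these would be HTTP response bodies.
pending = "<table><tr><td>License 42</td><td>Pending</td></tr></table>"
approved = "<table><tr><td>License 42</td><td>Approved</td></tr></table>"
first_visit = "<html><p>Fetched 10:00</p>" + pending + "</html>"
second_visit = "<html><p>Fetched 11:00</p>" + pending + "</html>"
third_visit = "<html><p>Fetched 12:00</p>" + approved + "</html>"

stored = table_fingerprint(first_visit)
unchanged = table_fingerprint(second_visit) == stored  # timestamp differs, table does not
changed = table_fingerprint(third_visit) != stored     # status changed: time to alert

print(unchanged, changed)
```

Hashing only the captured portion is what keeps unrelated page changes, like a timestamp, from triggering an alert.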