How to use Regex in Make?

Regex (regular expressions) is very useful when you want to extract or replace data in some text. Whenever your text data follows a consistent pattern, regex is an ideal tool. But how do you use regex in Make?

Setup
To develop your own regex, there are a few very helpful tools and habits that will get you started. These are my personal recommendations; if you have anything to add, feel free to comment. Some of the things I use to build patterns:

  1. Regex101 for pattern development, syntax help and debugging
  2. Stack Overflow for plenty of questions and answers if you get stuck
  3. A Make account and a good dose of perseverance

Starting development
Before creating the regex module in Make, I always develop the pattern in Regex101 first. Once you’ve successfully created your pattern, you can copy it over to the Make module. Steps to take:

  1. Make sure you set the Flavor in regex101 to “ECMAScript (JavaScript)”. This is the flavor Make uses.

  2. When looking for patterns, use the “Quick reference” in the bottom right to look up commonly used patterns.

  3. Start the pattern development. If a pattern gets complex, split it up and start with something simple first.

  4. Once you have succeeded and are copying the pattern over to Make, make sure you wrap the part you want to output in parentheses (a capture group). See the example below for more information.

An example
Let’s say I get some HTML data containing a URL (href) and I want to extract the URL. It would look like this:

Test string
<a href="www.google.com">Google</a>

Pattern
(?<=href=\").+(?=\")

Output
www.google.com

Now, as stated above, when you copy this over to Make you need to make sure the output you want to retrieve is wrapped in parentheses. The pattern above will return an empty output in Make, since the part I want is not inside a capture group (even though regex101 does show a match).
So the correct pattern would be:

(?<=href=\")(.+)(?=\")

In regex101 you will now also see the captured text listed as Group 1.

Within Make, you can now use this pattern in the Regex module and get the data you want to extract.
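If you want to check the difference between the two patterns outside regex101, here is a minimal TypeScript sketch. It assumes that plain JavaScript regex behaviour is a fair stand-in for the ECMAScript flavor mentioned above; it does not call Make itself, and the variable names are only illustrative.

```typescript
// Test string from the example above.
const anchorHtml = '<a href="www.google.com">Google</a>';

// Built with the RegExp constructor, so the \" escapes are not needed here.
const withoutGroup = new RegExp('(?<=href=").+(?=")');
const withGroup = new RegExp('(?<=href=")(.+)(?=")');

const plainMatch = anchorHtml.match(withoutGroup);
const groupedMatch = anchorHtml.match(withGroup);

// Both patterns match the same text overall...
const fullMatch = plainMatch ? plainMatch[0] : null;
// ...but only the pattern with parentheses fills capture group 1.
const groupFromPlain = plainMatch && plainMatch.length > 1 ? plainMatch[1] : null;
const groupFromGrouped = groupedMatch ? groupedMatch[1] : null;

console.log(fullMatch, groupFromPlain, groupFromGrouped);
```

The first pattern produces a full match but no group 1, while the second returns www.google.com as group 1. That is in line with the observation that Make needs the wanted part inside parentheses.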

Happy integrating!

If you have any questions, feel free to place a comment below.
~Bjorn

Wowzers, thanks so much for the neat tutorial, Bjorn :muscle: Seasoned Make users swear by the usefulness of regex, and your post only further underlines it!

Thanks @Drivn I’m literally going to bookmark this post for future regex projects. I’ve used some of these tools before, but this is a great overall resource for building regex expressions!

Thanks for the post!

I agree. Regular Expressions can be VERY useful!

In addition to the Text parser module that Bjorn pointed out, you can often just use the replace() function. There are cases, due to differences in the implementation (like the multiline flag not being allowed), where using the Text parser is basically required. But across all my scenarios I think I have used the Text parser module twice. Everywhere else where I want to use Regex I just use good old replace().

Here is a little info on it from the documentation: String functions

Of course, you can use replace() to, well, replace with Regex. Most often, I use it as Bjorn used the Text parser module in his example, to extract data.

To use it this way requires a little change in perspective. Basically, to extract data, you need to MATCH the entire string, capturing (and replacing) only the part(s) you want.

To extract the same data as in Bjorn’s example using replace() you would get something like this:

replace(<a href="www.google.com">Google</a>;/.*href=\"(.+)\".*/; $1)

Some notes:

  1. In this case the global flag is not required.
  2. The multiline flag is not allowed in replace() (nor needed here)
  3. “$1” refers to the first capture group, the “(.+)” in the middle of the search string.
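
To try the same idea outside Make, here is a minimal TypeScript sketch. It uses JavaScript’s own String.prototype.replace as a stand-in for Make’s replace(), which is an assumption; Make may differ in details, and the variable names are made up for illustration.

```typescript
// Same test string as in Bjorn's example.
const linkHtml = '<a href="www.google.com">Google</a>';

// The pattern matches the ENTIRE string and captures only the href value.
const hrefPattern = /.*href="(.+)".*/;

// Replacing the whole match with group 1 leaves just the captured part.
const extracted = linkHtml.replace(hrefPattern, "$1");

// When the pattern does not match, replace returns the input unchanged.
const unchanged = "no link in this text".replace(hrefPattern, "$1");

console.log(extracted);
console.log(unchanged);
```

The second call is worth keeping in mind: getting the original text back usually means the pattern did not match at all.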

Jim

Thanks for that, @Drivn. I noted you are a Make regex flavor user, and hope you can shed some light on my issue.

I have a scenario - mailhook => iterator => awsS3 which:

  1. Accepts an email;
  2. if it has attachments, forward it to the iterator;
  3. if there is a particular “key:value” pattern in the email text, use the “value” as the folder variable in the S3 Put function.

All works, except parsing out the “value” portion in the S3 module.

The pattern I use in the email if I want the folder changed is: [fpath:somefolder/anotherfolder…]. I put it as the first line in a forwarded or direct email.

The formula I use in the folder variable of the S3 module is:

{{if(indexOf("[fpath:"; "!=-1"); replace(1.text; "/^.?\[fpath:([^\s\]]+)]?.+$/s"; "$1"); "")}}

(with or without the quoted text)

The issue is that the capture - ([^\s\]]+) captures the ENTIRETY of the text field - that is, $1 shows ALL of the email’s passed 1.text field, not just the capture, even though the following text is clearly outside the capture parentheses. Any thoughts?

…also tried:

if(indexOf("[fpath:"; "!=-1"); replace(1.text; "/^.?\[fpath:([^\s\]]+)]\n"; "$1"); "")

with and without an ungreedy ? after the \n.

@bullit

I don’t think the problem is that “([^\s\]]+)” is capturing everything. Rather, you are matching/capturing nothing, so you are just getting back the original string. My guess is that the single-line flag (s) does not work with replace().

Try replacing "/^.?\[fpath:([^\s\]]+)]?.+$/s" with "/^.*?\[fpath:([^\s\]]+)]([^a]|[a])*/".

This will accomplish basically the same thing without requiring the single line flag.
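
A minimal TypeScript sketch of the difference, using JavaScript’s replace as a stand-in for Make’s replace() (an assumption, since Make’s behaviour may differ). The email text is hypothetical, with the folder tag on the first line as described.

```typescript
// Hypothetical email body with the folder tag on the first line.
const emailText = "[fpath:somefolder/anotherfolder]\nHello,\nthe files are attached.";

// Original pattern WITHOUT the s flag: "." cannot cross the line breaks,
// so ".+$" never reaches the end of the text and nothing matches.
const withoutDotAll = /^.?\[fpath:([^\s\]]+)]?.+$/;

// Same pattern WITH the s flag, for comparison.
const withDotAll = new RegExp(withoutDotAll.source, "s");

// Suggested pattern: ([^a]|[a])* matches any character, line breaks included.
const suggested = /^.*?\[fpath:([^\s\]]+)]([^a]|[a])*/;

const noMatchResult = emailText.replace(withoutDotAll, "$1");
const dotAllResult = emailText.replace(withDotAll, "$1");
const suggestedResult = emailText.replace(suggested, "$1");

console.log(noMatchResult === emailText); // true: the input comes back unchanged
console.log(dotAllResult);
console.log(suggestedResult);
```

Without the flag, the whole email text comes back, which looks like the capture swallowed everything. With either the s flag or the ([^a]|[a])* workaround, only somefolder/anotherfolder is returned.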


Jim - The Monday Man (YouTube Channel)

Brilliant. Thank you!

Thank you for the amazing tips!

We have combined HTTP, RegEx and Hash to keep track of changes in websites that have tables with the status of a series of licensing processes. If the hash of the captured portion of the HTTP response changes, we alert the user interested in that particular record and include the table from the HTTP response with the updated information.

It was a breakthrough for us since the service provider didn’t have any APIs.
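
For readers who want to picture that flow, here is a rough TypeScript sketch of the idea, not our actual scenario. Everything in it is an assumption for illustration: the function names, the sample pages, the table regex, and the small djb2-style hash, which stands in for whatever hash function you would really use.

```typescript
// djb2-style 32-bit hash over UTF-16 code units; a simple stand-in for a real hash.
function simpleHash(text: string): string {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return hash.toString(16);
}

// Capture the first <table>...</table> block; [\s\S] also matches line breaks.
function extractTable(pageHtml: string): string | null {
  const match = pageHtml.match(/<table[\s\S]*?<\/table>/i);
  return match ? match[0] : null;
}

// True when the captured table no longer matches the previously stored hash.
function tableChanged(pageHtml: string, previousHash: string | null): boolean {
  const table = extractTable(pageHtml);
  if (table === null) return false; // nothing captured, nothing to compare
  return simpleHash(table) !== previousHash;
}

// Hypothetical page snapshots from two consecutive runs.
const pageBefore = "<html><table><tr><td>Pending</td></tr></table></html>";
const pageAfter = "<html><table><tr><td>Approved</td></tr></table></html>";

const storedHash = simpleHash(extractTable(pageBefore) ?? "");
console.log(tableChanged(pageBefore, storedHash), tableChanged(pageAfter, storedHash));
```

Hashing only the captured table, rather than the whole page, keeps unrelated page changes from triggering an alert.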
