How to use Regex in Make?

Here’s probably the best email address validator I have come across.

The regex below is borrowed from chapter 4 of Jan Goyvaerts’s excellent book, Regular Expressions Cookbook. What you really want is an expression that works with 999 email addresses out of a thousand and that doesn’t require a lot of maintenance, for instance by forcing you to add new top-level domains (“dot something”) every time the powers in charge of those things decide it’s time to launch names ending in something like .phone or .dog.

Email address regex:

(?i)\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,63}\b

Let’s unroll this one:

(?i) # Turn on case-insensitive mode

\b # Position engine at a word boundary

[A-Z0-9._%+-]+ # Match one or more of the characters between brackets: letters, numbers, dot, underscore, percent, plus, minus. Yes, some of these are rare in an email address.

@ # Match @

(?:[A-Z0-9-]+\.)+ # Match one or more strings followed by a dot, such strings being made of letters, numbers and hyphens. These are the domains and sub-domains, such as post. and microsoft. in post.microsoft.com

[A-Z]{2,63} # Match two to 63 letters, for instance US, COM, INFO. This is meant to be the top-level domain. Yes, this also matches DOG. You have to decide if you want to achieve razor precision, at the cost of needing to maintain your regex when new TLDs are introduced. 63 letters is the maximum length of a TLD (the DNS limit for a single label), although you rarely find any longer than 10 characters.

\b # Match a word boundary
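To try the breakdown above outside Make, here is a minimal sketch in Python. Python's `re` module is only a stand-in here, not the engine Make uses, and the name `EMAIL_RE` and the sample strings are made up for illustration. Verbose mode lets the pattern carry the same per-token comments.

```python
import re

# Same pattern as above, written in verbose mode so each token keeps its comment.
# EMAIL_RE is a hypothetical name chosen for this sketch.
EMAIL_RE = re.compile(
    r"""
    \b                   # start at a word boundary
    [A-Z0-9._%+-]+       # local part: letters, digits, . _ % + -
    @                    # literal @
    (?:[A-Z0-9-]+\.)+    # one or more domain labels, each followed by a dot
    [A-Z]{2,63}          # top-level domain: 2 to 63 letters
    \b                   # end at a word boundary
    """,
    re.IGNORECASE | re.VERBOSE,  # replaces the inline (?i) modifier
)


def find_email(text):
    """Return the first email-like substring in text, or None if there is none."""
    match = EMAIL_RE.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(find_email("Contact: jane.doe@post.microsoft.com today"))
    print(find_email("rex@kennel.dog"))
    print(find_email("user@localhost"))  # no dot after the @, so no match
```

Note that an address without a dotted domain, such as user@localhost, is rejected, which is one of the trade-offs of this pattern.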

Note: in Make, the (?i) modifier is implemented as an option in the Text parser module, so it must be removed from the regular expression pattern.
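The same idea, keeping case-insensitivity as a separate option rather than inside the pattern, can be illustrated in Python. This is only an analogy, assuming Python's `re.IGNORECASE` flag plays the role of the module option; the `PATTERN` name and the sample address are hypothetical.

```python
import re

# The pattern with the (?i) prefix removed, as you would paste it into the pattern field.
PATTERN = r"\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,63}\b"


def matches(text, case_insensitive):
    """Return the first match in text, or None, with case-insensitivity as an option."""
    flags = re.IGNORECASE if case_insensitive else 0
    found = re.search(PATTERN, text, flags)
    return found.group(0) if found else None


if __name__ == "__main__":
    # Option off: the pattern only knows upper-case letters, so this fails.
    print(matches("jane@example.com", case_insensitive=False))
    # Option on: same pattern, now matches lower-case input.
    print(matches("jane@example.com", case_insensitive=True))
```

If the option is left off, the pattern as written only matches upper-case addresses, so forgetting to enable it is an easy way to get zero matches.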
