Ah, regex. If you’ve made it this far into our series, you’re probably starting to see regex as less of a cryptic jumble of symbols and more like your superpower for text manipulation. 💥 Now that we’ve covered the basics, it’s time to put regex into action! From making sure email addresses are legit to protecting your website from sneaky hackers, regex has some incredibly practical (and cool!) uses.
In this post, we’ll dive into three real-world examples where regex truly shines: validating emails, protecting against XSS attacks, and parsing URLs like a pro.
CASE NO. 1: E-mail validation
Email validation is one of the most common uses of regular expressions, especially in form handling. Whether you’re creating a sign-up form, login form, or any system that requires user email input, ensuring that the email format is valid is crucial to prevent issues down the line.
Also, it seems to be a popular recruitment task for junior-level positions, so let’s dig in!
const emailPatternRegex = /^[a-zA-Z0-9._+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
const email = 'example@domain.com';
if (emailPatternRegex.test(email)) {
  console.log('Valid Email');
} else {
  console.log('Invalid Email');
}
Looks pretty complex, doesn’t it? Let’s break it down into small parts and digest them one by one.
/
the beginning of the regular expression
^
this sign means the beginning of a string (you do not want to match a string that could contain illegal characters beforehand, right?)
[a-zA-Z0-9._+-]+
this is a twofer:
- first, we have [some-regex-inside]+ which means the characters within the square brackets should occur at least once (the plus sign means “1 or more”).
- inside, we have a-zA-Z0-9._+- which means all lowercase letters [a-z], all uppercase letters [A-Z], all digits [0-9], as well as special characters that may occur in an email address: dot [.], underscore [_], plus sign [+], and hyphen [-]. Inside square brackets the dot is just a literal dot, and the hyphen is literal because it comes last.
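To see the character class and the plus sign working on their own, here is a minimal sketch. The name localPartPattern and the sample strings are made up for illustration, and the ^ and $ anchors are added so that the whole string has to consist of allowed characters.

```javascript
// Hypothetical pattern: only the local-part piece of the email regex,
// anchored on both sides so the whole string must be made of allowed characters.
const localPartPattern = /^[a-zA-Z0-9._+-]+$/;

const samples = ['john.doe', 'john+news', 'john doe', ''];

// Map each sample to true/false depending on whether it matches.
const localPartResults = samples.map((sample) => localPartPattern.test(sample));

console.log(localPartResults); // [ true, true, false, false ]
```

The space in 'john doe' is not in the brackets, and the empty string fails because + needs at least one character.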
@
this one is pretty self-explanatory – an email address needs the “at” character.
[a-zA-Z0-9.-]+
same as above, the domain may contain letters (lowercase or uppercase), digits, dots [.], and hyphens [-], and it needs at least one of those characters.
Note: This is a simple email validation regex that may not catch all invalid email addresses. Regex is always a compromise between complexity and effectiveness.
\.
This one is tricky when it comes to regexes. The dot [.] is a special character in regexes that means “any single character but a new-line character”. However, when creating a pattern for an email address we need it to contain an actual dot. That’s why we add the backslash [\.] before the dot.
The backslash is called an escape character because it lets the special character “escape” its destiny of being a special character and become a literal character instead.
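A quick way to convince yourself is to compare the two side by side. This is only a small sketch; the names anyCharacter and literalDot and the test strings are invented for the demo.

```javascript
// Unescaped dot: matches any single character except a line break.
const anyCharacter = /^a.c$/;
// Escaped dot: matches only a literal "." character.
const literalDot = /^a\.c$/;

console.log(anyCharacter.test('a.c')); // true
console.log(anyCharacter.test('abc')); // true
console.log(literalDot.test('a.c'));   // true
console.log(literalDot.test('abc'));   // false
```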
[a-zA-Z]{2,}
this is also a combo of two regex items:
- first, we have [a-zA-Z] in the TLD (top-level domain) part of the regex, which means any lowercase or uppercase letter.
- then, we have curly braces {2,} which means that whatever comes right before the braces (in our case the [a-zA-Z] character class) must occur at least 2 times, so the TLD must be at least 2 letters long.
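The pattern ends with $, which anchors the match to the end of the string, and a closing / that ends the regex literal. The rest of the code is fairly easy. We call the test() method and log a meaningful message based on the result. Easy peasy.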
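If you want to reuse the check, you could wrap it in a small function. The sketch below assumes a helper called isValidEmail (a hypothetical name, not something from a library) and a few made-up inputs that poke at the anchors and the TLD length.

```javascript
// Same pattern as in the post; isValidEmail is a hypothetical helper name.
const emailPattern = /^[a-zA-Z0-9._+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

function isValidEmail(candidate) {
  // Guard against non-string input before testing the pattern.
  return typeof candidate === 'string' && emailPattern.test(candidate);
}

console.log(isValidEmail('example@domain.com'));    // true
console.log(isValidEmail('example@domain.c'));      // false: TLD shorter than 2 letters
console.log(isValidEmail('example@domain'));        // false: no dot followed by a TLD
console.log(isValidEmail('hi example@domain.com')); // false: the space is not an allowed character
console.log(isValidEmail(null));                    // false: not a string
```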
Note: This regex would still match some incorrect email addresses, such as:
john.doe.@email.com
john..doe@email.com
.john.doe@email.com
john--@email.com
john__@email.com
__+-.@email.com
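You can check this yourself by running the list through the pattern. A minimal sketch, where emailRegex, questionableEmails, and allAccepted are names chosen just for this demo:

```javascript
// Same pattern as in the post.
const emailRegex = /^[a-zA-Z0-9._+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

const questionableEmails = [
  'john.doe.@email.com',
  'john..doe@email.com',
  '.john.doe@email.com',
  'john--@email.com',
  'john__@email.com',
  '__+-.@email.com',
];

// Every address in the list passes, because the local-part character class
// accepts dots, hyphens, underscores, and plus signs in any order and quantity.
const allAccepted = questionableEmails.every((address) => emailRegex.test(address));

console.log(allAccepted); // true
```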
It’s important to understand that no regex is perfect. If you try to create a fool-proof regex that will match 100% of your needs and none of the invalid options, you will essentially create something huge, incredibly hard to read and debug, and prone to error.
Sometimes, it’s easier to create two separate regexes instead of one monster-regex to catch ’em all! 😉
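Here is one way the two-regex idea could look. Treat it as a sketch: the second pattern, dotProblems, is my own assumption about what you might want to reject, and isStricterEmail is a hypothetical helper name. It only deals with misplaced dots, so it is still far from a complete validator.

```javascript
// First regex: the general shape from the post.
const basicEmailShape = /^[a-zA-Z0-9._+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
// Second regex (an assumption, not from the post): flags an address that
// starts with a dot, has a dot right before the @, or contains two dots
// in a row anywhere.
const dotProblems = /^\.|\.\.|\.@/;

function isStricterEmail(candidate) {
  // Accept only if the shape is right AND no dot problem is found.
  return basicEmailShape.test(candidate) && !dotProblems.test(candidate);
}

console.log(isStricterEmail('john.doe@email.com'));  // true
console.log(isStricterEmail('john.doe.@email.com')); // false
console.log(isStricterEmail('john..doe@email.com')); // false
console.log(isStricterEmail('.john.doe@email.com')); // false
console.log(isStricterEmail('john--@email.com'));    // true: this check only looks at dots
```

Each regex stays short enough to read at a glance, and you can add or drop the second check without touching the first.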
That’s it for today.
Don’t forget to catch the next posts in the series. We’re still going to talk about using regexes to protect ourselves from XSS attacks and parsing URL addresses.
/*
Until next time,
stay positive (like unsigned integer) and keep coding!
eMs
*/
