How to Use Regex — A Beginner's Guide to Regular Expressions
At first glance, a regular expression looks like someone fell asleep on the keyboard: ^[\w.-]+@[\w.-]+\.[a-zA-Z]{2,}$. But behind the noise, regex is a precise and enormously powerful tool for searching, validating and transforming text. Once you understand the basic building blocks, it clicks quickly.
What Is a Regular Expression?
A regular expression (regex) is a pattern that describes a set of strings. It is used to search for matches within text, validate input (like email addresses or phone numbers), extract specific parts of a string, or perform find-and-replace operations. Regex is supported in virtually every programming language and in many text editors.
🔬 Practice as you read: Open Toolify's Regex Tester in a second tab and test each pattern from this guide as you go through it.
The Basics
Literal characters
The simplest regex is just plain text. The pattern cat matches the string "cat" anywhere in the input, so it finds a match in "caterpillar", "concatenate", and "black cat". Matching is case-sensitive by default, so cat does not match "Cat".
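As a minimal sketch of literal matching, assuming Python and its standard re module (the article itself is not tied to any language), the snippet below searches a few sample strings for the pattern cat. The variable names are illustrative only.

```python
import re

# A literal pattern: the characters c, a, t in that order.
literal = re.compile(r"cat")

samples = ["caterpillar", "concatenate", "black cat", "dog", "Cat"]

# search() looks for the pattern anywhere in the string.
found = {text: literal.search(text) is not None for text in samples}

for text, hit in found.items():
    print(f"{text!r}: {'match' if hit else 'no match'}")
```

Note that "Cat" does not match here, because matching is case-sensitive unless a flag says otherwise.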
The dot — any character
A dot . matches any single character except a newline. So c.t matches "cat", "cut", "c3t", and even "c t".
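A short sketch of the dot in action, again assuming Python's re module. It uses fullmatch so that the whole string has to fit the pattern c.t, which makes the "exactly one character" rule easy to see.

```python
import re

dot_pattern = re.compile(r"c.t")

samples = ["cat", "cut", "c3t", "c t", "ct", "c\nt"]

# fullmatch() succeeds only if the entire string matches the pattern.
dot_results = {s: dot_pattern.fullmatch(s) is not None for s in samples}

for s, ok in dot_results.items():
    print(f"{s!r}: {ok}")
```

"ct" fails because the dot needs one character to consume, and "c\nt" fails because the dot does not match a newline by default.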
Character classes
Square brackets define a character class — a set of characters, any one of which can match. [aeiou] matches any vowel. [a-z] matches any lowercase letter. [0-9] matches any digit. Add a caret inside the brackets to negate: [^aeiou] matches any character that is NOT a vowel.
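The classes above can be tried out with a few lines of Python (an assumed choice of language; the sample strings are made up for illustration). findall returns every character that the class matches.

```python
import re

vowels = re.findall(r"[aeiou]", "regex")          # ['e', 'e']
non_vowels = re.findall(r"[^aeiou]", "regex 1")   # ['r', 'g', 'x', ' ', '1']
lowercase = re.findall(r"[a-z]", "aB3c")          # ['a', 'c']
digits = re.findall(r"[0-9]", "aB3c")             # ['3']

print(vowels, non_vowels, lowercase, digits)
```

The negated class [^aeiou] matches anything that is not a lowercase vowel, which includes consonants but also spaces, digits and punctuation.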
Shorthand Character Classes
| Pattern | Matches | Equivalent to |
|---|---|---|
| \d | Any digit | [0-9] |
| \D | Any non-digit | [^0-9] |
| \w | Word character | [a-zA-Z0-9_] |
| \W | Non-word character | [^a-zA-Z0-9_] |
| \s | Whitespace | Space, tab, newline |
| \S | Non-whitespace | Everything else |
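
Here is a small sketch of the shorthand classes, assuming Python. One caveat worth knowing: for text strings, Python's \d, \w and \s are Unicode-aware by default, so the "Equivalent to" column holds exactly only when the re.ASCII flag is set. The sample text is invented for the example.

```python
import re

text = "Order #42 shipped on 2024-05-01"

digit_chars = re.findall(r"\d", text)   # every single digit
words = re.findall(r"\w+", text)        # runs of word characters
tokens = re.split(r"\s+", text)         # split on whitespace

# By default \w also matches accented letters; re.ASCII restricts it
# to [a-zA-Z0-9_] as shown in the table.
unicode_words = re.findall(r"\w+", "caf\u00e9")
ascii_words = re.findall(r"\w+", "caf\u00e9", flags=re.ASCII)

print(digit_chars, words, tokens, unicode_words, ascii_words)
```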
Quantifiers — How Many Times?
Quantifiers control how many times the preceding element must appear:
| Quantifier | Meaning |
|---|---|
| * | Zero or more times |
| + | One or more times |
| ? | Zero or one time (optional) |
| {3} | Exactly 3 times |
| {2,5} | Between 2 and 5 times |
| {2,} | At least 2 times |
So \d+ matches one or more digits, \w{3} matches exactly three word characters, and colou?r matches both "color" and "colour".
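The quantifier examples can be checked with a quick Python sketch (Python is an assumption here, and the input strings are illustrative).

```python
import re

numbers = re.findall(r"\d+", "a1 b22 c333")         # ['1', '22', '333']
triples = re.findall(r"\w{3}", "abcdefg")           # ['abc', 'def']
ranged = re.findall(r"\d{2,5}", "1 22 333333")      # ['22', '33333']

# The u is optional, so both spellings fit the pattern.
spellings = {
    word: re.fullmatch(r"colou?r", word) is not None
    for word in ["color", "colour", "colouur"]
}

print(numbers, triples, ranged, spellings)
```

In the {2,5} case, the single digit is skipped because it is too short, and the run of six digits yields one five-digit match; the leftover digit cannot form a match on its own.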
Anchors — Where in the String?
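^ matches the start of a string (or of each line in multiline mode). $ matches the end of the string (or of each line in multiline mode). \b matches a word boundary: the position between a word character and a non-word character, or the start or end of the string next to a word character.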
So ^hello only matches strings that begin with "hello", and world$ only matches strings that end with "world".
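A brief sketch of anchors, assuming Python's re module and made-up sample strings:

```python
import re

starts_ok = re.search(r"^hello", "hello world") is not None   # True
starts_bad = re.search(r"^hello", "say hello") is not None    # False

ends_ok = re.search(r"world$", "hello world") is not None     # True
ends_bad = re.search(r"world$", "world peace") is not None    # False

# \b keeps "cat" from matching inside a longer word.
whole_words = re.findall(r"\bcat\b", "cat concatenate black cat")

print(starts_ok, starts_bad, ends_ok, ends_bad, whole_words)
```

One Python-specific detail: $ also matches just before a single trailing newline. When that matters, re.fullmatch is a stricter way to require that the whole string fits.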
Groups and Alternation
Parentheses create a capturing group. (cat|dog) matches either "cat" or "dog" — the pipe | means "or". Groups are also used to apply quantifiers to a sequence: (ab)+ matches "ab", "abab", "ababab", and so on.
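To illustrate groups and alternation, here is a minimal Python sketch (the language and the sample sentence are assumptions for the example). group(1) returns the text captured by the first pair of parentheses.

```python
import re

sentence = "I have a dog and a cat"

first_pet = re.search(r"(cat|dog)", sentence).group(1)   # 'dog'
all_pets = re.findall(r"(cat|dog)", sentence)            # ['dog', 'cat']

# A quantifier after a group repeats the whole sequence "ab".
repeated = re.fullmatch(r"(ab)+", "ababab")
not_repeated = re.fullmatch(r"(ab)+", "aba")

print(first_pet, all_pets, repeated is not None, not_repeated is None)
```

When a group is repeated, it keeps only the last repetition it captured, so repeated.group(1) is "ab" rather than the full "ababab".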
A Real-World Example: Email Validation
Let us build a basic email validation pattern step by step:
- One or more word characters, dots or hyphens before the @: [\w.-]+
- The @ sign: @
- The domain name: [\w.-]+
- A dot: \. (escaped because a bare dot means "any character")
- Two or more letters for the TLD: [a-zA-Z]{2,}
- Anchored to the full string: ^ and $
^[\w.-]+@[\w.-]+\.[a-zA-Z]{2,}$
This matches "user@example.com" and "first.last@company.co.uk", and rejects "notanemail" or "missing@tld".
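The pattern can be wrapped in a small helper, sketched here in Python as one possible approach. The names EMAIL_RE and looks_like_email are hypothetical, chosen only for this example. It uses fullmatch so that the entire input has to fit the pattern.

```python
import re

# The basic pattern from this guide. It is a quick plausibility check,
# not a full validator for every legal email address.
EMAIL_RE = re.compile(r"^[\w.-]+@[\w.-]+\.[a-zA-Z]{2,}$")


def looks_like_email(candidate: str) -> bool:
    """Return True if the whole string fits the basic email pattern."""
    return EMAIL_RE.fullmatch(candidate) is not None


for address in ["user@example.com", "first.last@company.co.uk",
                "notanemail", "missing@tld"]:
    print(f"{address!r}: {looks_like_email(address)}")
```

The first two addresses pass and the last two are rejected, matching the behaviour described above.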
Flags
Flags modify how the pattern is applied:
- g (global): find all matches, not just the first
- i (case-insensitive): cat also matches "Cat" and "CAT"
- m (multiline): ^ and $ match the start/end of each line
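
The single-letter flags are how many regex testers and some languages write them. As a rough sketch of the same ideas in Python (an assumed choice of language): Python's re module has no g flag, so re.search returns the first match while re.findall returns all of them, and the other two flags are spelled re.IGNORECASE and re.MULTILINE.

```python
import re

text = "Cat\ncat\nCAT scan"

# No "global" behaviour: search() stops at the first match.
first_only = re.search(r"cat", text).group()                      # 'cat'

# "Global" plus case-insensitive: findall() with IGNORECASE.
all_cats = re.findall(r"cat", text, flags=re.IGNORECASE)

# Without MULTILINE, ^ matches only at the very start of the text.
start_of_text = re.findall(r"^cat", text, flags=re.IGNORECASE)

# With MULTILINE, ^ matches at the start of every line.
start_of_lines = re.findall(r"^cat", text,
                            flags=re.IGNORECASE | re.MULTILINE)

print(first_only, all_cats, start_of_text, start_of_lines)
```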
Common Pitfalls
- Forgetting to escape special characters: in a regex, . * + ? ( ) [ { \ ^ $ | all have special meaning. If you want to match a literal dot, write \.
- Greedy matching: by default, quantifiers are greedy, meaning they match as much as possible. Add ? after a quantifier to make it lazy: .*? matches as little as possible.
- Overcomplicating it: a regex that is perfectly accurate for all edge cases is often not worth the complexity. For critical validation, combine regex with other checks.
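
The first two pitfalls are easy to reproduce. The sketch below assumes Python and uses invented sample strings; re.escape is a standard-library helper that escapes special characters for you.

```python
import re

# Pitfall 1: an unescaped dot matches any character.
unescaped = re.findall(r"3.14", "3.14 3x14")    # ['3.14', '3x14']
escaped = re.findall(r"3\.14", "3.14 3x14")     # ['3.14']

# re.escape() builds a pattern that matches the text literally.
literal_pattern = re.escape("a.b")
literal_ok = re.fullmatch(literal_pattern, "a.b") is not None
literal_bad = re.fullmatch(literal_pattern, "axb") is not None

# Pitfall 2: greedy versus lazy quantifiers.
html = "<b>bold</b> and <i>italic</i>"
greedy = re.findall(r"<.*>", html)    # one match spanning the whole string
lazy = re.findall(r"<.*?>", html)     # ['<b>', '</b>', '<i>', '</i>']

print(unescaped, escaped, literal_ok, literal_bad, greedy, lazy)
```

The greedy version runs from the first < to the last >, while the lazy version stops at the nearest > each time.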
Next Steps
The best way to learn regex is to use it. Open the Regex Tester, pick a real-world problem — extracting dates from text, finding UK postcodes, validating phone numbers — and build your pattern one piece at a time. The built-in cheat sheet covers all the key syntax at a glance.