Regular expressions (regex) are one of the most powerful tools in a developer’s arsenal, and one of the most feared. This cheat sheet covers the core syntax, from basic matching to lookarounds and flags, with practical examples you can use right now.

Try it live: Use the Genbox Regex Tester to test any pattern from this guide directly in your browser.

Basic Syntax

At its core, a regex is a sequence of characters that defines a search pattern. Every literal character in a regex matches itself:

  • Pattern cat matches the string “cat” anywhere in the text
  • Pattern 123 matches the literal string “123”

Special characters (called metacharacters) have special meaning and must be escaped with a backslash \ if you want to match them literally:

. ^ $ * + ? { } [ ] \ | ( )

Pattern   Matches
\.        A literal period
\$        A literal dollar sign
\*        A literal asterisk
\\        A literal backslash
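
When the text to match comes from user input, escaping by hand is easy to get wrong. As a minimal sketch in Python (the sample strings are made up for illustration), the standard library's `re.escape` adds the backslashes for you:

```python
import re

# Hypothetical input containing metacharacters that should match literally.
price = "$4.99 (sale)"

# re.escape backslash-escapes every character that could be special.
escaped = re.escape(price)
match = re.search(escaped, "Today only: $4.99 (sale) on all items")
print(match.group(0))  # $4.99 (sale)

# Without escaping, "." is a wildcard, so "4.99" also matches "4x99".
loose = re.search("4.99", "4x99") is not None     # True
strict = re.search(r"4\.99", "4x99") is not None  # False
print(loose, strict)
```
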

Character Classes

Character classes let you match any one character from a set.

Built-in Shorthand Classes

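Shorthand   Equivalent        Matches
\d          [0-9]             Any digit
\D          [^0-9]            Any non-digit
\w          [a-zA-Z0-9_]      Word character (letter, digit, or underscore)
\W          [^a-zA-Z0-9_]     Non-word character
\s          [ \t\r\n\f\v]     Whitespace (space, tab, newline, etc.)
\S          [^ \t\r\n\f\v]    Non-whitespace
.           (any except \n)   Any character except newline

The equivalents shown are the ASCII definitions. In Python 3, \d, \w, and \s also match Unicode digits, letters, and whitespace unless you pass re.ASCII.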

Custom Character Classes

Use square brackets [ ] to define a custom set:

Pattern    Matches
[aeiou]    Any lowercase vowel
[A-Z]      Any uppercase letter
[0-9a-f]   Any hex digit (lowercase)
[^abc]     Any character except a, b, or c
[a-zA-Z]   Any letter

Note: Most special characters lose their special meaning inside [ ]. The only special characters inside a character class are ], \, ^ (at start), and - (between chars).
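
The classes above can be tried out with a few lines of Python. This is a small sketch with made-up sample strings; the last two lines illustrate that Python 3 treats `\w` as Unicode-aware unless `re.ASCII` is passed:

```python
import re

# \d matches one digit at a time.
digits = re.findall(r"\d", "a1b22")                # ['1', '2', '2']

# A custom class with ranges: runs of lowercase hex digits.
hex_runs = re.findall(r"[0-9a-f]+", "A7f-09")      # ['7f', '09']

# A negated class: delete everything that is not a vowel or a space.
vowels_only = re.sub(r"[^aeiou ]", "", "regular expressions")  # 'eua eeio'

# \w is Unicode-aware by default in Python 3; re.ASCII restricts it.
unicode_words = re.findall(r"\w+", "Zürich 2026")            # ['Zürich', '2026']
ascii_words = re.findall(r"\w+", "Zürich 2026", re.ASCII)    # ['Z', 'rich', '2026']

print(digits, hex_runs, vowels_only, unicode_words, ascii_words)
```
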


Quantifiers

Quantifiers specify how many times the preceding element should match.

Quantifier   Meaning                 Example
*            0 or more               a* matches "", "a", "aa", "aaa"
+            1 or more               a+ matches "a", "aa", "aaa"
?            0 or 1                  colou?r matches "color" and "colour"
{n}          Exactly n times         \d{4} matches exactly 4 digits
{n,}         n or more times         \d{2,} matches 2 or more digits
{n,m}        Between n and m times   \d{2,4} matches 2–4 digits
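
To see the quantifiers in action, here is a short Python sketch (the input strings are illustrative only). Note how `{2,4}` takes as many digits as it can, up to four, and leaves the rest:

```python
import re

# ? makes the preceding "u" optional.
spellings = re.findall(r"colou?r", "color colour colr")
# ['color', 'colour']

# {4} demands exactly four digits; fullmatch checks the whole string.
is_year = re.fullmatch(r"\d{4}", "2026") is not None    # True
too_long = re.fullmatch(r"\d{4}", "20260") is not None  # False

# {2,4} matches runs of two to four digits.
chunks = re.findall(r"\d{2,4}", "1 22 333 4444 55555")
# ['22', '333', '4444', '5555']

print(spellings, is_year, too_long, chunks)
```
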

Greedy vs Lazy

By default, quantifiers are greedy — they match as much as possible. Add ? to make them lazy (match as little as possible):

Input:   <b>bold</b> and <i>italic</i>
Greedy:  <.+>   → matches "<b>bold</b> and <i>italic</i>"
Lazy:    <.+?>  → matches "<b>", "</b>", "<i>", "</i>"

Greedy   Lazy     Behavior
*        *?       0 or more, lazy
+        +?       1 or more, lazy
?        ??       0 or 1, lazy
{n,m}    {n,m}?   n to m times, lazy
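
The greedy and lazy behavior can be reproduced in Python with the same HTML input. The third pattern, a negated character class, is an addition not covered above; it is a common alternative that gives the same result here without relying on laziness:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" in the string.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" it can.
lazy = re.findall(r"<.+?>", html)

# Alternative: forbid ">" inside the tag instead of using laziness.
negated = re.findall(r"<[^>]+>", html)

print(greedy)   # ['<b>bold</b> and <i>italic</i>']
print(lazy)     # ['<b>', '</b>', '<i>', '</i>']
print(negated)  # ['<b>', '</b>', '<i>', '</i>']
```
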

Anchors

Anchors match a position in the string, not a character.

Anchor   Matches
^        Start of string (or start of line in multiline mode)
$        End of string (or end of line in multiline mode)
\b       Word boundary (between \w and \W)
\B       Non-word boundary
\A       Start of string (Python/Ruby; not supported in JS)
\Z       End of string (Python/Ruby; not supported in JS)

Examples:

^\d+$         → Entire string is digits only
\bcat\b       → The word "cat" (not "catch" or "concatenate")
^Hello        → String starts with "Hello"
world$        → String ends with "world"
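
A quick Python sketch of these anchors, using invented sample strings. The last two lines show how `re.MULTILINE` changes what `^` means:

```python
import re

# ^...$ forces the entire string to consist of digits.
all_digits = re.search(r"^\d+$", "12345") is not None  # True
mixed = re.search(r"^\d+$", "123a5") is not None       # False

# \b keeps "cat" from matching inside longer words.
cats = re.findall(r"\bcat\b", "cat catch concatenate cat.")
# ['cat', 'cat']

# Without MULTILINE, ^ matches only at the very start of the string.
lines = "alpha\nbeta\ngamma"
first_only = re.findall(r"^\w+", lines)                # ['alpha']
every_line = re.findall(r"^\w+", lines, re.MULTILINE)  # ['alpha', 'beta', 'gamma']

print(all_digits, mixed, cats, first_only, every_line)
```
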

Groups and Backreferences

Capturing Groups

Parentheses ( ) create a capturing group, which saves the matched text for later use.

(\d{4})-(\d{2})-(\d{2})

Applied to "2026-04-14":

  • Group 1 → 2026
  • Group 2 → 04
  • Group 3 → 14
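
In Python, the same groups are available on the match object. A minimal sketch follows; the second input string is invented to show that `findall` returns one tuple per match when the pattern has several groups:

```python
import re

DATE = r"(\d{4})-(\d{2})-(\d{2})"

m = re.fullmatch(DATE, "2026-04-14")
whole = m.group(0)   # '2026-04-14' (group 0 is the entire match)
parts = m.groups()   # ('2026', '04', '14')

# Captured text is always a string; convert it if you need numbers.
year, month, day = (int(p) for p in parts)

# With several groups, findall returns a list of tuples.
all_dates = re.findall(DATE, "2026-04-14 and 2027-01-02")
# [('2026', '04', '14'), ('2027', '01', '02')]

print(whole, parts, year, month, day, all_dates)
```
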

Non-Capturing Groups

Use (?:...) when you need grouping but don’t need to capture:

(?:https?|ftp)://    → Matches "http://", "https://", or "ftp://"
                        without capturing the protocol
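
One place where the choice matters in practice is Python's `re.findall`, which returns the group contents instead of the full match as soon as the pattern contains a capturing group. A small sketch with made-up URLs:

```python
import re

text = "see https://a.example and ftp://b.example"

# Capturing group: findall returns only what the group captured.
capturing = re.findall(r"(https?|ftp)://\S+", text)
# ['https', 'ftp']

# Non-capturing group: findall returns the whole match.
non_capturing = re.findall(r"(?:https?|ftp)://\S+", text)
# ['https://a.example', 'ftp://b.example']

print(capturing, non_capturing)
```
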

Named Capturing Groups

Name your groups with (?<name>...) for readable references:

const re = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const m = re.exec('2026-04-14');
console.log(m.groups.year);  // "2026"
console.log(m.groups.month); // "04"

Backreferences

Refer back to a captured group with \1, \2, etc. (or \k<name> for named groups):

(['"])(.*?)\1    → Matches text in matching quotes (single or double)
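
Here is a short Python sketch of both numbered and named backreferences, with invented sample text. One caveat: Python's `re` module writes the named form as `(?P=name)` rather than `\k<name>`:

```python
import re

# \1 must repeat whichever quote character group 1 captured.
quoted = re.compile(r"""(['"])(.*?)\1""")
text = """say "it's fine" or 'ok'"""
found = [m.group(2) for m in quoted.finditer(text)]
# ["it's fine", 'ok']

# Named backreference (Python syntax): find an accidentally doubled word.
doubled = re.search(r"\b(?P<word>\w+)\s+(?P=word)\b", "this is is a test")
repeated = doubled.group("word")  # 'is'

print(found, repeated)
```
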

Lookaheads and Lookbehinds

Lookaround assertions check what’s before or after a position without including it in the match.

Syntax     Type                  Description
(?=...)    Positive lookahead    Match if followed by ...
(?!...)    Negative lookahead    Match if NOT followed by ...
(?<=...)   Positive lookbehind   Match if preceded by ...
(?<!...)   Negative lookbehind   Match if NOT preceded by ...

Examples:

\d+(?= dollars)    → Matches a number only if followed by " dollars"
                     "100 dollars" → matches "100"
                     "100 euros"   → no match

(?<=\$)\d+         → Matches digits only if preceded by "$"
                     "$500 and €200" → matches "500"

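\b(?!\w*ing\b)\w+\b → Matches words NOT ending in "ing"
                     "running fast" → matches "fast"
                     (the lookahead sits at the start of the word; placed
                     after \w+ it would always succeed at the word's end)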
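
The lookaround examples can be checked with a few lines of Python. The sample sentences are illustrative, and the last pattern is one possible way to exclude words ending in "ing" by putting the negative lookahead at the start of the word:

```python
import re

# Lookahead: the number is matched, " dollars" is only checked.
amounts = re.findall(r"\d+(?= dollars)", "100 dollars, 100 euros")
# ['100']

# Lookbehind: digits that come right after a "$".
usd = re.findall(r"(?<=\$)\d+", "$500 and €200")
# ['500']

# Negative lookahead at the word start: skip words that end in "ing".
not_ing = re.findall(r"\b(?!\w*ing\b)\w+\b", "running and jumping are fun")
# ['and', 'are', 'fun']

print(amounts, usd, not_ing)
```
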

Flags

Flags change how the regex engine interprets the pattern.

Flag               JS      Python                  Effect
Global             g       (N/A, use findall)      Find all matches
Case-insensitive   i       re.IGNORECASE           Case-insensitive matching
Multiline          m       re.MULTILINE            ^/$ match line boundaries
Dotall             s       re.DOTALL               . matches \n
Unicode            u       (default in Python 3)   Full Unicode support
Verbose            (N/A)   re.VERBOSE              Allow comments and whitespace in pattern
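
In Python, flags are passed as an extra argument and combined with `|`. The sketch below uses invented inputs and a hypothetical `date_re` name to show a few of them, including a verbose pattern with inline comments:

```python
import re

# Case-insensitive matching.
hits = re.findall(r"regex", "Regex REGEX regex", re.IGNORECASE)
# ['Regex', 'REGEX', 'regex']

# By default "." stops at a newline; DOTALL lifts that restriction.
plain = re.search(r"a.b", "a\nb") is not None              # False
dotall = re.search(r"a.b", "a\nb", re.DOTALL) is not None  # True

# Combine flags with the | operator.
totals = re.findall(r"^total:.*$", "Total: 5\ntotal: 7",
                    re.IGNORECASE | re.MULTILINE)
# ['Total: 5', 'total: 7']

# VERBOSE ignores whitespace in the pattern and allows # comments.
date_re = re.compile(r"""
    (?P<year>\d{4})  -   # four-digit year
    (?P<month>\d{2}) -   # two-digit month
    (?P<day>\d{2})       # two-digit day
""", re.VERBOSE)
month = date_re.fullmatch("2026-04-14").group("month")  # '04'

print(hits, plain, dotall, totals, month)
```
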

Common Patterns

Copy-paste ready patterns for common validation tasks:

Email Address

^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$

URL

https?:\/\/(www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&/=]*)

IPv4 Address

\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b

Date (YYYY-MM-DD)

^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
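
This pattern checks the shape of a date, not whether the day exists in that month, so "2026-02-30" passes. A minimal Python sketch of one way to close that gap, assuming a hypothetical helper named `is_valid_date` that hands the captured numbers to `datetime.date`:

```python
import re
from datetime import date

DATE_RE = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

def is_valid_date(text: str) -> bool:
    """Return True if text is YYYY-MM-DD and a real calendar date."""
    m = DATE_RE.fullmatch(text)
    if m is None:
        return False  # wrong shape
    year, month, day = map(int, m.groups())
    try:
        date(year, month, day)  # raises ValueError for impossible dates
    except ValueError:
        return False
    return True

print(is_valid_date("2026-04-14"))  # True
print(is_valid_date("2026-02-30"))  # False: the regex passes, the calendar does not
print(is_valid_date("2026-13-01"))  # False: rejected by the regex
```
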

US Phone Number

^\+?1?\s?(\(\d{3}\)|\d{3})[\s.\-]?\d{3}[\s.\-]?\d{4}$

Strong Password (8+ chars, uppercase, lowercase, digit, special)

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
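
Each lookahead in this pattern enforces one rule, and the final character class limits which characters are allowed at all. A short Python sketch with made-up passwords shows which inputs pass (`is_strong` is a hypothetical helper name):

```python
import re

PASSWORD_RE = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"
)

def is_strong(password: str) -> bool:
    """Return True if the password satisfies every rule in the pattern."""
    return PASSWORD_RE.fullmatch(password) is not None

print(is_strong("Passw0rd!"))   # True
print(is_strong("password1!"))  # False: no uppercase letter
print(is_strong("Passw0rd#"))   # False: "#" is not in the allowed special set
print(is_strong("Pw0rd!"))      # False: shorter than 8 characters
```
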

Hex Color Code

^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$

Slug (URL-friendly string)

^[a-z0-9]+(?:-[a-z0-9]+)*$

Credit Card (general)

^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9]{2})[0-9]{12})$

Whitespace-only string

^\s*$

Regex in JavaScript

JavaScript has two ways to create regex:

// Literal syntax (preferred for static patterns)
const literalRe = /\d+/g;

// Constructor (for dynamic patterns); backslashes must be doubled in the string
const pattern = '\\d+';
const dynamicRe = new RegExp(pattern, 'g');

Key Methods

Method                            Returns                        Use case
str.match(re)                     Array of matches (or null)     Get all matches with g flag
str.matchAll(re)                  Iterator of match objects      Get matches with capture groups (requires g flag)
str.search(re)                    Index of first match (or -1)   Test existence, get position
str.replace(re, replacement)      New string                     Replace matches (first only unless g flag is set)
str.replaceAll(re, replacement)   New string                     Replace all (requires g flag)
str.split(re)                     Array of substrings            Split on pattern
re.test(str)                      true / false                   Fast existence check
re.exec(str)                      Match object (or null)         Low-level, iterates with g flag

Replace with Function

const result = 'hello world'.replace(/\b\w/g, char => char.toUpperCase());
// → "Hello World"

Named Groups in Replace

const date = '2026-04-14';
const formatted = date.replace(
  /(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})/,
  '$<d>/$<m>/$<y>'
);
// → "14/04/2026"

Regex in Python

Python’s re module provides full regex support.

import re

text = 'Order 66 shipped in 3 days'  # sample input for the calls below

# Compile for reuse (faster when using pattern multiple times)
pattern = re.compile(r'\d+')

# Common functions
re.match(r'^\d+', text)        # Match at beginning only
re.search(r'\d+', text)        # Search anywhere
re.findall(r'\d+', text)       # Return list of all matches
re.finditer(r'\d+', text)      # Return iterator of match objects
re.sub(r'\d+', 'N', text)      # Replace matches
re.split(r'\s+', text)         # Split on pattern

Always use raw strings (r'...') for regex in Python to avoid double-escaping backslashes.
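
A small sketch of why raw strings matter: in a normal Python string, `"\b"` is a single backspace character, so the regex engine never sees the word-boundary escape.

```python
import re

# "\b" in a normal string is one character (backspace);
# r"\b" is two characters (backslash + b), which regex reads as a word boundary.
normal_len = len("\b")  # 1
raw_len = len(r"\b")    # 2

without_raw = re.findall("\bcat\b", "cat")     # [] (looks for backspace characters)
with_raw = re.findall(r"\bcat\b", "cat")       # ['cat']
double_escaped = re.findall("\\bcat\\b", "cat")  # ['cat'], but harder to read

print(normal_len, raw_len, without_raw, with_raw, double_escaped)
```
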

Named Groups in Python

m = re.search(r'(?P<year>\d{4})-(?P<month>\d{2})', '2026-04')
print(m.group('year'))   # "2026"
print(m.group('month'))  # "04"
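
Named groups can also be read as a dictionary and referenced in replacements. The sketch below, with an invented sentence, shows `groupdict()`, the `\g<name>` replacement syntax, and a replacement function, which are close Python counterparts to the JavaScript replace examples earlier:

```python
import re

DATE_RE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
sentence = "Released on 2026-04-14."

# All named groups at once.
fields = DATE_RE.search(sentence).groupdict()
# {'year': '2026', 'month': '04', 'day': '14'}

# \g<name> refers to a named group inside the replacement string.
reordered = DATE_RE.sub(r"\g<day>/\g<month>/\g<year>", sentence)
# 'Released on 14/04/2026.'

# A function can compute the replacement from each match object.
title = re.sub(r"\b\w", lambda m: m.group(0).upper(), "hello world")
# 'Hello World'

print(fields, reordered, title)
```
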

Frequently Asked Questions

What’s the difference between + and *?
+ requires at least one match; * allows zero matches. Use + when the element must appear at least once.

Why does my . not match newlines?
By default, . matches any character except newline. Enable the dotall flag (s in JavaScript, re.DOTALL in Python) to make . match newlines too.

What does ^ mean inside and outside [ ]?
Outside brackets, ^ anchors to the start of the string. Inside brackets [^abc], it negates the character class (match anything except the listed chars).

When should I use (?:...) vs (...)?
Use non-capturing (?:...) whenever you need grouping (for quantifiers or alternation) but don’t need to reference the matched text later. It’s slightly faster and cleaner.

How do I match a literal special character like . or *?
Escape it with a backslash: \. matches a literal period, \* matches a literal asterisk.


Ready to test these patterns? Open the Genbox Regex Tester and try them live.