Regular Expressions: A Practical Guide to Pattern Matching
Regex appears in every language and every editor. This practical guide covers the syntax you'll use daily, from basic character classes to lookaheads.
DevPulse Team
Regular expressions (regex) are a mini-language for describing text patterns. They appear in every programming language, code editor, command-line tool, and database engine. Learning them pays off immediately — the same pattern syntax works across Python, JavaScript, grep, sed, VS Code's search, and SQL.
The Building Blocks
A regex pattern is built from literals and metacharacters. A literal character like a matches the letter a exactly. Metacharacters have special meanings:
.— matches any single character (except newline by default)^— matches the start of the string (or line in multiline mode)$— matches the end of the string or line*— the preceding element zero or more times+— the preceding element one or more times?— the preceding element zero or one time (also makes quantifiers lazy){n,m}— the preceding element between n and m times|— alternation (OR): matches either the left or right side\— escape a metacharacter or use a special sequence
Character Classes
[abc] matches any single character that's a, b, or c. [a-z] matches any lowercase letter. [^abc] matches any character that's NOT a, b, or c.
Shorthand character classes save typing:
\d— any digit (same as[0-9])\w— any word character: letters, digits, underscore ([a-zA-Z0-9_])\s— any whitespace character (space, tab, newline)\D,\W,\S— uppercase versions are the inverse
Groups and Captures
Parentheses () group expressions and capture the matched text. (\d{4})-(\d{2})-(\d{2}) matches a date like 2024-04-25 and captures year, month, and day as separate groups you can reference in replacement strings or extract programmatically.
Non-capturing groups (?:...) group without capturing — useful when you need grouping for alternation but don't need the matched text.
Practical Examples
# Email (simplified — true email validation is more complex)
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
# IPv4 address
^(\d{1,3}\.){3}\d{1,3}$
# UK postcode
^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$
# Hex color code (with or without #)
^#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$
# ISO 8601 date
^\d{4}-\d{2}-\d{2}$
Greedy vs Lazy Matching
Quantifiers are greedy by default — they match as much as possible. <.+> applied to <b>text</b> matches the entire string, not just <b>. Add ? to make it lazy: <.+?> matches <b> and stops. Most HTML/XML processing bugs come from greedy quantifiers eating too much.
Lookaheads and Lookbehinds
Lookaheads let you match based on what comes after without including it in the match. \d+(?= USD) matches numbers followed by " USD" but only captures the number. Lookbehinds work in the other direction. These are powerful for extracting data from structured text.
Test and debug your patterns with our Regex Tester — real-time matching with highlighted captures and a match list as you type.
Free developer tools, right in your browser.
No sign-up. No tracking. 30+ utilities for developers.
Explore DevPulse Tools →Related Articles
YAML vs JSON: When to Use Each and How to Format Both
YAML and JSON solve the same problem — structured data — but in very different ways. Here's when each is the right choice and the syntax pitfalls to avoid.
Developer ToolsURL Encoding Explained: %20, %3A, and Why They Exist
URL encoding converts unsafe characters into percent-encoded equivalents. Here's which characters get encoded, why, and the common developer mistakes.
Developer ToolsCSS Minification: What Gets Removed and Why It Matters
CSS minification removes whitespace, comments, and redundant syntax to reduce file size. Here's what's actually happening under the hood and the gains you can expect.