| Pattern | Description | Example |
|---|---|---|
| `\d` | Match any digit | `\d+` matches "123" |
| `\w` | Match any word character | `\w+` matches "abc123" |
| `\s` | Match any whitespace | `\s+` matches spaces, tabs, newlines |
| `.` | Match any character (except newline) | `a.c` matches "abc", "a c", etc. |
| `^`, `$` | Match start/end of the string (of each line with `re.MULTILINE`) | `^start`, `end$` |
| `*`, `+`, `?` | 0+ times, 1+ times, 0-1 times | `\d*`, `\d+`, `\d?` |
| `{n}`, `{n,m}` | Exactly n times, n to m times | `\d{4}`, `\d{2,4}` |
| `(?:...)` | Non-capturing group | `(?:abc\|def)` |
| `(...)` | Capturing group | `(abc)` captures "abc" |
| `(?P<name>...)` | Named capturing group (Python-specific) | `(?P<year>\d{4})` |

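
As a rough illustration of the patterns in the table, the sketch below runs a few of them through Python's standard `re` module. The sample strings and variable names are made up for this example.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Invoice 2024-03-15: total 42 USD"

digits = re.findall(r"\d+", text)             # runs of one or more digits
words = re.findall(r"\w+", "abc123 def")      # runs of word characters
year = re.search(r"\d{4}", text).group()      # exactly four digits

# "." matches any character except a newline by default,
# so "a\nc" is not matched here.
dot = re.findall(r"a.c", "abc a c a\nc")

# A non-capturing group only groups; a capturing group stores its match.
alt = re.search(r"(?:abc|def)+", "xxdefabc").group()
cap = re.search(r"(abc)", "xxabc").group(1)

print(digits, words, year, dot, alt, cap)
```

The non-capturing form is a reasonable default when the group is needed only for alternation or repetition, since it keeps group numbering stable.
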

| Flag | Description | Effect |
|---|---|---|
| `re.DOTALL` | Dot matches all characters | Makes `.` match any character INCLUDING newlines (normally it doesn't match newlines) |
| `re.IGNORECASE` | Case-insensitive matching | Makes matching ignore character case, so `A` matches `a` |
| `re.MULTILINE` | Multi-line anchors | Makes `^` and `$` match the start/end of each line |

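
The effect of each flag is easiest to see by running the same pattern with and without it. The following is a minimal sketch using only the standard `re` module; the sample text and variable names are invented for the example.

```python
import re

# Hypothetical three-line sample text.
text = "Start\nmiddle\nEND"

# Without flags: the case differs and "." will not cross the newline.
plain = re.search(r"start.middle", text)

# Flags are combined with the bitwise OR operator.
combined = re.search(r"start.middle", text, re.DOTALL | re.IGNORECASE)

# Without re.MULTILINE, ^ and $ anchor to the whole string only.
whole_string = re.findall(r"^\w+$", text)

# With re.MULTILINE, ^ and $ anchor at every line boundary.
per_line = re.findall(r"^\w+$", text, re.MULTILINE)

print(plain, combined.group(), whole_string, per_line)
```
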
Notes on Python regex:
- The `re.DOTALL | re.IGNORECASE` flags are used for pattern validation, testing, and extraction.
- With `re.DOTALL`, the dot `.` matches all characters, including newlines.
- With `re.IGNORECASE`, all matching is done case-insensitively.
- Named groups use the `(?P<name>pattern)` syntax.
- Use raw strings, `r"pattern"`, to avoid escape character issues.
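
The notes above can be sketched together in a few lines. This is an illustrative example, not a definitive implementation: `is_valid_pattern`, `FLAGS`, and `DATE` are hypothetical names introduced here, and the assumption is that "validation" means checking that a pattern compiles.

```python
import re

# Assumed flag combination, taken from the notes above.
FLAGS = re.DOTALL | re.IGNORECASE


def is_valid_pattern(pattern: str) -> bool:
    """Hypothetical helper: report whether a pattern compiles."""
    try:
        re.compile(pattern, FLAGS)
        return True
    except re.error:
        return False


# Named groups make extracted values available by name.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", FLAGS)
parts = DATE.search("Released on 2024-03-15.").groupdict()

# In a normal string "\b" is a backspace character; in a raw string it
# reaches the regex engine as the word-boundary escape.
without_raw = re.search("\bcat\b", "a cat")
with_raw = re.search(r"\bcat\b", "a cat")

print(is_valid_pattern(r"(\d+"), parts, without_raw, with_raw.group())
```

Compiling once and reusing the compiled object, as with `DATE` here, is a common choice when the same pattern is applied to many inputs.
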