Document Upload
Upload a PDF document to extract text and generate regex patterns.
Required for generating patterns using the LLM.
API endpoint for the OpenAI-compatible LLM API.
Name of the LLM model to use.
PDF Preview
Upload a PDF document to preview it here.
Extracted Text
This is the text extracted from the document. The system will use this text to generate patterns automatically.

        
Patterns
Pattern Tester
Enter a pattern and test text, then click "Test" to see results.
# Value

Common Python Regex Patterns
Pattern        Description                                             Example
\d             Match any digit                                         \d+ matches "123"
\w             Match any word character                                \w+ matches "abc123"
\s             Match any whitespace                                    \s+ matches spaces, tabs, newlines
.              Match any character (except newline by default)         a.c matches "abc", "a c", etc.
^, $           Match start/end of string (per line with re.MULTILINE)  ^start, end$
*, +, ?        Zero or more, one or more, zero or one times            \d*, \d+, \d?
{n}, {n,m}     Exactly n times, n to m times                           \d{4}, \d{2,4}
(?:...)        Non-capturing group                                     (?:abc|def)
(...)          Capturing group                                         (abc) captures "abc"
(?P<name>...)  Named capturing group (Python syntax)                   (?P<year>\d{4})
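A short example of several entries from the table, using only Python's standard re module. The sample text and the group names year, month, and day are made up for illustration.

```python
import re

text = "Invoice INV-2023 was paid on 2024-03-15."

# \d{4} matches exactly four digits in a row.
four_digits = re.findall(r"\d{4}", text)
print(four_digits)  # ['2023', '2024']

# A non-capturing group only groups, so findall returns the whole match.
whole = re.findall(r"(?:INV|PO)-\d+", text)
print(whole)  # ['INV-2023']

# A capturing group makes findall return just the captured part.
captured = re.findall(r"(INV|PO)-\d+", text)
print(captured)  # ['INV']

# Named groups can be read back by name.
date = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", text)
print(date.group("year"))  # 2024
print(date.groupdict())    # {'year': '2024', 'month': '03', 'day': '15'}
```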
Important Regex Flags
Flag           Description                 Effect
re.DOTALL      Dot matches all characters  Makes . match any character INCLUDING newlines (normally it doesn't match newlines)
re.IGNORECASE  Case-insensitive matching   Makes matching ignore character case, so A matches a
re.MULTILINE   Multi-line anchors          Makes ^ and $ match the start/end of each line
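The effect of each flag is easiest to see by running the same pattern with and without it. The sample text below is invented for the demonstration.

```python
import re

text = "Total:\n42\ntotal: 7"

# Without re.DOTALL the dot cannot cross the line break, so there is no match.
no_dotall = re.search(r"Total:.\d+", text)
print(no_dotall)  # None

# With re.DOTALL the dot matches the newline between "Total:" and "42".
with_dotall = re.search(r"Total:.\d+", text, re.DOTALL)
print(repr(with_dotall.group(0)))  # 'Total:\n42'

# re.IGNORECASE finds both spellings.
both_cases = re.findall(r"total", text, re.IGNORECASE)
print(both_cases)  # ['Total', 'total']

# ^ and $ anchor to the whole string by default, and to each line with re.MULTILINE.
whole_string = re.findall(r"^\d+$", text)
per_line = re.findall(r"^\d+$", text, re.MULTILINE)
print(whole_string, per_line)  # [] ['42']
```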

Notes on Python regex:

  • Important: Our system uses re.DOTALL | re.IGNORECASE flags for pattern validation, testing, and extraction
  • With re.DOTALL, the dot . matches all characters including newlines
  • With re.IGNORECASE, all matching is done case-insensitively
  • Python supports named capture groups with (?P<name>pattern)
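  • Use raw strings r"pattern" so that backslash sequences such as \d and \b reach the regex engine unchanged
  • Lookbehinds in Python's re module require fixed-width patterns: (?<=abc) is accepted, (?<=a+) is rejected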
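The notes above can be checked with a short script. It is a sketch under the assumption that the system combines the flags exactly as re.DOTALL | re.IGNORECASE. The sample text and the group name full_name are hypothetical.

```python
import re

# Assumption: the flag combination named in the notes.
FLAGS = re.DOTALL | re.IGNORECASE

text = "Name: Ada\nLovelace\nID: ab-123"

# DOTALL lets .*? span the line break; IGNORECASE lets "name" match "Name".
match = re.search(r"name: (?P<full_name>.*?)\nid:", text, FLAGS)
print(repr(match.group("full_name")))  # 'Ada\nLovelace'

# A raw string keeps the backslash; in a normal string "\b" is a backspace.
print(r"\bID\b" == "\\bID\\b")  # True

# A fixed-width lookbehind is accepted.
after_prefix = re.search(r"(?<=ab-)\d+", text, FLAGS)
print(after_prefix.group(0))  # 123

# A variable-width lookbehind is rejected by the re module at compile time.
try:
    re.compile(r"(?<=ab+-)\d+")
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)  # False
```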