UtilityKit

500+ fast, free tools. Most run in your browser only; Image & PDF tools upload files to the backend when you run them.

Regex Generator from Samples

Generate a regular expression from positive and negative example strings — rule-based inference with live test panel.

About Regex Generator from Samples

Writing regular expressions from scratch is error-prone. This tool takes a different approach: provide example strings that should match and optionally strings that should not, and it infers a pattern for you. The inference engine detects emails, IPs, ISO dates, hex colors, URLs, and general structures by analyzing common prefixes, suffixes, character classes, and length ranges. The generated pattern comes with a plain-English explanation of each component so you can understand and refine it. A live test panel highlights matches in green and non-matches in red as you type. Options control anchoring (^ and $) and case-insensitivity. The generated pattern is a starting point — solid for common structures and easy to adjust for edge cases.
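As a rough illustration of the rule-based approach described above, here is a minimal sketch that combines a literal common prefix, a single character class, and a length range. The function names `char_class` and `infer_pattern` are hypothetical and the logic is an assumption for illustration only; the tool's actual engine is not shown in this page and likely does more (suffix detection, known formats such as emails and IPs).

```python
import os
import re
import string

def char_class(chars):
    """Pick the narrowest class (from a small fixed list) that covers all chars."""
    if chars <= set(string.digits):
        return r"\d"
    if chars <= set(string.hexdigits):
        return "[A-Fa-f0-9]"
    if chars <= set(string.ascii_letters):
        return "[A-Za-z]"
    if chars <= set(string.ascii_letters + string.digits + "_"):
        return r"\w"
    return "."  # fallback: any character

def infer_pattern(samples, anchored=True):
    """Hypothetical inference: literal common prefix + one class + length range."""
    prefix = os.path.commonprefix(samples)
    rests = [s[len(prefix):] for s in samples]
    lo = min(len(r) for r in rests)
    hi = max(len(r) for r in rests)
    quantifier = f"{{{lo}}}" if lo == hi else f"{{{lo},{hi}}}"
    body = re.escape(prefix) + char_class(set("".join(rests))) + quantifier
    return f"^{body}$" if anchored else body

# Illustrative samples (not from the tool's documentation).
pattern = infer_pattern(["INV-1042", "INV-7", "INV-305"])
print(pattern)
print(bool(re.search(pattern, "INV-99")))     # True
print(bool(re.search(pattern, "INV-12345")))  # False: five digits exceed the range
```

The sketch needs at least one sample, and it shows why coverage matters: the length range and character class come entirely from the examples supplied.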

Why use Regex Generator from Samples

Plain-English explanation of each pattern component helps you understand and modify the output.
Live test panel gives immediate feedback without switching to a separate regex tester.
Negative examples add validation — you see warnings when the pattern still matches things it should not.
Pure JavaScript, no server — works offline and processes nothing outside your browser.
Removes the need to remember regex syntax for common patterns like emails, ISO dates, IPs, UUIDs, and hex codes — paste examples and let the tool infer.

How to use Regex Generator from Samples

Paste example strings that the pattern SHOULD match in the left textarea, one per line. Aim for at least 3-5 examples for reliable inference.
Optionally paste strings that should NOT match in the right textarea, so the tool can detect false positives and warn you.
Check or uncheck the 'Anchored' and 'Case-insensitive' options as needed.
Click Generate Regex — the tool shows the inferred pattern with a plain-English explanation.
Paste additional strings in the Live Test area to instantly see which match in green and which do not in red.
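The positive and negative validation in the steps above can be approximated outside the tool. The sketch below is an assumption about how such a check might work, not the tool's code; `check_examples` is a hypothetical helper that reports positives the pattern misses and negatives it wrongly accepts.

```python
import re

def check_examples(pattern, positives, negatives, ignore_case=False):
    """Return (missed positives, false positives) for a candidate pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    rx = re.compile(pattern, flags)
    # search() is used so that any ^ and $ in the pattern decide the anchoring.
    missed = [s for s in positives if not rx.search(s)]
    false_positives = [s for s in negatives if rx.search(s)]
    return missed, false_positives

missed, false_positives = check_examples(
    r"^\d{4}-\d{2}-\d{2}$",
    positives=["2026-05-08", "2025-01-01", "08/05/2026"],
    negatives=["2026-5-8", "2026-13-45"],
)
print("missed:", missed)                    # ['08/05/2026']
print("false positives:", false_positives)  # ['2026-13-45']
```

A non-empty `missed` list corresponds to the "misses some positive examples" warning, and a non-empty `false_positives` list corresponds to the negative-example warning.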

When to use Regex Generator from Samples

Building search filters for log files based on real example log lines.
Generating a starting regex for an unfamiliar string format before refining it manually.
Teaching regular expression concepts by showing how examples map to pattern components.
Quick prototyping of data parsing patterns before writing production code.
Creating input validation patterns for forms (email, phone, postal code, product code, license key) without manually writing the regex.

Examples

Inferring an email validator from samples

Input: Positive examples: alice@example.com, bob+filter@subdomain.example.co.uk, chris.morgan@deep.subdomain.io; negative examples: foo@bar, no-at-sign.example.com

Output: Pattern: ^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$ Explanation: 'one or more letters/digits/._%+-, then @, then a domain, then a dot, then a 2+ letter TLD'.
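To see how this pattern behaves, the following sketch runs it against the examples above in Python. The extra string `user@example..com` is an illustrative assumption added here to show one case the pattern accepts even though most mail systems would reject it.

```python
import re

EMAIL = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

positives = [
    "alice@example.com",
    "bob+filter@subdomain.example.co.uk",
    "chris.morgan@deep.subdomain.io",
]
negatives = ["foo@bar", "no-at-sign.example.com"]

for s in positives + negatives:
    print(f"{s!r}: {'match' if EMAIL.search(s) else 'no match'}")

# The domain class allows consecutive dots, so this still matches.
lenient_case = "user@example..com"
print(lenient_case, bool(EMAIL.search(lenient_case)))  # True
```

Adding a string like this as a negative example is one way to find out where the generated pattern needs tightening.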

Inferring an ISO date pattern

Input: Positive examples: 2026-05-08, 2026-12-31, 2025-01-01

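Output: Pattern: ^\d{4}-\d{2}-\d{2}$ Explanation: 'four digits, dash, two digits, dash, two digits'. This matches the shape of ISO 8601 calendar dates (YYYY-MM-DD) but does not check that the month and day are in range, so 2026-99-99 also matches.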
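A regex of this kind checks only the layout of the string. If real calendar validity matters, one option is to pair the pattern with a date parser. The sketch below uses Python's standard `datetime.date.fromisoformat`; the helper name `is_real_iso_date` is hypothetical.

```python
import re
from datetime import date

# re.ASCII keeps \d limited to 0-9 (Python's \d otherwise accepts other Unicode digits).
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}", re.ASCII)

def is_real_iso_date(text):
    """True only if text has the YYYY-MM-DD shape AND names a real calendar date."""
    if not ISO_DATE.fullmatch(text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True

print(bool(ISO_DATE.fullmatch("2026-02-30")))  # True: the shape is right
print(is_real_iso_date("2026-02-30"))          # False: February has no day 30
print(is_real_iso_date("2026-05-08"))          # True
```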

Inferring a hex color pattern

Input: Positive examples: #3b82f6, #FFFFFF, #000, #a1b; negative examples: 3b82f6, #GGHHII

Output: Pattern: ^#[A-Fa-f0-9]{3,6}$ Explanation: '# followed by 3 to 6 hex digits'. Warning: this range also accepts 4 or 5 hex digits. To reject those, narrow the pattern to exactly 3 or exactly 6 digits: ^#(?:[A-Fa-f0-9]{3}|[A-Fa-f0-9]{6})$
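The difference between the inferred range and a stricter "3 or 6 digits" variant can be checked directly. In this sketch, `STRICT` is a hand-written refinement, not something the tool is documented to produce.

```python
import re

LOOSE = re.compile(r"^#[A-Fa-f0-9]{3,6}$")                        # inferred pattern
STRICT = re.compile(r"^#(?:[A-Fa-f0-9]{3}|[A-Fa-f0-9]{6})$")      # manual refinement

for sample in ["#3b82f6", "#a1b", "#abcd", "#abcde", "3b82f6", "#GGHHII"]:
    print(f"{sample:8} loose={bool(LOOSE.search(sample))} strict={bool(STRICT.search(sample))}")
```

Only the 4- and 5-digit samples (`#abcd`, `#abcde`) differ between the two patterns; everything else is accepted or rejected by both.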

Inferring a UUID pattern

Input: Positive examples: 550e8400-e29b-41d4-a716-446655440000, f47ac10b-58cc-4372-a567-0e02b2c3d479, 123e4567-e89b-12d3-a456-426614174000

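Output: Pattern: ^[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}$ Explanation: 'eight hex digits, dash, four hex digits, dash, four hex digits, dash, four hex digits, dash, twelve hex digits'. This matches the canonical lowercase text layout of RFC 4122 UUIDs; it does not check the version or variant bits, and uppercase UUIDs match only with the case-insensitive option enabled.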
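Two properties of this pattern are worth checking in the target language: case handling, and the fact that it validates layout only. The sketch below uses Python's standard `re` and `uuid` modules; the all-zero UUID is an illustrative input chosen here.

```python
import re
import uuid

UUID_PATTERN = r"^[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}$"
sample = "550e8400-e29b-41d4-a716-446655440000"

# The character classes are lowercase-only, so uppercase input needs IGNORECASE.
print(bool(re.search(UUID_PATTERN, sample.upper())))                 # False
print(bool(re.search(UUID_PATTERN, sample.upper(), re.IGNORECASE)))  # True

# The regex accepts any hex in the version/variant positions.
nil_uuid = "00000000-0000-0000-0000-000000000000"
print(bool(re.search(UUID_PATTERN, nil_uuid)))  # True: the layout is fine
print(uuid.UUID(nil_uuid).version)              # None: not an RFC 4122 variant
print(uuid.UUID(sample).version)                # 4
```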

Tips

Provide at least 3-5 positive examples that represent the full range of valid inputs — coverage matters more than count, so include edge cases.
Use the negative examples textarea to catch false positives — paste strings that look similar but should be rejected (e.g. malformed emails like 'foo@bar' without a TLD).
The 'Anchored' option is on by default and is usually correct for form validation; turn it off for search-within-text use cases like grep over log files.
If the generated pattern looks too greedy, narrow it by adding a specific negative example that the current pattern matches but should not.
Check escaped special characters: if your data contains literal dots or parentheses, the tool escapes them to `\.` or `\(` automatically, so confirm that each escaped character is really meant as a literal.
Copy the final pattern and verify in a target-language tester (regex101 for PCRE/Python/JS, regexr.com, or your editor's find/replace) before deploying.
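The escaping tip can be reproduced when refining a pattern by hand. This sketch uses Python's standard `re.escape` to show why an unescaped dot is too permissive; the version strings are made-up examples.

```python
import re

literal = "v1.2 (beta)"

# Unescaped, "." matches any character, so "v1x2" slips through.
print(bool(re.fullmatch("v1.2", "v1x2")))             # True
# Escaped, the dot must be a literal dot.
print(bool(re.fullmatch(re.escape("v1.2"), "v1x2")))  # False

# re.escape also handles parentheses, so the full literal matches only itself.
escaped = re.escape(literal)
print(bool(re.fullmatch(escaped, literal)))           # True
print(bool(re.fullmatch(escaped, "v1.2 beta")))       # False
```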

Frequently Asked Questions

How accurate is the generated regex?

For well-structured inputs like emails, dates, IP addresses, and hex colors the inference is very accurate. For free-form strings it builds a structural pattern based on character classes and length range, which may need manual refinement for edge cases. Always test with the live panel before using in production.

Why does the pattern say it misses some positive examples?

If the inferred pattern does not match all your positive examples, the tool shows a warning listing the mismatches. This can happen when examples are structurally inconsistent (e.g. a mix of date formats). Review the warning and either clean up the examples or adjust the generated pattern manually.

What does anchored (^ and $) mean?

Anchored patterns only match when the entire string conforms to the pattern — there is nothing before ^ or after $. Without anchors the pattern can match anywhere inside a longer string. Use anchored patterns for strict validation (input fields) and unanchored for searching within text.
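The difference is easy to see on a log line. The sketch below uses Python; the log line is a made-up example. It also shows a Python-specific detail: `$` matches just before a trailing newline, so `re.fullmatch` is the stricter choice for validation there.

```python
import re

anchored = re.compile(r"^\d{4}-\d{2}-\d{2}$")
unanchored = re.compile(r"\d{4}-\d{2}-\d{2}")

line = "2026-05-08 12:00:01 ERROR disk full"

print(anchored.search(line))            # None: the whole line is not a date
print(unanchored.search(line).group())  # 2026-05-08: found inside the line
print(bool(anchored.search("2026-05-08")))  # True: the entire string is a date

# In Python, "$" also matches before a final newline.
print(bool(anchored.search("2026-05-08\n")))       # True
print(bool(unanchored.fullmatch("2026-05-08\n")))  # False: fullmatch is strict
```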

Can I use the generated regex in any programming language?

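The generated patterns use common regex syntax supported by JavaScript, Python, Java, Go, and most other languages. Character class shortcuts like \d and \w are widely supported but not universal: POSIX tools such as sed and GNU grep (without -P) do not recognize \d, and in Python 3 \d also matches non-ASCII Unicode digits unless the re.ASCII flag is set. Paste the pattern into the regex tester for your target language to confirm.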
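One concrete portability difference is how `\d` treats non-ASCII digits. The sketch below shows Python 3 behavior with Arabic-Indic digits (an input chosen here for illustration); JavaScript's `\d` matches only 0-9, so the same pattern can accept different strings in the two languages.

```python
import re

arabic_indic = "\u0661\u0662\u0663"  # the digits one, two, three in Arabic-Indic script

print(bool(re.fullmatch(r"\d+", arabic_indic)))            # True in Python 3 (Unicode-aware)
print(bool(re.fullmatch(r"\d+", arabic_indic, re.ASCII)))  # False with the ASCII flag
print(bool(re.fullmatch(r"[0-9]+", arabic_indic)))         # False: explicit range is ASCII-only
print(bool(re.fullmatch(r"[0-9]+", "123")))                # True
```

Writing `[0-9]` instead of `\d` is one way to get the same behavior across engines.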

What happens with very short or single-character examples?

Single-character or very short examples produce overly broad patterns because there is not enough structural information to infer specifics. Provide at least 3-5 representative examples for better results.

Does the tool use machine learning to generate patterns?

No. The inference is purely rule-based: it analyzes character composition, common prefixes/suffixes, and length ranges. This makes it deterministic and explainable, and it works offline without any model dependencies.

Glossary

Anchor: The `^` and `$` metacharacters that constrain a regex to match from the start and end of a string rather than anywhere within it. `^` matches start, `$` matches end.
Character class: A set of characters matched by a single token, like `\d` for digits, `\w` for word characters, `\s` for whitespace, or `[A-Za-z]` for letters.
Quantifier: A regex token that specifies how many times the preceding element must occur, such as `{3,6}` for 3 to 6 times, `+` for one or more, `*` for zero or more, or `?` for optional.
Greedy vs lazy: Greedy quantifiers (default) match as much text as possible; lazy quantifiers (suffix `?`, e.g. `+?`) match as little as possible. For example, on the text `"a" and "b"`, greedy `".*"` matches everything from the first quote to the last, while lazy `".*?"` matches `"a"` and `"b"` separately.
Lookahead: An assertion that checks what follows the current position without consuming it. Positive lookahead `(?=foo)` succeeds if `foo` is next; negative lookahead `(?!foo)` succeeds if `foo` is NOT next.
Lookbehind: An assertion that checks what precedes the current position. Positive lookbehind `(?<=foo)` and negative lookbehind `(?<!foo)`. Support varies by engine: JavaScript gained lookbehind in ES2018, and Go's standard regexp package supports neither lookahead nor lookbehind.
Capture group: Parentheses `(...)` that mark a sub-pattern for extraction. The matched text is available as group 1, 2, ... in your language's regex API. Use `(?:...)` for non-capturing grouping.
Alternation: The `|` operator chooses between alternatives. `cat|dog` matches either word. Alternation is low precedence, so `^cat|dog$` means '^cat' or 'dog$', not '^(cat|dog)$'.
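A few of the glossary entries above are easiest to grasp by running them. The sketch below demonstrates greedy versus lazy matching, a positive lookahead, and alternation precedence in Python; all input strings are made-up examples.

```python
import re

text = 'say "a" and "b"'

# Greedy vs lazy: greedy spans from the first quote to the last one.
greedy = re.findall(r'".*"', text)
lazy = re.findall(r'".*?"', text)
print(greedy)  # ['"a" and "b"']
print(lazy)    # ['"a"', '"b"']

# Lookahead: match a number only when " USD" follows, without consuming it.
usd_amounts = re.findall(r"\d+(?= USD)", "10 USD, 20 EUR")
print(usd_amounts)  # ['10']

# Alternation precedence: ^cat|dog$ means (^cat) or (dog$).
loose = re.search(r"^cat|dog$", "catalog")
strict = re.search(r"^(?:cat|dog)$", "catalog")
print(loose.group())  # cat: "catalog" starts with "cat"
print(strict)         # None: the whole string is neither "cat" nor "dog"
```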