Email Validation with Regex
The definitive guide to validating email addresses -- from simple patterns to RFC 5322, and why you probably should not use the full spec.
The Problem with Email Validation
Email validation is one of the most common tasks developers face, and one of the most misunderstood. The question "what regex should I use to validate an email?" has been asked on Stack Overflow thousands of times, and the answer is more nuanced than most people expect. The truth is that truly validating an email address with regex alone is either impossible or impractical, but you can get close enough for most real-world use cases.
In this guide, we will start with the simplest possible pattern, explain its flaws, and progressively build toward more robust solutions. We will also discuss when regex is the right tool for the job and when it is not.
The Simplest Pattern (and Why It Fails)
The most basic email regex is deceptively simple:
.+@.+
This pattern says: "one or more characters, then an at sign, then one or more characters." It will match user@example.com, but it will also match @@@, spaces in here@no, and no-tld@localhost. While technically user@localhost is a valid email in certain contexts, this pattern is too permissive for a user-facing form.
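To see the problem concretely, the short TypeScript check below runs the naive pattern against the inputs mentioned above. It is only a sketch; the names naivePattern and samples are illustrative and not part of any library.

```typescript
// The naive pattern: anything, an at sign, anything.
const naivePattern = /.+@.+/;

// Inputs discussed above; only the first is something a signup form should accept.
const samples: string[] = [
  "user@example.com",
  "@@@",
  "spaces in here@no",
  "no-tld@localhost",
];

for (const sample of samples) {
  // Each of these inputs matches, which is exactly the problem.
  console.log(`${JSON.stringify(sample)} -> ${naivePattern.test(sample)}`);
}
```

Because the pattern is unanchored and uses the dot wildcard, almost any string containing an at sign with a character on each side will match.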
A Practical Pattern for Most Applications
For the vast majority of web applications, the following pattern strikes the right balance between correctness and simplicity:
^[\w.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z]{2,}$
Let us break it down piece by piece:
- ^ -- Start of string
- [\w.+-]+ -- The local part: one or more word characters, dots, plus signs, or hyphens
- @ -- Literal at sign
- [a-zA-Z0-9-]+ -- The domain name: letters, digits, and hyphens
- \. -- A literal dot separating domain from TLD
- [a-zA-Z]{2,} -- The TLD: two or more letters (covers .com, .io, .museum, etc.)
- $ -- End of string
This pattern correctly handles the vast majority of real email addresses. It rejects obviously invalid input like "not an email" and "@missing-local.com". It does have limitations -- it won't handle quoted local parts like "unusual@chars"@example.com, IP-literal domains like user@[192.168.1.1], or internationalized domain names -- but these edge cases represent a tiny fraction of real-world addresses.
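A quick way to confirm these claims is to run the practical pattern against a few representative inputs. The TypeScript sketch below does that; the variable names and the sample addresses beyond those quoted above are illustrative choices.

```typescript
// The practical pattern: local part, a single domain label, and an alphabetic TLD.
const practicalPattern = /^[\w.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z]{2,}$/;

const accepted: string[] = [
  "user@example.com",
  "first.last@example.org",
  "user+tag@example.io",
];

const rejected: string[] = [
  "not an email",
  "@missing-local.com",
  '"unusual@chars"@example.com', // quoted local part: valid per RFC 5322, rejected here
  "user@[192.168.1.1]", // IP-literal domain: also rejected
];

for (const address of accepted) {
  console.log(address, practicalPattern.test(address)); // true for each
}
for (const address of rejected) {
  console.log(address, practicalPattern.test(address)); // false for each
}
```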
Handling Subdomains
The previous pattern does not handle addresses with subdomains, like user@mail.example.co.uk. Here is an improved version:
^[\w.+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
The key change is [a-zA-Z0-9.-]+ for the domain, which allows dots within the domain portion. This accepts user@sub.domain.example.com while still requiring a proper TLD at the end.
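The difference between the two patterns is easiest to see side by side. The sketch below (names singleLabel and withSubdomains are illustrative) also shows a trade-off worth knowing about: the looser domain class does not check how labels are formed.

```typescript
const singleLabel = /^[\w.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z]{2,}$/;
const withSubdomains = /^[\w.+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

const address = "user@mail.example.co.uk";
console.log(singleLabel.test(address)); // false: only one label is allowed before the TLD
console.log(withSubdomains.test(address)); // true

// Trade-off: allowing dots anywhere in the domain also admits consecutive dots.
console.log(singleLabel.test("user@example..com")); // false
console.log(withSubdomains.test("user@example..com")); // true

// Both patterns accept a label that starts with a hyphen.
console.log(singleLabel.test("user@-domain.com")); // true
console.log(withSubdomains.test("user@-domain.com")); // true
```

If those malformed hosts matter for your application, a per-label pattern such as the HTML5 one is a stricter alternative.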
The RFC 5322 Standard
The internet standard that defines the format of email addresses is RFC 5322 (which superseded RFC 2822 and RFC 822). According to this specification, the following are all technically valid email addresses:
# All valid per RFC 5322
user@example.com
"very.unusual.@.unusual.com"@example.com
user@[192.168.1.1]
user@[IPv6:2001:db8::1]
"much.more unusual"@example.com
"()<>[]:,;@\\\"!#$%&'-/=?^_`{}| ~.a"@example.org
admin@mailserver1
user+mailbox/department=shipping@example.com
Yes, quoted strings with spaces and special characters in the local part are valid. Yes, IP addresses in brackets are valid. Yes, single-label domains (without a dot) are valid in certain network configurations. The specification is remarkably permissive.
The RFC 5322 Regex (The Monster)
A regex that fully complies with RFC 5322 is enormous. Here is a simplified version that handles most of the spec (the true complete version is even longer):
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
This is not something you want to maintain. It is difficult to read, nearly impossible to debug, and any modification risks breaking it. More importantly, matching the RFC syntax does not prove that the email address actually exists or can receive mail.
Why RFC 5322 Compliance Is Impractical
There are several compelling reasons not to use a fully RFC-compliant regex in production:
- Maintenance burden: A regex this complex is a liability. No one on your team will be able to modify it with confidence.
- False sense of security: Passing the regex does not mean the address exists or can receive mail. valid-syntax@nonexistent-domain-abc123.com passes perfectly.
- Rejects real users: Being too strict also causes problems. Some valid patterns that users actually use (like +tags in Gmail) get rejected by overly restrictive patterns.
- No practical benefit: The obscure email formats allowed by RFC 5322 (quoted strings, IP literals) are almost never used in practice. Supporting them adds complexity without value.
The HTML5 Standard
The HTML5 specification defines its own email validation pattern for <input type="email">. The pattern used by browsers is:
^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$
This is intentionally more permissive than RFC 5322 in some areas and more restrictive in others. It is a pragmatic choice that handles the vast majority of real-world email addresses while remaining reasonably readable. If you want to match browser behavior, this is the pattern to use.
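The "more permissive in some areas, more restrictive in others" point can be illustrated with a handful of inputs. The sketch below copies the pattern above into a TypeScript regex literal (the forward slash is escaped for the literal syntax); the name html5Pattern and the sample addresses are illustrative.

```typescript
// The HTML5 email pattern, written as a regex literal.
const html5Pattern =
  /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

// More permissive than RFC 5322: a trailing dot in an unquoted local part is accepted.
console.log(html5Pattern.test("user.@example.com")); // true

// No TLD requirement: single-label domains are accepted.
console.log(html5Pattern.test("admin@mailserver1")); // true

// More restrictive than RFC 5322: quoted local parts and IP literals are rejected.
console.log(html5Pattern.test('"much.more unusual"@example.com')); // false
console.log(html5Pattern.test("user@[192.168.1.1]")); // false

// Domain labels must start and end with a letter or digit.
console.log(html5Pattern.test("user@-domain.com")); // false
console.log(html5Pattern.test("user@example..com")); // false
```

Note that, unlike the simpler patterns earlier in this guide, this one validates each domain label separately, which is why hosts with a leading hyphen or consecutive dots are rejected.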
Best Practices for Email Validation
After years of collective industry experience, the recommended approach to email validation has converged on a multi-layer strategy:
1. Client-Side: Use a Simple Regex
On the client side, use a simple pattern to catch obvious errors (missing @ sign, no domain). The goal is quick user feedback, not exhaustive validation:
// Good enough for client-side validation
const emailRegex = /^[\w.+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
function isValidEmail(email: string): boolean {
  return emailRegex.test(email.trim());
}
2. Server-Side: Validate Format + DNS
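On the server side, validate the format and optionally check that the domain has MX (mail exchange) records. Their presence indicates the domain is configured to receive email. A domain without MX records can still accept mail through its address (A/AAAA) records, so treat a failed lookup as a strong hint rather than proof: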
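// Server-side: check MX records after format validation
import dns from 'dns/promises';
async function validateEmailDomain(email: string): Promise<boolean> {
  // Use the last @ so an @ inside a quoted local part is not mistaken for the separator
  const at = email.lastIndexOf('@');
  if (at === -1 || at === email.length - 1) return false; // no domain to look up
  const domain = email.slice(at + 1);
  try {
    const records = await dns.resolveMx(domain);
    return records.length > 0;
  } catch {
    // The lookup failed (for example, the domain does not exist)
    return false;
  }
}
3. The Gold Standard: Send a Verification Email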
The only way to truly confirm an email address is deliverable is to send a verification email with a unique link. This is standard practice for account registration and is the most reliable validation method. No regex can replace it.
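One possible shape for the token behind such a link is sketched below, using Node's built-in crypto module. Everything named here (PendingVerification, createVerification, confirmVerification, the 24-hour lifetime, and the example.com URL) is a hypothetical illustration; storage and the actual sending of mail are left out.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical record shape; a real application would persist this in its database.
interface PendingVerification {
  email: string;
  tokenHash: string; // store only a hash so a leaked table does not expose usable links
  expiresAt: number; // epoch milliseconds
}

const sha256Hex = (value: string): string =>
  createHash("sha256").update(value).digest("hex");

// Creates a random token for the link plus the record to store for later comparison.
function createVerification(
  email: string,
  ttlMs: number = 24 * 60 * 60 * 1000, // assumed lifetime: 24 hours
  now: number = Date.now(),
): { token: string; record: PendingVerification } {
  const token = randomBytes(32).toString("hex");
  const record: PendingVerification = {
    email,
    tokenHash: sha256Hex(token),
    expiresAt: now + ttlMs,
  };
  return { token, record };
}

// Returns true only if the presented token matches the record and has not expired.
function confirmVerification(
  record: PendingVerification,
  presentedToken: string,
  now: number = Date.now(),
): boolean {
  return now <= record.expiresAt && sha256Hex(presentedToken) === record.tokenHash;
}

// Usage: the link (placeholder URL) is what gets emailed to the user.
const { token, record } = createVerification("user@example.com");
console.log(`https://example.com/verify?token=${token}`);
console.log(confirmVerification(record, token)); // true
```

The token is random rather than derived from the address, so it cannot be guessed from the email alone, and the expiry keeps old links from working indefinitely.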
Common Mistakes to Avoid
When implementing email validation, watch out for these frequent pitfalls:
- Rejecting + in the local part: Many users use + tags for filtering (e.g., user+shopping@gmail.com). Always allow the + character.
- Limiting TLD length: TLDs like .museum, .photography, and .international are longer than four characters. Use {2,} not {2,4}.
- Requiring specific TLDs: Do not hardcode a list of valid TLDs. New ones are added regularly.
- Case sensitivity: Email local parts are technically case-sensitive per the spec, but in practice almost all mail servers treat them as case-insensitive. Do not reject based on case, and consider normalizing to lowercase for storage.
- Trimming whitespace: Always trim the input before validation. Leading and trailing spaces are a common source of false negatives.
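Several of these pitfalls can be checked in a few lines. The sketch below is one way to do it, assuming that lowercasing the whole address is acceptable for your application; normalizeEmail, isAcceptableEmail, and the sample addresses are illustrative names and values.

```typescript
const lenientTld = /^[\w.+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
const cappedTld = /^[\w.+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$/; // the {2,4} mistake

// Trim surrounding whitespace and lowercase for storage.
// Assumption: treating the local part as case-insensitive is fine for your users.
function normalizeEmail(input: string): string {
  return input.trim().toLowerCase();
}

function isAcceptableEmail(input: string): boolean {
  return lenientTld.test(normalizeEmail(input));
}

console.log(normalizeEmail("  User+Shopping@Gmail.com ")); // "user+shopping@gmail.com"
console.log(isAcceptableEmail("  User+Shopping@Gmail.com ")); // true: + tag and stray spaces are fine
console.log(lenientTld.test("  User+Shopping@Gmail.com ")); // false: untrimmed input fails

console.log(lenientTld.test("curator@example.museum")); // true
console.log(cappedTld.test("curator@example.museum")); // false: TLD longer than four letters
```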
Testing Your Email Regex
Whatever pattern you choose, test it against these edge cases:
# Should PASS
user@example.com
firstname.lastname@example.com
user+tag@example.com
user@sub.domain.example.com
user123@example.co.uk
user-name@example.org
_______@example.com
# Should FAIL
plainaddress
@missing-local.com
user@
user@.com
user@-domain.com
user@ example.com
user @example.com
Try these test cases with our Regex Tester to see which patterns pass and fail for your chosen regex.
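If you prefer to run the cases in code, a small harness like the one below reports any address whose result disagrees with the expectation. It is a sketch: findMismatches is a hypothetical helper, and the two candidate patterns are the subdomain-aware pattern and the HTML5 pattern from earlier in this guide.

```typescript
const shouldPass: string[] = [
  "user@example.com",
  "firstname.lastname@example.com",
  "user+tag@example.com",
  "user@sub.domain.example.com",
  "user123@example.co.uk",
  "user-name@example.org",
  "_______@example.com",
];

const shouldFail: string[] = [
  "plainaddress",
  "@missing-local.com",
  "user@",
  "user@.com",
  "user@-domain.com",
  "user@ example.com",
  "user @example.com",
];

// Returns the addresses whose match result differs from the expected one.
function findMismatches(pattern: RegExp, addresses: string[], expected: boolean): string[] {
  return addresses.filter((address) => pattern.test(address) !== expected);
}

const subdomainPattern = /^[\w.+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
const html5Candidate =
  /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

for (const [name, pattern] of [
  ["subdomain", subdomainPattern],
  ["html5", html5Candidate],
] as const) {
  console.log(name, "wrongly rejected:", findMismatches(pattern, shouldPass, true));
  console.log(name, "wrongly accepted:", findMismatches(pattern, shouldFail, false));
}
```

With these lists, the subdomain-aware pattern wrongly accepts user@-domain.com because it does not check how domain labels begin, while the HTML5 pattern reports no mismatches.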
Similar Challenges: URL Validation
URL validation faces similar trade-offs between strict spec compliance (RFC 3986) and practical usability. If you are interested in how other common formats handle regex validation, check out our Regex Cheat Sheet for patterns that cover URLs, IP addresses, dates, and more.
Summary
For most applications, use a simple, readable regex for client-side format checking, verify the domain has MX records on the server, and confirm deliverability by sending a verification email. Resist the temptation to use the full RFC 5322 regex -- it adds complexity without meaningful benefit. The goal of validation is to catch typos and obvious errors, not to enforce a specification so broad that the HTML standard itself deliberately deviates from it.
Further Reading
- RFC 5322 — Internet Message Format
The IETF specification defining the addr-spec grammar for email addresses.
- OWASP Input Validation Cheat Sheet
OWASP guidance on validating user input including email addresses.
- HTML5 email input specification (WHATWG)
The browser-native email validation regex from the HTML Living Standard.