RFC 2822 Mailbox
Someone emailed me asking if I had a regex to validate an e-mail address. I didn't, so I just referred him to the regexlib, since I knew there were plenty there. But he said he had been there and none satisfied his criteria, which were basically RFC 2822. I was able to apply a quick fix to a regex he was trying to use, but later on, when I had some time, I looked closely at the spec and took a shot at writing a regex to validate against it.
First off, let me say this: the RFC 2822 specification is not just about an e-mail address. That's only a part of it. I only looked at the address portion, and even there I didn't try to handle the obsolete components.
Second, nothing in the spec that I read limits the domain name to com, org, or any of the top-level names in use, or limits the range of an IP address. So the spec's "mailbox" might not be an e-mail address as you commonly think of one.
Third, if you need a 100% rock-solid check for e-mail, don't use a regex. Yes, yes, regexes are great, but I'm not sure you can validate to the letter of the spec with a regex, mostly because one of the components (ccontent) is defined in terms of itself. It is basically:
A = B or C or D
D = (A), that is, an A (or multiple A's) enclosed in parentheses, with optional whitespace.
How would you handle that with a regex? I don't know, so I fudged that part.
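For what it's worth, the self-referencing part is easy enough in ordinary code, because a function can simply call itself. Here is a minimal Python sketch of that idea, not the code I actually wrote. The function name parse_comment is made up for illustration, and it is looser than the spec: it treats any character other than a parenthesis or backslash as legal comment text, which RFC 2822 does not.

```python
def parse_comment(text: str, pos: int = 0) -> int:
    """Return the index just past the comment starting at text[pos].

    Returns -1 if there is no well-formed comment at that position.
    Simplifying assumption: any character other than '(', ')' or a
    backslash is accepted as comment text.
    """
    if pos >= len(text) or text[pos] != "(":
        return -1
    pos += 1  # step past the opening parenthesis
    while pos < len(text):
        ch = text[pos]
        if ch == ")":
            return pos + 1  # this comment is closed
        if ch == "(":
            # A nested comment: the rule refers to itself, so recurse.
            pos = parse_comment(text, pos)
            if pos == -1:
                return -1
        elif ch == "\\":
            # Quoted pair: a backslash plus exactly one more character.
            if pos + 1 >= len(text):
                return -1
            pos += 2
        else:
            pos += 1
    return -1  # ran out of input before the closing parenthesis


print(parse_comment("(a (b) c) rest"))
print(parse_comment("(a (b c"))
```

The first call should find a complete comment and the second should report a failure, since the inner parenthesis is never closed.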
After playing around with it for a while I finally got it working, but even before I finished I realized that if I weren't as far along as I was, I would have done it differently, the way Darren talks about here. I did later go back and do it that way in code, and it's a lot easier to maintain. It's not optimized in any way, shape, or form. I ran a few simple tests and both versions worked. I didn't really stress them, as they both bogged down my system. I may post the regex on the regexlib or post the code here later if anyone has any interest in seeing them. It's a monster regex. I may just let sleeping dogs lie.
After working on both versions I realized that trying to validate to the specification is probably too much in most cases. An e-mail regex is probably a good candidate for the 80/20 rule, where it works 80% of the time but fails on those extremely rare examples.
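To give an idea of what I mean by 80/20, here is a rough Python sketch of a deliberately simple check. This is not the monster regex, and the pattern and the name looks_like_email are just my assumptions about what "good enough" might look like: something without spaces or @ signs, then an @, then a domain with at least one dot in it.

```python
import re

# Deliberately loose pattern (an assumption, not taken from RFC 2822):
# one or more non-space, non-@ characters, an @, then a dotted domain.
SIMPLE_EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_email(address: str) -> bool:
    """Return True if the address passes the simple everyday check."""
    return SIMPLE_EMAIL.fullmatch(address) is not None


print(looks_like_email("user@example.com"))
print(looks_like_email("no-at-sign"))
# A quoted local part with a space is allowed by the spec,
# but this simple check rejects it.
print(looks_like_email('"john doe"@example.com'))
```

The last case is the kind of rare address that a simple check gives up on in exchange for a pattern you can actually read.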