
Regex According to Jeff

The right tool for the right job

IsEmailAddress? Do you really mean, IsThisFormattedSomewhatLikeEmailAddressesMightBe?

So Eric Wise posted a small snippet, IsEmailAddress, that stars a long-winded regular expression to determine whether an input string is in fact an email address.  So IsEmailAddress(“try.sending.this.email@____________________________________________.zzzzz”) would return True for the suspect email.
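
To make the point concrete, here is a minimal Python sketch. Eric's actual expression is not reproduced in this post, so `LOOSE_EMAIL` and `is_formatted_like_email` are hypothetical stand-ins for a typical format-only check, and the number of underscores in the test address is arbitrary.

```python
import re

# Hypothetical format-only pattern; this is NOT the expression from Eric's snippet.
LOOSE_EMAIL = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)*\.[A-Za-z]{2,}")


def is_formatted_like_email(value: str) -> bool:
    """Return True if value merely looks like an email address."""
    return LOOSE_EMAIL.fullmatch(value) is not None


# A long run of underscores for the domain and a made-up top-level domain.
suspect = "try.sending.this.email@" + "_" * 40 + ".zzzzz"
print(is_formatted_like_email(suspect))
```

A check like this only answers whether the string is shaped like an address; it says nothing about whether mail sent there will ever arrive.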

I don’t want this to come across as flaming Eric, but I do feel it’s important to make a few points (they’re not directly pointed at Eric; the first one certainly is relevant to his snippet, but the rest are just more general rants):

1st, if you really want to use the email address for the purpose of sending email, there are far better ways to do this.  I’ve blogged about this before, but it’s worth repeating: email validation (sending a confirmation message to every new address) combined with parsing your bounce logs is a sure-fire way to satisfy the requirement that the email address works and is at least accessible to the registering user (whether they own it or not is a discussion for some other day and some other blog).  If you’re not into validation, which is probably the best method, you can try polling the SMTP server for the domain to see if the account exists; that might tell you the address is valid, but not necessarily that mail sent to it will reach the registering user.
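
As a rough illustration of the validation approach, here is a minimal Python sketch of a confirmation-token flow. The names `start_registration`, `confirm`, `pending`, and `confirmed` are all hypothetical, the in-memory dictionary stands in for whatever storage you really use, and actually sending the message (and reading bounce logs) is left out.

```python
import secrets

# In-memory stand-ins for a real data store (an assumption for this sketch).
pending: dict[str, str] = {}   # token -> email awaiting confirmation
confirmed: set[str] = set()    # emails proven reachable by the registrant


def start_registration(email: str) -> str:
    """Create a one-time token; the caller would mail a link containing it."""
    token = secrets.token_urlsafe(32)
    pending[token] = email
    return token


def confirm(token: str) -> bool:
    """Mark the address as working once the mailed token comes back."""
    email = pending.pop(token, None)
    if email is None:
        return False  # unknown or already-used token
    confirmed.add(email)
    return True
```

The address is only trusted after `confirm` succeeds, which requires that the message was delivered and that the registering user could read it. No pattern match can establish either of those facts.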

2nd, regular expressions are a tool, and like any other tool they are only appropriate for certain situations.  This isn’t to say you can’t go out of your way to use regex to solve problems at considerable expense to yourself, just for the sake of using regex.  Religious zealots abound: look at the junior programmer in the next cube who just learned about the Facade pattern and is trying to use it on every project he encounters, right or wrong.  Knowing the tools isn’t enough; knowing when it’s appropriate to use which tool is far more important.

3rd, if you’re really interested in what a valid email address is, go read RFC 822, browse down to page 7, and start looking through the lexical symbols.  If you think it’s an easy thing to do in a regex, just check out the number of varying results on regex lib and you’ll see pages of tries.  (And no, I’m not endorsing regex lib; I think it’s a fairly bad idea to insert a bit of code when you have no idea what it’s saying and are just hoping that it always works.)  If you want to see what the real regex looks like, go here: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html (and it still doesn’t tell you whether the email address exists, only that it meets the standard).
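
If what you actually need is to pull an address out of a header-style string, a dedicated parser is usually a better fit than a hand-rolled pattern. The sketch below uses `parseaddr` from Python's standard `email.utils` module; the wrapper name `extract_addr_spec` is hypothetical. It parses rather than strictly validates, and like any syntax check it cannot tell you whether the mailbox exists.

```python
from email.utils import parseaddr


def extract_addr_spec(header_value: str) -> str:
    """Return the bare address from a header-style value ('' if none is found)."""
    _display_name, addr = parseaddr(header_value)
    return addr


print(extract_addr_spec("Jeff <jeff@example.com>"))
```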

 

Published Thursday, March 31, 2005 1:34 PM by jeffrey

Comments

 

jeffrey said:

Yup. I continue to downvote and negative-comment nearly every entry at "regex lib".

Do not validate email addresses with a regex (unless it's the full regex, as you point out).

Do not parse HTML with a regex. HTML is surprisingly complex.

Do not validate a date with a regex. All these regexes I see that try to compute the number of days in February based on the year number just have me going "WTF!".

These are NOT regex tasks. These are dedicated tool tasks.

And yet, "regexlib" is full of them. And full of "it", if you know what I mean.
March 31, 2005 11:02 AM
 

jeffrey said:

To Jeff...

As usual, nice article. I think that both you and I have echoed these sentiments in the past, that is, use the right tools to validate things such as:

urls, filepaths, html, numbers, dates, times, etc...

Although it's a message that is well worth repeating.


To Randall...

Thanks for taking the time to make your feelings known. I've responded with some answers on my own blog:

http://weblogs.asp.net/dneimke/archive/2005/04/01/396595.aspx


April 1, 2005 2:48 AM
 
