
RegEx ReMarks

Some opinions, advice, and rants about regular expressions.

  • April RegexLib Update

    I've made some more improvements to the Regular Expression Library today.  Here's a summary:

    • Added a new Contributors page to make it easier to find expressions by user.
    • Improved the navigation and the look and feel of the top menu.
    • Added links to the blogs and forums to the main nav.
    • Added a link to the Contributors page to the side nav.

    Not a huge update, but I've had a few users contact me asking how they can find their expressions, so now it should be pretty simple to do so from the Contributors page.  You should also be able to log in and view My Regular Expressions.  If that doesn't work, please let me know ("It works for me").

  • Regular Expression Library Update

    I updated the Regular Expression Library last night and sent out emails to about 700 people letting them know.  I apologize that I neglected to include a way to edit your profile or reset your password, so for now users are stuck with ugly random strings as passwords.  I should have that corrected in the next day or two, at which point I'll send out another email (the last, I promise).

    The new site is running on its own separate database (it had been using the same one as ASPAlliance.com), uses more efficient data access and caching techniques, and now runs on ASP.NET 2.0 (which for now means little to users but will mean easier updates in the future).  Please let me know if you find anything wrong with the site (via the site's Contact page).

  • Regex 101 Posts by Eric Gunnerson

    I just ran across Eric Gunnerson's Regex 101 category of posts, which is dedicated to helping folks learn about regular expressions through a series of articles (I actually ran across his posts via the Regex Category of DotNetSlackers.com, a nice news site).  Anyway, check it out.  Many of the answers are in RegExLib.com, from what I saw (and if not, add them!).
  • Moved to CS 1.0

    Testing a new post now that the site has moved to Community Server 1.0.
  • You want to use a regex to do WHAT?

    This is in response to the comments and blog posts made on Jeffrey Schoolcraft's and Darren Neimke's blogs.

    I have to disagree with the blanket statement that regular expressions should never, ever be used to try to match email, date, or HTML patterns.  I'm almost at a loss here to even get into it, because the first thing that comes to mind is to just say "Oh come on now, seriously!"  Granted, the comment was left on April Fool's Day (although the server claims otherwise)... Anyway, in order to disprove the blanket statement, it is only necessary to come up with a single useful application for such expressions, preferably one for which a more code-intensive approach would not be well-suited.

    How about data entry?  I've written a few call center applications.  It's also common these days to use web-based data entry applications on intranets because of the easy updates web-based applications allow for.  With any data entry application, speed of entry is important, but so is accuracy.  The best time to correct something is as soon as it's been entered (through input masking or perhaps OnBlur).  To accomplish this in a web application typically means JavaScript.  JavaScript supports regular expressions quite well, but isn't really well-suited to communicating with mail servers.

    So, ideally, if you want the best user experience, you'll use regular expressions, wired up through JavaScript, for the emails, dates, phone numbers, etc. in the application.  Then, when a record is submitted, you can perform better validation.  It might be sufficient just to check with the regex that the email is well-formed and the date properly formatted, while the application uses code to try to verify that the email is valid (or ideally, requires the user to confirm it by responding to an emailed inquiry).  The application would also be responsible for ensuring that dates, beyond being properly formatted, are valid for whatever purpose they're serving (date of birth probably should be in the past, for instance).
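
    As a rough illustration of that split, here is a minimal sketch in Python (the patterns themselves would carry over to a JavaScript handler more or less unchanged).  The function names, the deliberately loose email pattern, and the MM/DD/YYYY date format are all assumptions made for the example, not anything taken from a real application.

```python
import re
from datetime import date, datetime

# Deliberately loose patterns (assumed for this sketch): they catch typos,
# not every invalid value.
EMAIL_SHAPE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
DATE_SHAPE = re.compile(r"\d{1,2}/\d{1,2}/\d{4}")  # assumes MM/DD/YYYY


def looks_like_email(text):
    """Cheap format check, suitable for immediate feedback on entry."""
    return EMAIL_SHAPE.fullmatch(text) is not None


def looks_like_date(text):
    """Cheap format check only; an impossible date such as 02/31/2020 passes."""
    return DATE_SHAPE.fullmatch(text) is not None


def valid_birth_date(text, today=None):
    """Stricter check for the submit step: a real calendar date in the past."""
    if not looks_like_date(text):
        return False
    try:
        parsed = datetime.strptime(text, "%m/%d/%Y").date()
    except ValueError:
        # Well-formed but not a real calendar date.
        return False
    return parsed < (today or date.today())
```

    The regex tier answers "does this look right?" as the user types; the code tier answers "is this actually usable?" when the record is submitted.  Confirming that an email address really exists would still need something like a confirmation message, which neither tier attempts here.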

    Thus, regular expressions can provide 'good enough' validation that catches most typographic errors and enhances the user experience by providing immediate feedback in scenarios where code-intensive validation is impractical.  Better validation should then be performed elsewhere, usually in a middle-tier layer that enforces business rules.

    This is the most obvious example, and one I've personally encountered many times as a consultant at different companies.  I'm sure there are other scenarios where the quickness and simplicity of a regex outweigh the additional accuracy that a more code-intensive approach might provide.  Again, it's a trade-off of what is 'good enough', and in many cases a regex will suffice even in situations where it doesn't catch every possible error.

  • Regex Character Classes

    I'm writing an article on regular expressions for MSDN Online and I've just finished a section on character classes.  Character classes, of course, are those sections of regular expression patterns enclosed in square brackets [ ].  One point I'm trying to express in the article, and one I think confuses many people, is that the grammar for character classes is wholly different from that of the rest of a regular expression.  They are basically a separate language within the language, with their own special syntax rules and metacharacters.  For instance, while $ and . are metacharacters anywhere else in a regular expression, they are simply literals within character classes (the backslash is the exception: in most flavors, .NET included, it still escapes inside a class).  Likewise, although the hyphen - is nothing special elsewhere in a regular expression, it denotes a range when it sits between two characters in a character class.

    This is sometimes confusing, as you'll see people writing a ZIP+4 expression with an escaped hyphen because they think it's a metacharacter (unnecessary: \d{5}\-\d{4}   cleaner: \d{5}-\d{4}).  Or escaping characters within a character class: to match a $ or a period, one might write [\$\.] when [$.] is all that's needed.  The escaped versions still match, but the extra backslashes are noise and a sign that the two grammars are being mixed up.
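
    These rules are easy to check for yourself.  The following is a small sketch using Python's re module (an assumption of convenience; the article itself targets .NET, and other flavors may differ at the edges).  In this flavor, at least, the escaped and unescaped forms match exactly the same strings, so the backslashes add clutter rather than changing behavior.  The sample strings are made up for the example.

```python
import re

# Outside a character class the hyphen is an ordinary literal,
# so escaping it changes nothing.
plain = re.compile(r"\d{5}-\d{4}")
escaped = re.compile(r"\d{5}\-\d{4}")
zip_matches = (
    plain.fullmatch("44240-1234") is not None,
    escaped.fullmatch("44240-1234") is not None,
)

# Inside a class, $ and . are literals with or without the backslash.
bare = re.compile(r"[$.]")
noisy = re.compile(r"[\$\.]")
class_hits = [(bare.findall(s), noisy.findall(s)) for s in ("$1.50", "abc")]

# Between two characters inside a class, the hyphen defines a range,
# so the literal hyphens and the 'd' in the input are not matched.
range_hits = re.findall(r"[a-c]", "a-d-c")
```

    Note that [$.] finds nothing in "abc": inside the brackets the period no longer means "any character".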

    Another common misconception is that character classes can be used to match words, like expecting [img] to match the string 'img' in a pattern.  Sure it'll match it if you give it a quantifier ([img]{3}), but it would also match 'iii' or 'gmi' or any other 3-letter combination of those three characters.  Just remember, character classes only refer to a single character in the pattern -- you can't use them for words (well, not words that need to have a specific order and composition of letters, anyway).
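
    A quick sketch of that difference, again in Python's re module with made-up candidate strings:

```python
import re

as_class = re.compile(r"[img]{3}")  # any three characters drawn from i, m, g
as_literal = re.compile(r"img")     # exactly the sequence i, m, g

candidates = ["img", "iii", "gmi", "mig", "imx"]

# fullmatch requires the whole string to match the pattern.
class_matches = [s for s in candidates if as_class.fullmatch(s)]
literal_matches = [s for s in candidates if as_literal.fullmatch(s)]
```

    Here class_matches ends up holding "img", "iii", "gmi", and "mig", while literal_matches holds only "img".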

  • First Post

    Thanks Darren and Scott for helping get regexadvice.com/blogs set up!

    Steve (regex stuff will come) Smith

    PS - Check out RegExLib.com if you don't know about it!
