
General discussion on usefulness of wildcards matching ZERO or more occurrences of preceding character

Last post 06-03-2008, 7:11 PM by Aussie Susan. 1 replies.
  •  06-03-2008, 2:27 PM 42866

    General discussion on usefulness of wildcards matching ZERO or more occurrences of preceding character

    I've just come across this website and am strangely happy to realise that I am not alone in my fetish for regexes. Finally I've found people with whom I can discuss the various regular expression issues that bother me.

    For a first discussion I thought I'd ask: what exactly is the point of matching zero or more occurrences of a preceding character? Stick in any character followed by a * and the regex will match every line, since every line either contains that character or it doesn't.

     
    I'm wondering what subtlety I'm missing (and I'm sure that it's me that's missing something, by the way).
     

  •  06-03-2008, 7:11 PM 42874 in reply to 42866

    Re: General discussion on usefulness of wildcards matching ZERO or more occurrences of preceding character

    Steve,

    On its own, as you have used it, there IS little value in it. However, when it is used in combination with other things, it is very useful.
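
    To see why the bare form tells you so little, here is a minimal Python sketch. The sample lines are made up for illustration; the point is only that a search for 'x*' succeeds everywhere, while 'x+' actually filters.

```python
import re

# Hypothetical sample lines, chosen only for illustration.
lines = ["xylophone", "banana", ""]

# A bare 'x*' finds a match in every line: the empty string at
# position 0 already counts as "zero occurrences of x".
results = [re.search(r"x*", line) is not None for line in lines]
print(results)  # [True, True, True]

# Requiring at least one occurrence ('x+') is what separates the lines.
filtered = [line for line in lines if re.search(r"x+", line)]
print(filtered)  # ['xylophone']
```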

    For example, in answering a question in the 'Construction' forum here, I used the following sub-pattern:

    (?=.*?\d)

    The poster wanted to make sure that there was at least one digit in the text (this was a password validation, if I remember). The digit was allowed to be anywhere in the string, so I needed a mechanism to skip over any non-digit characters that might precede it, hence the '.*?' part. But what if the digit was the first character in the string, which was perfectly legitimate according to the validation rules involved? By using the '*' quantifier, I allowed there to be 0 matches of the '.'.
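
    As a rough sketch of that idea in Python: I am assuming here that the sub-pattern is used as a positive lookahead, '(?=.*?\d)', at the start of the string, since the aim was to require a digit. The surrounding '^' and '.+$', the function name and the test strings are my own additions for illustration, not the pattern from the original thread.

```python
import re

# Lookahead: after zero or more characters there must be a digit.
# The lookahead consumes nothing; '.+' then matches the whole string.
has_digit = re.compile(r"^(?=.*?\d).+$")

# Same idea with '+' instead of '*': at least one character must
# come before the digit, so a leading digit no longer counts.
needs_prefix = re.compile(r"^(?=.+?\d).+$")


def contains_digit(password: str) -> bool:
    """Return True if the (hypothetical) password has a digit anywhere."""
    return has_digit.match(password) is not None


print(contains_digit("secret7"))  # True
print(contains_digit("7secret"))  # True: '.*?' matched zero characters
print(contains_digit("secret"))   # False

print(needs_prefix.match("7secret") is not None)  # False
```

    The last line shows what goes wrong without the "zero or more" behaviour: with '.+?' the only digit sits at position 0, where nothing can precede it, so a perfectly valid string is rejected.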

    Another example is matching a telephone number where there may be optional spaces between the digit groups (e.g. 9876 5432 or 98765432). The pattern

    \d{4}\s*\d{4}

    will match either form by allowing 0 or more of the preceding whitespace item ('\s').
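
    A quick way to check this, sketched in Python (the sample numbers beyond the two above are my own, added to show the edges of the pattern):

```python
import re

phone = re.compile(r"\d{4}\s*\d{4}")

# Illustrative inputs: the two forms from the post plus two extras.
samples = ["9876 5432", "98765432", "9876   5432", "9876-5432"]

# fullmatch requires the whole string to fit the pattern.
matches = [phone.fullmatch(text) is not None for text in samples]
for text, ok in zip(samples, matches):
    print(repr(text), ok)
```

    Note that '\s*' also accepts several spaces (or tabs) in a row, as the third sample shows. If at most one space should be allowed, '\s?' would be the tighter choice.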

    There is one subtle point that can arise when you make up a complete pattern that (in effect) can be blank - something like the somewhat artificial example:

    [a-z]*\d*

    where the intent is to allow 0 or more alphabetic characters followed by 0 or more digits. While it will match "asd123" and "asd" and "123" it will also match "" - probably not what was actually wanted. It is surprisingly easy to end up with something like this when you are dealing with complex matching rules.
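
    The following Python sketch shows the blank match, together with one possible guard. The '(?=.)' lookahead is just one option I am suggesting for illustration; checking for an empty string before applying the regex would work equally well.

```python
import re

loose = re.compile(r"[a-z]*\d*")

# One possible guard: a lookahead that demands at least one character.
strict = re.compile(r"(?=.)[a-z]*\d*")

samples = ["asd123", "asd", "123", ""]

loose_results = [loose.fullmatch(s) is not None for s in samples]
strict_results = [strict.fullmatch(s) is not None for s in samples]

print(loose_results)   # [True, True, True, True]
print(strict_results)  # [True, True, True, False]
```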

    Make sense?

    Susan
     
