
negative match on start of string

Last post 07-23-2012, 8:05 PM by Aussie Susan. 1 replies.
  •  07-19-2012, 10:48 AM 85823

    negative match on start of string

    I need an example of a JS regex that matches an embedded string. The following expression will match 'GUARDIAN' for both of the following strings.

    \b(AS\b)?\W*(THE\b)?\W*(CO\W*)?GUARDIANS?\W*(UNDER\b)?\W*(FOR\b)?\W*(OF\b)?\W*(THE\b)?\b\W*

    GUARDIAN ANGELS LLC

    GERALD DUNCAN GUARDIAN OF HAILEY DUNCAN

    I need the regex to match only the 2nd example, i.e. when the word is not found at the start of the field.

    Thanks

    Gerald

  •  07-23-2012, 8:05 PM 85851 in reply to 85823

    Re: negative match on start of string

    If all you are trying to do is to see if the word is 1) present in the string and 2) not at the start, then perhaps the standard string functions can do this for you (with lower overhead than the regex engine involves): search for the word and then check that the start index is greater than 0 (JavaScript's indexOf returns 0 for a match at the very start and -1 when the word is not found).
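
    As a rough sketch of the string-function approach, something like the following might do. The helper name containsNotAtStart is made up for illustration, and upper-casing both sides is an assumption to cope with the upper-case sample data.

```javascript
// Hypothetical helper: true when `word` occurs in `text`, but not at index 0.
function containsNotAtStart(text, word) {
  // Upper-case both sides so the comparison ignores case (an assumption).
  const index = text.toUpperCase().indexOf(word.toUpperCase());
  // indexOf returns -1 when not found and 0 for a match at the very start,
  // so only a strictly positive index counts.
  return index > 0;
}

console.log(containsNotAtStart("GUARDIAN ANGELS LLC", "GUARDIAN"));
console.log(containsNotAtStart("GERALD DUNCAN GUARDIAN OF HAILEY DUNCAN", "GUARDIAN"));
```

    Note that this is a plain substring test: it does not check word boundaries, and it only looks at the first occurrence of the word.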

    The "easy" way to do this as a pattern is to use:

    (?<!^)\bguardian\b

    but this requires a negative lookbehind that is not available in javascript.

    Therefore, you can use the fact that the word cannot be at the start of a line: therefore there must be some character before the start of "guardian". I've tried:

    [^^]\bguardian\b

    and

    .\bguardian\b

    (with the "singleline" option not set)

    and both work, but they also capture the character before the "g". If all you are doing is checking whether the word exists within the text (but not at the start), then that will do. If you want the word itself (why? you already KNOW that it will be "guardian") then you can simply discard the first character.
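
    One possible way to discard that leading character is to wrap the word in a capture group. This is only a sketch: the function name findEmbeddedWord is invented here, the pattern is a dot followed by a word boundary and the word, and the i flag is an assumption so that the upper-case sample data matches.

```javascript
// `.` consumes the one character that must precede the word;
// the group captures the word itself. The `i` flag is an assumption.
const notAtStart = /.\b(guardian)\b/i;

// Hypothetical helper: returns the word and its position, or null.
function findEmbeddedWord(text) {
  const m = notAtStart.exec(text);
  if (m === null) {
    return null;
  }
  // m.index points at the extra leading character, so the word starts one later.
  return { word: m[1], index: m.index + 1 };
}

console.log(findEmbeddedWord("GUARDIAN ANGELS LLC"));
console.log(findEmbeddedWord("GERALD DUNCAN GUARDIAN OF HAILEY DUNCAN"));
```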

    I'm not sure why you have made your pattern the way you have, but I suspect that you are trying to add additional context to the match, perhaps to make sure that the word is not at the start of the line. If that is the case, then I would suggest that you drop all of the pattern and use something like what I have suggested.

    The reason your pattern still matches the "guardian" at the start of the line is that all of the first part of the pattern is optional. The same applies to the last part.
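
    To see this for yourself, you could run the original pattern against the first sample string and look at what the groups captured. A small sketch (the variable names are only for illustration):

```javascript
// The pattern from the question, unchanged.
const original = /\b(AS\b)?\W*(THE\b)?\W*(CO\W*)?GUARDIANS?\W*(UNDER\b)?\W*(FOR\b)?\W*(OF\b)?\W*(THE\b)?\b\W*/;

const atStart = original.exec("GUARDIAN ANGELS LLC");

// The match begins at index 0 because nothing in front of GUARDIANS? is required.
console.log(atStart.index);
// Every optional group is left unmatched, so each entry is undefined.
console.log(atStart.slice(1));
```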

    There are ways of forcing the prefix to match something (and therefore ensuring that the key word is not at the start) as in:

    \b((as|the|co)\s+)+guardian\b

    and something similar for the suffix. However, it can be tricky to enforce the "rule" that there must be either a prefix or a suffix, especially in JavaScript, where there are syntax limitations that are not found in other regex variants. Unless you really need to do this, it is simpler to just look for what you really want and not over-complicate the pattern.
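
    As a quick sketch of the required-prefix pattern in use (the i flag and the "AS CO GUARDIAN..." test string are assumptions added for illustration):

```javascript
// At least one of the listed prefix words must come before the key word.
// The `i` flag is an assumption so that upper-case data matches.
const withPrefix = /\b((as|the|co)\s+)+guardian\b/i;

// No prefix word in front, so the word at the start is rejected.
console.log(withPrefix.test("GUARDIAN ANGELS LLC"));
// Made-up example with two prefix words.
console.log(withPrefix.test("AS CO GUARDIAN OF HAILEY DUNCAN"));
// "DUNCAN" is not one of the listed prefixes, so this is rejected as well.
console.log(withPrefix.test("GERALD DUNCAN GUARDIAN OF HAILEY DUNCAN"));
```

    The last line shows the catch: requiring a prefix from a fixed list also rejects the second sample string from the question, which is one more reason to prefer the simpler patterns.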

    Susan
