
RE for matching strings literally which may or may not contain '.'

Last post 11-08-2009, 5:20 PM by Aussie Susan. 2 replies.
  •  11-06-2009, 5:08 AM 57185

    RE for matching strings literally which may or may not contain '.'

    I want to build a pattern around a string that can contain the literal '.'. I'm fed a steady stream of input strings, and I know nothing about them in advance.

    Inputs could be, for example:  myexample.txt, dog, 1476.2.6, mail.yahoo.com, larry, etc.

    For each different string I need to construct a pattern for it.

    In other words, I want something which says: match this string literally, no matter what characters it has. Maybe more specifically: match this string of N characters literally, where N = strlen(input).

    I realize that I can deal with this crudely by copying the string and escaping each "." (or other special character) as I go, and then calling sprintf(pattern_buf, " .... %s ... ", ..., massaged_input, ...);

    But, there must be a slicker way of doing it. One that doesn't rely on my examining and perhaps massaging each input string as it arrives. The ways I've tried have failed.

    (There's more to it than this. This is an inner string, and it forms only part of the RE. I'm given the other parts separately, but they can contain only alphanumerics. All the fields are separated by '.', which I add myself. This may seem to make '.' ambiguous, since it is both a field separator and a character that can appear in a field name. But the known positioning of the various fields eliminates this as a problem.)

     

  •  11-07-2009, 3:13 AM 57199 in reply to 57185

    Re: RE for matching strings literally which may or may not contain '.'

    Use a RegEx to match the characters you need to escape, then replace each occurrence with a "\" followed by the character. Unless your regex flavor has a built-in literal-quoting feature (for example \Q...\E in Perl, PCRE and Java), there is no other way to do this. Just like you have to escape characters in a programming language, you have to escape "metacharacters" (e.g. ^ $ .) in a regular expression.

    RegEx: [\\.*?+\{\}\[\]^$|\(\)]

    Replacement: \\$0     (a '\' followed by the 0th group - the match)
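
    As a rough illustration, the RegEx and replacement above might be applied like this. It is a minimal sketch in Python rather than the original poster's C. The names escape_literal and build_pattern, and the prefix.inner.suffix layout, are assumptions made up for the example and are not taken from the thread.

```python
import re

# Metacharacter class from the post above; every match gets a backslash in front.
METACHARS = re.compile(r'[\\.*?+\{\}\[\]^$|\(\)]')


def escape_literal(text):
    """Return text with each regex metacharacter backslash-escaped."""
    # In the replacement template, \\ is one literal backslash
    # and \g<0> is the whole match (the "0th group").
    return METACHARS.sub(r'\\\g<0>', text)


def build_pattern(prefix, inner, suffix):
    """Hypothetical layout: prefix.inner.suffix, anchored at both ends."""
    # prefix and suffix are assumed to be alphanumeric, so only inner is escaped.
    return '^' + prefix + r'\.' + escape_literal(inner) + r'\.' + suffix + '$'


if __name__ == '__main__':
    # Sample inputs taken from the original question.
    for raw in ['myexample.txt', 'dog', '1476.2.6', 'mail.yahoo.com', 'larry']:
        print(raw, '->', build_pattern('abc', raw, 'xyz'))
```

    Python's standard library also offers re.escape for the same job; the explicit substitution is shown only because it mirrors the RegEx/replacement pair and ports to flavors that lack such a helper.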


    Check out http://codesaway.info/RegExPlus/, Java regex library supporting many features including named capture groups, conditionals, numeric ranges, Perl's branch reset pattern, ...
  •  11-08-2009, 5:20 PM 57218 in reply to 57185

    Re: RE for matching strings literally which may or may not contain '.'

    If this is part of a larger pattern (as your last paragraph states), then I'm not sure what your comments about a "field separator" are all about.

    Perhaps if you give us a complete explanation of your problem, some "before" and "after" examples, a snippet of your code, and the regex variant you are using (as requested in the posting guidelines in the sticky note at the beginning of this forum), it might help.

    Susan
