
Regex Noob here

Last post 07-31-2010, 7:37 PM by mash. 11 replies.
  •  07-30-2010, 3:21 AM 70252

    Regex Noob here

    I am looking for strings that may or may not start with a single S or N, followed by 2 or 3 digits, followed by a '-', followed by another 2 or 3 digits. I came up with this regex: [SN]?[0-9]{2,3}-[0-9]{2,3}

    I want to modify this regex so that there are no '-' characters or digits immediately before or after the match. So I do not want to match any part of 12-10-10, or any part of S1122-3344. The regex above matches 12-10 from the first example and 122-334 from the second. Adding (?![-0-9]) to the end of the regex eliminates the 12-10 match and the 122-334 match, but now 10-10 matches in the first example. I am getting close but am not quite there yet.

    Thanks for any help.

    Al
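
    The behaviour described in the question can be reproduced with a few lines of C#. This is only a minimal sketch: the class name PatternCheck and the helper FirstMatch are made up for illustration, and the two patterns are copied from the post above.

```csharp
using System;
using System.Text.RegularExpressions;

static class PatternCheck
{
    // The original pattern, and the same pattern with the trailing lookahead added.
    public const string Original = @"[SN]?[0-9]{2,3}-[0-9]{2,3}";
    public const string WithLookahead = Original + @"(?![-0-9])";

    // Returns the text of the first match, or an empty string when nothing matches.
    public static string FirstMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : "";
    }

    public static void Main()
    {
        foreach (string input in new[] { "12-10-10", "S1122-3344" })
        {
            Console.WriteLine("{0}: original='{1}', with lookahead='{2}'",
                input,
                FirstMatch(Original, input),
                FirstMatch(WithLookahead, input));
        }
    }
}
```

    The lookahead only guards the right-hand side of the match. In 12-10-10 the engine rejects 12-10 because a '-' follows it, moves on, and then finds 10-10 at the end of the string, where nothing checks what comes before it.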

  •  07-30-2010, 4:05 AM 70253 in reply to 70252

    Re: Regex Noob here

    Apparently lookbehind is not supported by all of the major regex engines, so I cannot use something like (?<![-0-9])([SN]?[0-9]{2,3}-[0-9]{2,3})(?![-0-9]). Is there a way to achieve this without using lookbehind?
  •  07-30-2010, 4:05 AM 70254 in reply to 70252

    Re: Regex Noob here

    sorry. duplicate.

  •  07-30-2010, 4:06 AM 70255 in reply to 70252

    Re: Regex Noob here

    sorry. dupes.

  •  07-30-2010, 9:12 AM 70268 in reply to 70255

    Re: Regex Noob here

    If I understand your question correctly, this could be a solution:

     [^-\d][SN]?[0-9]{2,3}-[0-9]{2,3}[^-\d]

     The difference from your regex is that [^-\d] is added before and after it.

     

    Regards, Tom Pester

  •  07-30-2010, 11:12 AM 70289 in reply to 70268

    Re: Regex Noob here

    The regex you suggested includes the two surrounding characters (the ones that are neither digits nor '-') in the result. For example, in the string 'QS123-324 ', I am only interested in matching S123-324, but your regex matches the entire string, including the leading Q and the trailing space.

     Thanks for responding.
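
    A small C# check makes the point concrete. It is a sketch only; the class name ConsumingClassCheck is invented for the example. It also shows a second consequence of using [^-\d] on both sides: each class must consume a real character, so an ID sitting at the very start or end of the input is not found at all.

```csharp
using System;
using System.Text.RegularExpressions;

static class ConsumingClassCheck
{
    // Suggested pattern: one non-dash, non-digit character is consumed on each side.
    public const string Pattern = @"[^-\d][SN]?[0-9]{2,3}-[0-9]{2,3}[^-\d]";

    public static void Main()
    {
        // The neighbouring characters become part of the match: prints [QS123-324 ]
        Console.WriteLine("[" + Regex.Match("QS123-324 ", Pattern).Value + "]");

        // Nothing precedes or follows the ID, so the pattern cannot match: prints False
        Console.WriteLine(Regex.IsMatch("S123-324", Pattern));
    }
}
```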

  •  07-30-2010, 11:58 AM 70293 in reply to 70289

    Re: Regex Noob here

    Use parentheses "()" to create a group that you can reference later:

    [^-\d]([SN]?[0-9]{2,3}-[0-9]{2,3})[^-\d]

    In C#, the string you are interested in is now contained in group 1:


        // Requires: using System.Text.RegularExpressions;
        // subjectString is the text to search.
        Regex regexObj = new Regex(@"[^-\d]([SN]?[0-9]{2,3}-[0-9]{2,3})[^-\d]");
        Match matchResults = regexObj.Match(subjectString);
        while (matchResults.Success) {
            // Group 0 is the whole match; start at 1 to visit only the capturing groups.
            for (int i = 1; i < matchResults.Groups.Count; i++) {
                Group groupObj = matchResults.Groups[i];
                if (groupObj.Success) {
                    // matched text (the ID without its neighbours): groupObj.Value
                }
            }
            matchResults = matchResults.NextMatch();
        }
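
    A possible refinement of the grouping approach, sketched here under the assumption that the target engine supports lookahead (the original poster reports already using (?![-0-9])): allow the start of the input as an alternative to the leading character, and check the trailing character with the lookahead instead of consuming it. The names IdFinder and FindIds are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class IdFinder
{
    // (?:^|[^-0-9]) : start of input, or one character that is neither '-' nor a digit
    // ( ... )       : group 1, the ID itself
    // (?![-0-9])    : the next character, if any, is neither '-' nor a digit
    public const string Pattern =
        @"(?:^|[^-0-9])([SN]?[0-9]{2,3}-[0-9]{2,3})(?![-0-9])";

    // Collects group 1 of every match found in the input.
    public static List<string> FindIds(string input)
    {
        var ids = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            ids.Add(m.Groups[1].Value);
        }
        return ids;
    }

    public static void Main()
    {
        string[] samples = { "QS123-324 ", "S123-324", "12-34 56-78", "12-10-10", "S1122-3344" };
        foreach (string input in samples)
        {
            Console.WriteLine("'{0}' -> [{1}]", input, string.Join(", ", FindIds(input)));
        }
    }
}
```

    Because the lookahead does not consume the character after an ID, that character is still available as the leading character of the next match. That is why both IDs in "12-34 56-78" are found even though a single space separates them.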

     

  •  07-30-2010, 3:54 PM 70303 in reply to 70293

    Re: Regex Noob here

    Yeah, I know I could use an additional programming language to fill in the gaps, but I am trying to stick with regex.

     

    Thanks for the response.

  •  07-30-2010, 5:27 PM 70304 in reply to 70303

    Re: Regex Noob here

    If grouping isn't an option (that is, everything has to happen in one regex), then a lookaround construct could help here. Which language are you using that has problems with lookaround?
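
    For an engine that does support both kinds of lookaround, such as .NET, the pattern with a lookbehind and a lookahead from earlier in the thread can be tried directly. A minimal sketch; LookaroundCheck and FirstMatch are illustrative names only.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaroundCheck
{
    // Neither assertion consumes text, so the match is exactly the ID.
    public const string Pattern =
        @"(?<![-0-9])[SN]?[0-9]{2,3}-[0-9]{2,3}(?![-0-9])";

    // Returns the text of the first match, or an empty string when nothing matches.
    public static string FirstMatch(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Success ? m.Value : "";
    }

    public static void Main()
    {
        foreach (string input in new[] { "QS123-324 ", "12-10-10", "S1122-3344" })
        {
            Console.WriteLine("'{0}' -> '{1}'", input, FirstMatch(input));
        }
    }
}
```

    With these inputs, only 'QS123-324 ' produces a match (S123-324); the other two samples produce none.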
  •  07-30-2010, 5:41 PM 70305 in reply to 70303

    Re: Regex Noob here

    alphy:

    Yeah, I know I could use an additional programming language to fill in the gaps, but I am trying to stick with regex.

     

    Thanks for the response.

    I don't understand this statement. You must use some sort of host environment to use regex, and if that is a programming language you will need it to process the result in some fashion. What language/application are you planning to do this in, and what does "stick with regex" mean?

    As to your question: depending on your input source, the solution could simply be to add a boundary. If the samples in your original post are the full input, adding an end-of-data boundary to your original pattern should suffice. If your input is a larger piece of text, that would need a different pattern. http://regexadvice.com/blogs/mash/archive/2007/10/01/Remember-where-you-come-from.aspx
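
    For the first case, where each input string is a single candidate, a sketch of the boundary idea could look like the following. It assumes "boundary" means anchoring the pattern at both the start and the end of the data; WholeInputCheck and IsId are invented names.

```csharp
using System;
using System.Text.RegularExpressions;

static class WholeInputCheck
{
    // ^ and $ tie the pattern to the start and end of the input.
    public const string Anchored = @"^[SN]?[0-9]{2,3}-[0-9]{2,3}$";

    // Same pattern with only the end anchor, for comparison.
    public const string EndOnly = @"[SN]?[0-9]{2,3}-[0-9]{2,3}$";

    public static bool IsId(string input)
    {
        return Regex.IsMatch(input, Anchored);
    }

    public static void Main()
    {
        foreach (string input in new[] { "S123-324", "12-10-10", "S1122-3344" })
        {
            Console.WriteLine("'{0}' -> {1}", input, IsId(input));
        }

        // With the end anchor alone, the tail of 12-10-10 still matches: prints 10-10
        Console.WriteLine(Regex.Match("12-10-10", EndOnly).Value);
    }
}
```

    This only helps when the whole input is expected to be one ID. It does not find an ID embedded in a longer piece of text.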


    Michael

    "In theory, theory and practice are the same. In practice, they are not."
    Albert Einstein
  •  07-31-2010, 6:23 PM 70323 in reply to 70305

    Re: Regex Noob here

    Thanks for responding, Mash (Mike?). I meant that I wanted the parsing and matching logic to stay in the regex so that I could use any programming language (VB, SQL, C#, etc.) that supports regexes. I am trying this in VBA but would eventually like to run this regex in Oracle PL/SQL. Any additional parsing would be a bit of a pain in PL/SQL.

    The source is a text string about 100 characters long. An ID could be anywhere in there. An ID could be preceded by any character and followed by any character. If the preceding or following character is a dash or a digit, then the string could have a different meaning (it could then be a date, for example).

     

  •  07-31-2010, 7:37 PM 70324 in reply to 70323

    Re: Regex Noob here

    alphy:

    Thanks for responding, Mash (Mike?). I meant that I wanted the parsing and matching logic to stay in the regex so that I could use any programming language (VB, SQL, C#, etc.) that supports regexes. I am trying this in VBA but would eventually like to run this regex in Oracle PL/SQL. Any additional parsing would be a bit of a pain in PL/SQL.

    The source is a text string about 100 characters long. An ID could be anywhere in there. An ID could be preceded by any character and followed by any character. If the preceding or following character is a dash or a digit, then the string could have a different meaning (it could then be a date, for example).

     

    While I can understand why you would want to do that, it is not necessarily a practical approach to using regexes. In truth, it is a bad idea. The reason we ask in the posting guidelines which programming language you are using is that regular expression implementations are not uniform across languages, and some patterns created for one implementation will not be portable. You have no guarantee that a regex written for one language will work in another. Implementations differ in the features they support and in syntax, and the differences are not minor.

     Buckley's suggestion is still very sound, as it uses widely supported features, and it uses a very powerful basic feature: grouping. http://regexadvice.com/blogs/mash/archive/2007/06/01/You_2700_ve-got-your-sub_2D00_matches-in-my-matches.aspx

    Logically, provided the implementation supports them, your pattern and your coding logic don't change, but your implementation will, since it has to anyway. You won't be using the same code for VBScript that you'd use for C#. You could use the same pattern and logic for C#, or use a different pattern, since C#'s regex engine is much more powerful.


    Michael

    "In theory, theory and practice are the same. In practice, they are not."
    Albert Einstein