
Trying to capture a series of values.

Last post 07-14-2010, 11:20 AM by EtanSivad. 2 replies.
  •  07-13-2010, 5:00 PM 69530

    Trying to capture a series of values.

    Howdy all,

    I've got this incoming string "132 / 456 / 789" (or sometimes even "123/456/789", but the numeric values should always be separated by /) and I'm trying to capture just the numbers and move them into variables.  My RegEx is:

    ([0-9]+)(?:[^0-9]*)([0-9]+)(?:[^0-9]*)([0-9]+)(?:[^0-9]*)

    So, capture a set of numbers, don't capture a set of non-numbers, repeat 3 times.  This works great if there are exactly 3 sets of numbers.

    Here's my test JavaScript:

    var testVar = "123 / 456 / 789";
    var reg_exp1 = /([0-9]+)(?:[^0-9]*)([0-9]+)(?:[^0-9]*)([0-9]+)(?:[^0-9]*)/;
    var testVarReg1 = testVar.replace(reg_exp1, "$1");
    var testVarReg2 = testVar.replace(reg_exp1, "$2");
    var testVarReg3 = testVar.replace(reg_exp1, "$3");

    document.write("Capture1: " + testVarReg1);
    document.write(" Capture2: " + testVarReg2);
    document.write(" Capture3: " + testVarReg3);

    This produces: "Capture1: 123 Capture2: 456 Capture3: 789" which is exactly what I want.

     

    The problem is that sometimes there will only be 2 numbers, such as "123 / 456".  This causes weird things to happen.

    Only 2 numbers produces: Capture1: 123 Capture2: 45 Capture3: 6

     

    Any suggestions on how to fix this?  Is there an easier way to accomplish this?

     

    Thanks,

  •  07-13-2010, 7:46 PM 69536 in reply to 69530

    Re: Trying to capture a series of values.

    Several suggestions come to mind. The simplest is to use the "split" function with a pattern of:

    /

    This will return an array with each element being one of the numbers (possibly with the surrounding whitespace). In this case you have the option of using the (relatively lightweight) string "split" function or the (relatively heavyweight) regex split operation; either will do, as you are splitting on a literal character.
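
    As a rough sketch of the split approach, assuming JavaScript (going by the test script in the first post): splitValues is a hypothetical helper name, and the trim step is my addition to drop the surrounding whitespace.

```javascript
// Sketch only: split on the literal "/" and trim whitespace around each piece.
// splitValues is a hypothetical helper name, not something from the original post.
function splitValues(input) {
    return input.split("/").map(function (part) {
        return part.trim();
    });
}

console.log(splitValues("123 / 456 / 789")); // [ '123', '456', '789' ]
console.log(splitValues("123/456"));         // [ '123', '456' ]
```

    Note that nothing here checks that the pieces are actually digits.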

    If you really want to follow the approach you have used, then try a pattern of:

    ^(\d+)\s*/\s*(\d+)(\s*/\s*(\d+))?$

    This is basically what you have, except that I've used the '\d' shorthand for '[0-9]', and I've used the actual separator character (the '/') with optional whitespace rather than the "anything but a digit" ('[^0-9]') that you use. I've forced the match to run from the start (the '^') to the end (the '$') of the string (if the value is actually embedded within a longer string then take these out), and I've made the last part optional.

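    If all 3 values are given, then look in match groups #1, #2 and #4 for the values (group #3 is the optional wrapper around the last separator and value). If only 2 are given then match group #4 will be empty: a null string or an unset group, depending on the regex variant.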
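
    A minimal sketch of how this pattern might be used, assuming JavaScript. The '/' has to be escaped inside a regex literal, and captureValues is a hypothetical helper name.

```javascript
// Sketch only: the pattern above as a JavaScript regex literal ("/" escaped as "\/").
var pattern = /^(\d+)\s*\/\s*(\d+)(\s*\/\s*(\d+))?$/;

// captureValues is a hypothetical helper name used for illustration.
function captureValues(input) {
    var m = pattern.exec(input);
    if (m === null) {
        return null; // input is not 2 or 3 slash-separated numbers
    }
    // Groups #1, #2 and #4 hold the values; group #3 is the optional wrapper.
    // In JavaScript an unmatched group is undefined rather than an empty string.
    return [m[1], m[2], m[4]];
}

console.log(captureValues("123 / 456 / 789")); // [ '123', '456', '789' ]
console.log(captureValues("123 / 456"));       // [ '123', '456', undefined ]
console.log(captureValues("123"));             // null
```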

    This pattern is limited to matching either 2 or 3 values whereas the split approach will match any number of items, separated by a slash (but does not check that they are only digits).

    The reason your pattern behaves the way it does is that you require the 3rd value to be present. The only way this can happen is for the 2nd value match to give up a digit. It's still happy, as it has "one or more", but this allows the 3rd part to succeed, and the regex engine favours results that provide an overall success to a match.
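
    The backtracking can be seen directly by looking at the capture groups. This is a small sketch assuming JavaScript; the variable names are only for illustration.

```javascript
// The original pattern from the first post, unchanged.
var original = /([0-9]+)(?:[^0-9]*)([0-9]+)(?:[^0-9]*)([0-9]+)(?:[^0-9]*)/;

// slice(1) drops the full match and keeps only the three capture groups.
var threeValues = original.exec("123 / 456 / 789").slice(1);
var twoValues = original.exec("123 / 456").slice(1);

console.log(threeValues); // [ '123', '456', '789' ]
console.log(twoValues);   // [ '123', '45', '6' ]  (group 2 gave up the "6")
```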

    It would also help if you told us the regex variant you are using as requested in the posting guidelines in the sticky note at the beginning of this forum.

    Susan

  •  07-14-2010, 11:20 AM 69550 in reply to 69536

    Re: Trying to capture a series of values.

    Aussie Susan:

    It would also help if you told us the regex variant you are using as requested in the posting guidelines in the sticky note at the beginning of this forum. 

    Susan

     

    Whoops, sorry.  It's the JavaScript regex variant.

     

    Thanks for the info.  I figured out my problem was really the + in ([0-9]+) since that requires a match.

    Your explanation helps me understand why it's doing the weird split.  Entering ([0-9]{0,3})(?:[^0-9]*) seems to work exactly the way I intended.  I will play around with your regex as well.
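
    For reference, a sketch of what that change might look like, assuming the {0,3} quantifier replaces the + in all three groups of the original pattern (the names relaxed and captures are just for illustration):

```javascript
// Assumption: each of the three ([0-9]+) groups becomes ([0-9]{0,3}).
var relaxed = /([0-9]{0,3})(?:[^0-9]*)([0-9]{0,3})(?:[^0-9]*)([0-9]{0,3})(?:[^0-9]*)/;

// Every part of the pattern is optional, so exec always finds a match here.
function captures(input) {
    return relaxed.exec(input).slice(1);
}

console.log(captures("123 / 456 / 789")); // [ '123', '456', '789' ]
console.log(captures("123 / 456"));       // [ '123', '456', '' ]
console.log(captures("1234 / 5"));        // [ '123', '4', '5' ]
```

    The last line shows a possible catch: a value longer than 3 digits gets split across two groups.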

     

    Thanks for the feedback.  I feel like I've got a better understanding of the logic of matching to a capture.
