
Regular Expression for LeadLag

Last post 08-02-2011, 7:41 PM by Aussie Susan. 5 replies.
  •  06-12-2011, 3:02 AM 82788

    Regular Expression for LeadLag

    Hi All,

    I need a regular expression to validate some text.

    The text may be in any of the formats below:

    LeadLag(22,+@2,33)

    LeadLag(22,-@2,33)

    LeadLag(22,-@2)

    LeadLag(22,33)

    In place of 22, 33 and 2 there might be any number.

    I do not know much about how to write a regular expression for strings like those above.

    Can anybody please help me out?

    Thanks in Advance,
    Mahesh Marathe

  •  06-12-2011, 6:58 AM 82789 in reply to 82788

    Re: Regular Expression for LeadLag

    Hi Mahesh,

    That will do it:

     ^LeadLag\(\d{1,3},[\d\+\-][@\d]?[\d]*\)?[,$]?\d{0,3}\)?$?

    I assumed that:

    1) The first number has 1 to 3 digits, followed by a comma.

    2) The second number has 1 to 3 digits; additionally you CAN have + or - in the first place and @ in the second place.

    3) There CAN be a third number, 1 to 3 digits.

    Greetz

    Artur

  •  06-12-2011, 9:27 AM 82790 in reply to 82789

    Re: Regular Expression for LeadLag

    Hi Artur,

    Thanks for your quick reply

    I tested this expression against -->   LeadLag(100,101+102@103,104)

    It allows numbers after the first ',' and before the '@',
    i.e. 101 and 102.

    Our string only contains + or - and then @ after the first ',',
    e.g. --> ',+@103'  OR ',-@103'

    Could you please modify the expression accordingly?

    Please let me know if you need any more information.

    Regards,
    Mahesh

  •  06-13-2011, 7:44 PM 82810 in reply to 82790

    Re: Regular Expression for LeadLag

    I don't wish to be rude to either you or Artur, as both of you are new to this forum, but this is an example of what happens when you don't follow the posting guidelines in the sticky note at the beginning of this forum (which specifically ask that you don't use made-up examples), you don't fully specify the various alternative character sequences that can occur, and you don't test pattern suggestions against real examples (including negative matches).

    The first problem is that you have not told us exactly how the various numbers can be formed. For example, I think the first and (optional) 3rd numbers can be any number of digits, but the structure of the 2nd number is only defined in your last posting, not the first.

    The reason Artur's suggested pattern doesn't work (in that it would seem to match what I assume is the invalid string in your last posting) is that all of the ending part of the pattern is optional - it will match the first part and then not care if the latter parts don't match. If you split apart his pattern:

    ^LeadLag\(\d{1,3},
    [\d+-]
    [@\d]?[\d]*\)?
    [,$]?\d{0,3}\)?$?

    (and I've taken out the unnecessary escape characters) you will see that the first 2 lines are all required but the last 2 lines are entirely optional (as well as logically wrong in a couple of places). For "validation" (meaning a "pass/fail" result for the entire string) you have to create a pattern that will ALWAYS try to match to the end of the string and not just leave things half way. When part of the pattern is optional and it does NOT match the text, the regex engine will effectively skip over that part and still think it is a successful match.
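
    To make the "skips over the optional parts" behaviour concrete, here is a minimal sketch in Python (my choice of language; the thread does not say which engine is in use). It assumes the "ignore case" option, and it drops the trailing '$?' from Artur's pattern because Python's re module does not accept a quantifier on an anchor, and an optional '$' asserts nothing anyway.

```python
import re

# Artur's pattern without the trailing "$?" (an optional end-of-string
# assertion never constrains the match, so nothing is lost by dropping it).
# re.IGNORECASE is an assumption; the original post names no options.
ARTUR = re.compile(
    r"^LeadLag\(\d{1,3},[\d+-][@\d]?[\d]*\)?[,$]?\d{0,3}\)?",
    re.IGNORECASE,
)

text = "LeadLag(100,101+102@103,104)"

# match() reports success, but only a prefix of the input was consumed:
# every optional piece after "101" simply matched nothing.
partial = ARTUR.match(text)
print(partial.group(0))      # LeadLag(100,101

# Demanding that the whole string be consumed exposes the failure.
print(ARTUR.fullmatch(text))  # None
```

    The same idea applies in any engine: for validation, anchor both ends and make sure nothing between the anchors can silently match zero characters in place of required text.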

    I *think* the reason the last part of the pattern is all optional is that Artur is attempting to cater for the optional 3rd value: the '[\d+-][@\d]?[\d]*\)?[,$]?' part - if re-written as '[\d+-][@\d]?\d*\)$' - would seem to be an attempt to match the 2nd parameter and a missing 3rd parameter. Similarly the '[,$]?\d{0,3}\)?$?' - if re-written as ',\d{0,3}\)$' - would appear to match when there is a 3rd parameter. (By the way, placing the '$' within a character set definition stops the '$' having its special meaning of a zero-width assertion for the end of the line - it becomes a literal match against the "dollar" character.)
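
    The point about '$' inside a character set can be checked with a short Python sketch (Python is an assumption here; the variable names are only for illustration):

```python
import re

# Outside a character set, "$" is a zero-width end-of-string assertion.
anchor_ok = bool(re.fullmatch(r"LeadLag\(22\)$", "LeadLag(22)"))

# Inside a character set, "$" is an ordinary character that must be present.
literal_without_dollar = bool(re.fullmatch(r"LeadLag\(22\)[$]", "LeadLag(22)"))
literal_with_dollar = bool(re.fullmatch(r"LeadLag\(22\)[$]", "LeadLag(22)$"))

print(anchor_ok)               # True
print(literal_without_dollar)  # False
print(literal_with_dollar)     # True
```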

    However, the effect of this is to not require the pattern to match in either case. I would suggest something like:

    ^leadlag\(\d+,[+-]?@?\d+(,\d+)?\)$

    (with the "ignore case" option set). This does several things: it assumes that the first and (optional) 3rd values can be any number of digits, it follows the new rules you have specified for the 2nd number's format, and it makes the 3rd number optional while still forcing the match to go through to the end of the string.

    Given the set of examples:

    LeadLag(22,+@2,33)
    LeadLag(22,-@2,33)
    LeadLag(22,-@2)
    LeadLag(22,33)
    LeadLag(100,101+102@103,104)
    Leadlag(321,3@123)$123)

    (where the last one is one that I made up that also matches Artur's pattern completely), my suggested pattern will match the first 4 only.
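
    A quick way to confirm that claim is to run the pattern over the six examples. This is a sketch in Python (an assumption; any engine with an "ignore case" option should behave the same way for this pattern):

```python
import re

# The suggested pattern, with the "ignore case" option set.
SUSAN = re.compile(r"^leadlag\(\d+,[+-]?@?\d+(,\d+)?\)$", re.IGNORECASE)

examples = [
    "LeadLag(22,+@2,33)",
    "LeadLag(22,-@2,33)",
    "LeadLag(22,-@2)",
    "LeadLag(22,33)",
    "LeadLag(100,101+102@103,104)",
    "Leadlag(321,3@123)$123)",
]

# Map each example to a pass/fail result for the whole string.
results = {s: bool(SUSAN.match(s)) for s in examples}
for s, ok in results.items():
    print(f"{'match' if ok else 'no match':9} {s}")
```

    The first four lines print "match" and the last two print "no match". Keeping negative examples in the list is what makes this a useful validation test.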

    Susan

  •  08-02-2011, 8:30 AM 83622 in reply to 82810

    Re: Regular Expression for LeadLag

    Hi All,

    I am using the expression below

    @"^((\((&?\d+(\.\d*)?)([+-/*](&?\d+(\.\d*)?))*\)|((&?\d+(\.\d*)?)([+-/*](&?\d+(\.\d*)?))*))+)([+-/*]((\((&?\d+(\.\d*)?)([+-/*](&?\d+(\.\d*)?))*\)|((&?\d+(\.\d*)?)([+-/*](&?\d+(\.\d*)?))*))+))+$"

    and it matches strings like

    LEADLAG(24,+@2,25)
    LEADLAG(24,-@2,25)
    LEADLAG(24,+@2,0)
    LEADLAG(24,-@2,0)

    I want to extend the regular expression to cover some more strings like these:

    LEADLAG((24+25),+@2,26)
    LEADLAG((24+25),+@2,0)
    LEADLAG((24-25),+@2,26)
    LEADLAG((24-25),+@2,0)
    LEADLAG((24+25),+@2)
    LEADLAG((24+25),-@2)
    LEADLAG(SUM(24:26),+@2,27)
    LEADLAG(SUBSTRACT(24:26),+@2,27)

    Note: in place of the numbers 24, 25, 26 and 27 there might be any number of up to 4 digits.

    It would be great to have a separate regular expression for each pattern, so that they are easier to understand and integrate into the code.

    Can anybody help me create expressions for validating all of the above?

  •  08-02-2011, 7:41 PM 83623 in reply to 83622

    Re: Regular Expression for LeadLag

    For a start I cannot get your pattern to match ANY of your test strings, for several reasons:
    - there is no "@" in the pattern at all and yet there is one in each example - I suspect that your pattern needs to have all of the "&" characters replaced by "@"
    - the pattern is supposed to match the entire line (going by the '^' at the start and the '$' at the end) but it does not account for the "LEADLAG(" at the start of each example, nor the ")" at the end
    - you are not accounting for the "," between the first and second numbers

    If you use the following pattern (which I have split over several lines so I could see what it was trying to do) then you can match all but the last 2 of your examples:

    ^LEADLAG\(
    ((
        \((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*\)
        |
         ((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*)
    )+)
    (,
      [+-/*]
      ((
        \((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*\)
        |
         ((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*)
      )+)
    )+
    \)$

    with possibly the "multiline" and the "ignore case" options set (you didn't tell us what options you were using so I'm guessing both of these).

    There are some ways you might be able to improve your pattern. For a start I would strongly suggest that you write the '[+-/*]' character set as '[+/*-]' because otherwise the '-' is interpreted as the character set range operator and so, the way you have written it, the set would match all of "+", ",", "-", "." and "/" (those are all of the characters between "+" and "/" in the ANSI character table) plus "*".
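
    You can list exactly which characters each version of the set accepts. The sketch below uses Python (an assumption about the engine; the range rule is the same in .NET), and the helper name members is made up for illustration:

```python
import re

def members(char_set: str) -> list[str]:
    """Return every printable ASCII character that the character set matches."""
    return [chr(code) for code in range(32, 127)
            if re.fullmatch(char_set, chr(code))]

# "+-/" is read as a range from "+" to "/", so "," and "." sneak in.
print(members(r"[+-/*]"))  # ['*', '+', ',', '-', '.', '/']

# With "-" placed last it is a literal, and only the four operators match.
print(members(r"[+/*-]"))  # ['*', '+', '-', '/']
```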

    However, if you do this then you need to understand that the comma before the 3rd number has been matching against the '[+-/*]' in one of the last parts of the pattern. You will need to make the character set that leads the 2nd half of the pattern optional to let most of your test strings match correctly. Note also that this will lead to you needing to look into the group captures for each instance of the 2nd, 3rd, 4th etc. numbers that may be present.

    If you look at the 2 instances where you are checking the actual numbers, each is made up of 2 alternatives and the only difference is the surrounding parentheses. You can factor:

        \((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*\)
        |
         ((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*)

    to be something like:

        (\()?
           (@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*
        (?(2)\))

    so that the main part of the expression is used only once - if there is a leading open parenthesis (which would be captured into match group #2 in this example) then the last part checks that match group #2 matched something and, if it did, then requires that the closing parenthesis is also present in the string. You may not think of this as an improvement.
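
    Here is the conditional construct in isolation, as a Python sketch (Python's re module supports the same '(?(n)...)' syntax; whether your engine does is something to check). Because this snippet stands alone, the optional parenthesis is group #1 rather than #2, and the '^' and '$' anchors are added so that it can be tested on its own:

```python
import re

# Group 1 captures an optional "("; (?(1)\)) then demands ")" only if
# group 1 actually matched.
OPERAND = re.compile(
    r"^(\()?(@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*(?(1)\))$"
)

for text in ["(24+25)", "24+25", "@2", "(24+25", "24+25)"]:
    print(f"{text!r:10} {bool(OPERAND.match(text))}")
```

    The first three inputs are accepted and the two with an unpaired parenthesis are rejected.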

    If you also remove some of the unnecessary parentheses, the whole pattern would look like:

    ^LEADLAG\(
    (
        (\()?
           (@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*
        (?(2)\))
    )+
    (,
      [+/*-]?
      (
        (\()?
           (@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*
        (?(10)\))
      )+
    )+
    \)$
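
    To try the multi-line form directly, most engines have an option that ignores whitespace inside the pattern. The sketch below uses Python's re.VERBOSE for that (an assumption about your environment; in .NET the equivalent is the IgnorePatternWhitespace option) and checks the pattern against the twelve examples from your post:

```python
import re

# The pattern exactly as laid out above; re.VERBOSE ignores the layout
# whitespace, re.IGNORECASE is a guess at the options in use.
FULL = re.compile(r"""
    ^LEADLAG\(
    (
        (\()?
           (@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*
        (?(2)\))
    )+
    (,
      [+/*-]?
      (
        (\()?
           (@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*
        (?(10)\))
      )+
    )+
    \)$
""", re.VERBOSE | re.IGNORECASE)

examples = [
    "LEADLAG(24,+@2,25)", "LEADLAG(24,-@2,25)",
    "LEADLAG(24,+@2,0)", "LEADLAG(24,-@2,0)",
    "LEADLAG((24+25),+@2,26)", "LEADLAG((24+25),+@2,0)",
    "LEADLAG((24-25),+@2,26)", "LEADLAG((24-25),+@2,0)",
    "LEADLAG((24+25),+@2)", "LEADLAG((24+25),-@2)",
    "LEADLAG(SUM(24:26),+@2,27)", "LEADLAG(SUBSTRACT(24:26),+@2,27)",
]

results = [bool(FULL.match(s)) for s in examples]
for s, ok in zip(examples, results):
    print(f"{'match' if ok else 'no match':9} {s}")
```

    The first ten examples match; the two that use SUM and SUBSTRACT do not.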

    There are probably many more ways that this pattern could be simplified and also extended to cater for the last 2 examples which (spelling mistakes for SUBSTRACT aside) would seem to show that you are wanting to allow for some general equations and functions being present in the string. Before spending any time on this, I think you will need to tell us exactly what form these equations could take.

    A word of caution here: you have shown that the parameters can be surrounded by optional parentheses - if you are wanting to nest these more than 1 level deep (as shown in your examples) then you will need a completely different technique to have a regex pattern handle these (if you want to, look up the "balanced (or matching) parentheses problem" in Google or elsewhere).

    I suspect that you are bordering on attempting to use a regex pattern in an inappropriate situation. If your "expressions" can become even a little more complex than you have shown, then you are better off using a string/token parser rather than a regex pattern, as these can handle arbitrarily complex equations with relative ease and can often be much easier to write and maintain (not to mention use to extract the values in the equation) in the long run.
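
    As a rough illustration of the token-based approach, the Python sketch below splits an expression into tokens and checks that the parentheses balance at any nesting depth. It is only a first step and not a full parser: the token rules and the function names tokenize and parentheses_balanced are my own assumptions, since the real grammar of these expressions has not been specified.

```python
import re

# Hypothetical token rules: a number (optionally with a decimal part),
# a run of letters (a function name), or any other single non-space character.
TOKEN = re.compile(r"\d+(?:\.\d*)?|[A-Za-z]+|\S")

def tokenize(text: str) -> list[str]:
    """Split the input into number, name and single-symbol tokens."""
    return TOKEN.findall(text)

def parentheses_balanced(tokens: list[str]) -> bool:
    """Track nesting depth with a counter; works for any depth."""
    depth = 0
    for tok in tokens:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:  # a ")" arrived before its "("
                return False
    return depth == 0

tokens = tokenize("LEADLAG(SUM(24:26),+@2,27)")
print(tokens)
print(parentheses_balanced(tokens))                          # True
print(parentheses_balanced(tokenize("LEADLAG((24+25),+@2")))  # False
```

    Once the input is a token list, checking the order of the tokens and pulling out the numeric values becomes ordinary code rather than an ever-growing pattern.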

    Susan
