For a start I cannot get your pattern to match ANY of your test strings for several reasons:
- there is no "@" in the pattern at all and yet there is one in each example - I suspect that your pattern needs to have all of the "&" characters replaced by "@"
- the pattern is supposed to match the entire line (going by the '^' at the start and the '$' at the end) but it does not account for the "LEADLAG(" at the start of each example, nor the ")" at the end
- you are not accounting for the "," between the first and second numbers
If you use the following pattern (which I have split over several lines so I could see what it was trying to do) then you can match all but the last 2 of your examples:
^LEADLAG\(
((
\((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*\)
|
((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*)
)+)
(,
[+-/*]
((
\((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*\)
|
((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*)
)+)
)+
\)$
with possibly the "multiline" and the "ignore case" options set (you didn't tell us what options you were using so I'm guessing both of these).
There are some ways you might be able to improve your pattern. For a start I would strongly suggest that you write the '[+-/*]' character set as '[+/*-]' because otherwise the '-' is interpreted as the character set range operator and so, the way you have written it, the set would match all of "+", ",", "-", "." and "/" (those are all of the characters from "+" to "/" in the ASCII character table) plus "*".
However, if you do this then you need to understand that the comma before the 3rd number has been matching against the '[+-/*]' in one of the last parts of the pattern. You will need to make the character set that leads the 2nd half of the pattern optional to let most of your test strings match correctly. Note also that this will lead to you needing to look into the group captures for each instance of the 2nd, 3rd, 4th etc numbers that may be present.
If you look at the 2 instances where you are checking the actual numbers, each is made up of 2 alternatives and the only difference is the surrounding parentheses. You can factor:
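The range behaviour described above is easy to check. This is a minimal sketch in Python (an assumption on my part, since you did not say which regex engine you are using); the sample string is made up purely to contain every character in question:

```python
import re

# Hypothetical sample containing each character of interest exactly once.
sample = "+,-./*"

# '-' between '+' and '/' is read as a range: it covers + , - . / (plus '*').
range_set = re.findall(r"[+-/*]", sample)

# With '-' placed last it is a literal, so only the four operators match.
literal_set = re.findall(r"[+/*-]", sample)

print(range_set)    # every character in the sample
print(literal_set)  # only the arithmetic operators
```

Note how the first set happily accepts the comma and the period, which is why the comma in your test strings appeared to "work".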
\((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*\)
|
((@?\d+(\.\d*)?)([+-/*](@?\d+(\.\d*)?))*)
to be something like:
(\()?
(@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*
(?(2)\))
so that the main part of the expression is used only once. If there is a leading open parenthesis (which is captured into match group #2 once this fragment sits inside the full pattern; on its own it would be group #1) then the last part checks whether that group matched something and, if it did, requires that the closing parenthesis is also present in the string. You may not think of this as an improvement, though.
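To see the conditional in isolation, here is a small sketch, again assuming Python (whose `re` module supports the `(?(n)...)` construct). Used on its own, the optional parenthesis is group 1, so the conditional refers to group 1 here rather than group 2. The test strings are invented for illustration:

```python
import re

# Standalone operand: optional "(", the expression, then ")" only if "(" matched.
OPERAND = re.compile(
    r"^(\()?(@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*(?(1)\))$"
)

def is_operand(text):
    """Return True if text is a bare or fully parenthesised operand."""
    return OPERAND.match(text) is not None

for candidate in ["(@1+2)", "@1+2", "(@1+2", "@1+2)"]:
    print(candidate, is_operand(candidate))
```

The two unbalanced strings are rejected because the closing parenthesis is demanded exactly when the opening one was seen.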
If you also remove some of the unnecessary parentheses, the whole pattern would look like:
^LEADLAG\(
(
(\()?
(@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*
(?(2)\))
)+
(,
[+/*-]?
(
(\()?
(@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*
(?(10)\))
)+
)+
\)$
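If you want to try the pattern above as written, a sketch like the following may help. It assumes Python and uses the "verbose" option so the pattern can stay split over several lines; the sample strings are my own guesses at the general shape of your data, not your actual examples:

```python
import re

LEADLAG = re.compile(r"""
    ^LEADLAG\(
    (
      (\()?                  # group 2: optional open parenthesis
      (@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*
      (?(2)\))               # close parenthesis required if group 2 matched
    )+
    (,
      [+/*-]?                # optional sign after the comma
      (
        (\()?                # group 10: optional open parenthesis
        (@?\d+(\.\d*)?)([+/*-](@?\d+(\.\d*)?))*
        (?(10)\))
      )+
    )+
    \)$
""", re.VERBOSE | re.IGNORECASE)

def matches(text):
    """Return True if the whole line fits the LEADLAG pattern."""
    return LEADLAG.match(text) is not None

# Hypothetical inputs, made up for this sketch.
for line in ["LEADLAG(@1,2)", "LEADLAG((@1+2.5),-3)", "LEADLAG(@1,2,3)",
             "LEADLAG((@1+2,3)", "LEADLAG(((@1+2)),3)"]:
    print(line, matches(line))
```

The last sample is rejected because the pattern only allows a single level of parentheses around each number.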
There are probably many more ways that this pattern could be simplified and also extended to cater for the last 2 examples which (spelling mistakes for SUBSTRACT aside) would seem to show that you are wanting to allow for some general equations and functions being present in the string. Before spending any time on this, I think you will need to tell us exactly what form these equations could take.
A word of caution here: you have shown that the parameters can be surrounded by optional parentheses - if you are wanting to nest these more than 1 level deep (as shown in your examples) then you will need a completely different technique to have a regex pattern handle these (if you want to, look up the "balanced (or matching) parentheses problem" in Google or elsewhere).
I suspect that you are bordering on attempting to use a regex pattern in an inappropriate situation. If your "expressions" can become even a little more complex than you have shown, then you are better off using a string/token parser rather than a regex pattern as these can handle arbitrarily complex equations with relative ease, and can often be much easier to write and maintain (not to mention use to extract the values in the equation) in the long run.
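To give an idea of what I mean by a token parser, here is a rough sketch in Python. The grammar is entirely my assumption, guessed from the patterns above (numbers with an optional "@", function names followed by a parenthesised argument list, the four operators, and an optional leading sign); you would need to adjust it once you know exactly what forms your expressions can take. The names `tokenize`, `Parser` and `is_valid` are just ones I made up.

```python
import re

# Assumed token shapes: number, name, operator, punctuation.
TOKEN = re.compile(r"\s*(?:(@?\d+(?:\.\d*)?)|([A-Za-z_]\w*)|([+*/-])|([(),]))")

def tokenize(text):
    """Split text into (kind, value) pairs or raise ValueError."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at position {pos}")
        kind = ("num", "name", "op", "punct")[m.lastindex - 1]
        tokens.append((kind, m.group(m.lastindex)))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else (None, None)

    def take(self, value=None):
        kind, val = self.peek()
        if kind is None or (value is not None and val != value):
            raise ValueError(f"expected {value or 'a token'}, got {val!r}")
        self.i += 1
        return kind, val

    def expr(self):
        # expr := [sign] term (op term)*
        if self.peek() in (("op", "+"), ("op", "-")):
            self.take()
        self.term()
        while self.peek()[0] == "op":
            self.take()
            self.term()

    def term(self):
        # term := number | "(" expr ")" | name "(" expr ("," expr)* ")"
        kind, val = self.take()
        if kind == "num":
            return
        if kind == "punct" and val == "(":
            self.expr()          # recursion handles any nesting depth
            self.take(")")
            return
        if kind == "name":
            self.take("(")
            self.expr()
            while self.peek() == ("punct", ","):
                self.take()
                self.expr()
            self.take(")")
            return
        raise ValueError(f"unexpected token {val!r}")

def is_valid(text):
    """Return True if text parses completely under the assumed grammar."""
    try:
        parser = Parser(tokenize(text))
        parser.expr()
        return parser.i == len(parser.tokens)
    except ValueError:
        return False
```

Because `term` calls back into `expr`, parentheses and function calls can nest to any depth, which is exactly the case the regex approach struggles with. The same structure can be extended to return the parsed values instead of just validating the string.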
Susan