Just to add my 2c here: the OP's pattern is a good example of why some people consider regexes hard to read, understand and maintain.
The trick is to know where to break up the pattern into individual tokens.
Also, while Jeff is correct about the impact of the '@' at the start of a C# string, you need to be clear that there are several levels of interpretation going on here. For example:
@"\\(?:.+)\\(.+)\.(.+)"
without the '@' would need to be written as:
"\\\\(?:.+)\\\\(.+)\\.(.+)"
The reason is that both the C# compiler and the regex pattern parser treat '\' as a special (escape) character. In a regular string literal the compiler reads each '\\' pair as a single '\', so '\\\\' reaches the regex parser as '\\'. The regex parser then applies the same rule again and reads that '\\' as one literal backslash. Likewise, '\\.' reaches the regex parser as '\.', which it reads as a literal dot.
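To see the two levels for yourself, here is a minimal sketch (the variable names are mine, purely for illustration) that compares the two literals and then hands a doubled backslash to the regex parser:

```csharp
using System;
using System.Text.RegularExpressions;

// Verbatim literal: the compiler leaves the backslashes alone.
string verbatim = @"\\(?:.+)\\(.+)\.(.+)";
// Regular literal: the compiler collapses each "\\" pair into one '\'.
string regular = "\\\\(?:.+)\\\\(.+)\\.(.+)";

// Both literals hold the same characters at run time.
Console.WriteLine(verbatim == regular);           // True
// This is the text the regex parser actually receives.
Console.WriteLine(regular);                       // \\(?:.+)\\(.+)\.(.+)
// The regex parser reads "\\" as one literal backslash.
Console.WriteLine(Regex.IsMatch(@"a\b", @"\\"));  // True
```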
Therefore the first version can be interpreted as:
- a literal backslash character
- a non-capture group (the '(?:' part) that matches one or more of any character (see point 1 below)
- another literal backslash character
- a capture group that will match 1 or more of any character
- a literal dot or full-stop character
- another capture group that will match 1 or more of any character
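One way to keep that breakdown next to the pattern itself is the RegexOptions.IgnorePatternWhitespace option, which lets you spread the tokens over several lines and comment each one. This is only a sketch of the idea, not how the OP wrote it, and the sample path is my own guess at the kind of input intended:

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern, one token per line, with '#' comments inside the pattern.
var commented = new Regex(@"
    \\        # a literal backslash
    (?:.+)    # non-capture group: one or more of any character
    \\        # another literal backslash
    (.+)      # capture group 1: one or more of any character
    \.        # a literal dot
    (.+)      # capture group 2: one or more of any character
", RegexOptions.IgnorePatternWhitespace);

Match m = commented.Match(@"\folder\file.type");
Console.WriteLine(m.Groups[1].Value);  // file
Console.WriteLine(m.Groups[2].Value);  // type
```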
Several things about this pattern need to be mentioned:
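1) firstly, the dot as 'matches any character' changes its meaning slightly depending on whether the 'single-line' option (RegexOptions.Singleline in .NET) is used. Without it, the dot matches any character except a newline; with it, the dot matches newlines as well. The other name for this option is 'dot matches newline', which better explains what it really means.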
2) this pattern will probably cause a lot of backtracking and could take some time, depending on the size and construction of the text being scanned. The ".+" part is greedy and will grab all characters up to the end of the line or text (see point 1 above). The engine then needs to match a literal backslash, so it works its way back, one character at a time, until it reaches the last backslash in the line/text (if there isn't one, there is no match). It matches that backslash, and the next ".+" again grabs everything to the end of the line/text before the engine looks for the literal dot, so it has to backtrack again until it finds one, after which the final ".+" can grab everything to the end of the line/text. If it cannot find a dot after that backslash, it goes back to the backslash it had already found and resumes backtracking until it reaches the previous backslash, where it repeats the whole process.
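To make point 1 concrete, here is a small sketch that runs the pattern over two lines of text with and without the single-line option. The sample text is made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

string pattern = @"\\(?:.+)\\(.+)\.(.+)";
// Two lines of hypothetical sample text, separated by a newline.
string text = @"\one\a.b" + "\n" + @"\two\c.d";

// Default: the dot stops at the newline, so the match stays on line 1.
Match perLine = Regex.Match(text, pattern);
// Singleline: the dot also matches the newline, so the greedy ".+"
// runs to the end of the text before it starts backtracking.
Match acrossLines = Regex.Match(text, pattern, RegexOptions.Singleline);

Console.WriteLine(perLine.Groups[1].Value);           // a
Console.WriteLine(acrossLines.Groups[1].Value);       // c
Console.WriteLine(acrossLines.Value.Contains("\n"));  // True
```

Same pattern, same text, different captures, purely because of where the greedy ".+" is allowed to stop.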
I suspect that this is looking for a file path along the lines of:
\folder\file.type
which it would match, with the first capture group receiving 'file' and the second receiving 'type'.
However, if it was given
\toplevel\folder\sub.dir\dummy
then the first capture group would get 'sub' and the second would get 'dir\dummy'.
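If you want to check this yourself, a quick sketch along these lines (variable names are mine) runs the pattern against both paths:

```csharp
using System;
using System.Text.RegularExpressions;

string pattern = @"\\(?:.+)\\(.+)\.(.+)";

// The path the pattern was presumably written for.
Match simple = Regex.Match(@"\folder\file.type", pattern);
// A deeper path with a dot in a folder name.
Match nested = Regex.Match(@"\toplevel\folder\sub.dir\dummy", pattern);

Console.WriteLine(simple.Groups[1].Value);  // file
Console.WriteLine(simple.Groups[2].Value);  // type
Console.WriteLine(nested.Groups[1].Value);  // sub
Console.WriteLine(nested.Groups[2].Value);  // dir\dummy
```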
In the right context this is fine, but it comes back to something that Mash keeps saying: "know your data". Otherwise this can blow up in your face.
Susan