gabru:i am not sure where to put the pattern you have mentioned? i mean how do i combine it with my url-matching-pattern?
Just to answer this part of your question: I was suggesting that you put it in front of the appropriate part of your pattern.
A (non-POSIX) regex engine processes alternatives by trying them in the order they are specified, and it 'matches' the first one that succeeds (it may try later ones if it backtracks to the alternation again). Therefore a pattern such as:
\w+?(a|b|c)
will match one or more word characters up to and including the first "a", "b" or "c" that follows them. If the next character it tests is an 'a' then it will immediately consider the alternation successfully matched and carry on without checking the 'b' or 'c' possibilities.
In your case, you want to match something unless it is within "code" tags. Therefore you can use this alternation behaviour as follows (pseudo-code):
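To see the ordering for yourself, here is a small sketch using Python's re module (my choice for illustration only; any backtracking, non-POSIX engine should behave the same way). The sample strings are made up.

```python
import re

# Alternatives are tried left to right; the first one that succeeds wins,
# even if a later alternative would have matched more text.
first = re.match(r"a|ab", "ab")
print(first.group(0))    # a

swapped = re.match(r"ab|a", "ab")
print(swapped.group(0))  # ab

# The lazy \w+? grows one character at a time until the alternation
# can match, so it stops at the first "a", "b" or "c" after the start.
lazy = re.match(r"\w+?(a|b|c)", "xyzbca")
print(lazy.group(0))     # xyzb
print(lazy.group(1))     # b
```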
(match code tagged text | match something else)
You have already written the "match something else" part, and I was suggesting that you add in the "match code tagged text" part with my pattern fragment.
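As a rough sketch of how that combination might look in practice, here it is in Python. Please note the assumptions: the "<code>...</code>" tag syntax, the very simple URL pattern and the linkify function are all stand-ins I made up for illustration; they are not your actual pattern or my earlier fragment. The idea is that the first alternative swallows the code-tagged text, and only the second alternative fills capture group 1.

```python
import re

# Hypothetical patterns: first alternative = code-tagged text,
# second alternative = a deliberately simplistic URL matcher (group 1).
CODE_OR_URL = re.compile(r"<code>.*?</code>|(https?://[^\s<]+)", re.DOTALL)

def linkify(text):
    """Wrap URLs in anchor tags, leaving anything inside code tags alone."""
    def replace(match):
        url = match.group(1)
        if url is None:
            # The code-tag alternative matched: put the text back unchanged.
            return match.group(0)
        return '<a href="%s">%s</a>' % (url, url)
    return CODE_OR_URL.sub(replace, text)

sample = "see http://a.example/x and <code>http://b.example/y</code>"
result = linkify(sample)
print(result)
```

Because the engine consumes the whole code block with the first alternative, the URL inside it is never offered to the second one.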
By the way, alternation is rather different to many of the other regex operators, which generally apply to the single item immediately to their left. In other words, the '?' in the pattern 'asd2?' applies ONLY to the '2' and not the 'asd'. However, alternation applies to everything on either side of it, out to the ends of the enclosing match group (remembering that the whole pattern is effectively enclosed in matching group #0). Therefore the pattern 'abc|def' will match the text "abc" OR "def". That is why I put parentheses around the pseudo-code above: to make sure that the alternation is limited to the part I'm interested in.
(BTW, I specified non-POSIX regex engines above because POSIX regex engines effectively test all possible paths, since they return the longest of all valid matches; non-POSIX engines normally return the first valid match they find.)
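A quick way to check the scope of each operator, again sketched in Python (the test strings are just examples I picked):

```python
import re

# '?' binds only to the single item on its left, here the '2'.
optional_two = re.fullmatch(r"asd2?", "asd") is not None
with_two = re.fullmatch(r"asd2?", "asd2") is not None
only_two = re.fullmatch(r"asd2?", "2") is not None

# '|' splits the whole pattern: it means "abc" OR "def".
whole_abc = re.fullmatch(r"abc|def", "abc") is not None
whole_mixed = re.fullmatch(r"abc|def", "abdef") is not None

# Parentheses limit the alternation to just the 'c' or 'd' in the middle.
grouped_mixed = re.fullmatch(r"ab(c|d)ef", "abdef") is not None
grouped_abc = re.fullmatch(r"ab(c|d)ef", "abc") is not None

print(optional_two, with_two, only_two)  # True True False
print(whole_abc, whole_mixed)            # True False
print(grouped_mixed, grouped_abc)        # True False
```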
Make sense?
Susan