kminev: This is how far I got, but it keeps blowing up.
StringBuilder regExB = new StringBuilder();
//regExB.append("^(?([^\s]*?)\\s+)");
regExB.append("(?<time>[^\s]*?)\s+");
regExB.append(".*?OrderStatus\(Routed=\(\w+:\w+\s+\w\s+\w\s+");
regExB.append("(?<autosomething>\w+)\s+\d+\s+");
regExB.append("(?<cmesomething>[A-Z-])\s+\w+\w+");
regExB.append("(?<another1>\w+)\s+");
regExB.append("(?<another2>\d{4})\s+");
regExB.append(".*?");
regExB.append("Latency=(?<latency>\d+)\)$";
regExB.append("*");")");
It sort of worked with C#, but when I plug it into Java, no luck at all.
First, regex support isn't universal across languages: you can't always test with one language and apply the result in another. That's why the posting guidelines ask which language you are using. Java (before Java 7) doesn't support named groups the way .NET does. If you are writing a regex to use in Java, find a Java regex tester. On those older versions you'll have to access the groups by their ordinal positions.
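As a rough illustration of ordinal access (not the full pattern for this log format), the sketch below pulls three fields out of a log-style line by position. The class name, method name, and shortened sample line are made up for the example. Note that every regex backslash is doubled inside a Java string literal, which is one reason a pattern pasted straight from C# or a regex tester will not compile in Java.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OrdinalGroupsSketch {
    // In Java source the regex \s is written "\\s": the compiler consumes
    // one level of backslashes before the regex engine sees the pattern.
    static final Pattern LINE =
            Pattern.compile("^(\\S+)\\s+(\\S+).*?Latency=(\\d+)");

    // Returns {date, time, latency}, or null if the line does not match.
    static String[] dateTimeLatency(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        // group(0) is the whole match; capturing groups are numbered from 1.
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        // Shortened stand-in for a log line from this thread.
        String line = "17.11.2008 00:00:03.513 | ORDERSERVER/PROD | OrderStatus(Routed=(sts:8202) Latency=16";
        String[] f = dateTimeLatency(line);
        System.out.println(f[0] + " / " + f[1] + " / " + f[2]);
    }
}
```

On Java 7 and later the same groups could instead be written as `(?<name>...)` and read with `m.group("name")`; the numbered form works on every version.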
Here is a pattern you can try. I didn't have time to write a pattern that validates the data (i.e. that something in date format really is a date), but that may be more than you need. If so, it can always be added later. I basically matched what you had highlighted based on its position in the line. You could do stricter matching.
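If the stricter validation is wanted later, one option is to leave the regex loose and check the captured date text afterwards. The sketch below assumes the log dates are day.month.year (the sample value 17.11.2008 suggests so) and uses java.time, which needs Java 8 or later; the class and method names are made up for the example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

class DateCheckSketch {
    // Assumed layout: day.month.year, e.g. 14.11.2008. STRICT rejects
    // impossible dates such as 31.02.2008 instead of adjusting them.
    static final DateTimeFormatter LOG_DATE =
            DateTimeFormatter.ofPattern("dd.MM.uuuu")
                             .withResolverStyle(ResolverStyle.STRICT);

    // True when the text parses as a real calendar date in the assumed layout.
    static boolean isRealDate(String text) {
        try {
            LocalDate.parse(text, LOG_DATE);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRealDate("14.11.2008"));
        System.out.println(isRealDate("31.02.2008"));
    }
}
```

Keeping the check outside the pattern keeps the regex readable; a regex that rejects 31 February by itself gets long quickly.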
Raw Match Pattern:
(?=.+?OrderStatus\(Routed=)((?:\d\d\.){2}\d{4})\s+((?:\d\d:){2}\d\d\.\d+).+?OrderStatus\(Routed=\S+\x20[A-Z]\x20[A-Z]\x20[A-Z]\x20([A-Z\x20]+)\x20\d+\x20(CME-[A-Z])\s\S+\s(\w{2}\x20\d+).+?(TTORD\S+).+?((?:\d{1,3}\.){3}\d{1,3})\x20exchange_order_id\x20(\w+).+?Latency=(\d+)
Match Pattern Explanation:
The regular expression:
(?im-sx:(?=.+?OrderStatus\(Routed=)((?:\d\d\.){2}\d{4})\s+((?:\d\d:){2}\d\d\.\d+).+?OrderStatus\(Routed=\S+\x20[A-Z]\x20[A-Z]\x20[A-Z]\x20([A-Z\x20]+)\x20\d+\x20(CME-[A-Z])\s\S+\s(\w{2}\x20\d+).+?(TTORD\S+).+?((?:\d{1,3}\.){3}\d{1,3})\x20exchange_order_id\x20(\w+).+?Latency=(\d+))
matches as follows:
NODE EXPLANATION
----------------------------------------------------------------------
(?im-sx: group, but do not capture (case-insensitive)
(with ^ and $ matching start and end of
line) (with . not matching \n) (matching
whitespace and # normally):
----------------------------------------------------------------------
(?= look ahead to see if there is:
----------------------------------------------------------------------
.+? any character except \n (1 or more times
(matching the least amount possible))
----------------------------------------------------------------------
OrderStatus 'OrderStatus'
----------------------------------------------------------------------
\( '('
----------------------------------------------------------------------
Routed= 'Routed='
----------------------------------------------------------------------
) end of look-ahead
----------------------------------------------------------------------
( group and capture to \1:
----------------------------------------------------------------------
(?: group, but do not capture (2 times):
----------------------------------------------------------------------
\d digits (0-9)
----------------------------------------------------------------------
\d digits (0-9)
----------------------------------------------------------------------
\. '.'
----------------------------------------------------------------------
){2} end of grouping
----------------------------------------------------------------------
\d{4} digits (0-9) (4 times)
----------------------------------------------------------------------
) end of \1
----------------------------------------------------------------------
\s+ whitespace (\n, \r, \t, \f, and " ") (1 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
( group and capture to \2:
----------------------------------------------------------------------
(?: group, but do not capture (2 times):
----------------------------------------------------------------------
\d digits (0-9)
----------------------------------------------------------------------
\d digits (0-9)
----------------------------------------------------------------------
: ':'
----------------------------------------------------------------------
){2} end of grouping
----------------------------------------------------------------------
\d digits (0-9)
----------------------------------------------------------------------
\d digits (0-9)
----------------------------------------------------------------------
\. '.'
----------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
----------------------------------------------------------------------
) end of \2
----------------------------------------------------------------------
.+? any character except \n (1 or more times
(matching the least amount possible))
----------------------------------------------------------------------
OrderStatus 'OrderStatus'
----------------------------------------------------------------------
\( '('
----------------------------------------------------------------------
Routed= 'Routed='
----------------------------------------------------------------------
\S+ non-whitespace (all but \n, \r, \t, \f,
and " ") (1 or more times (matching the
most amount possible))
----------------------------------------------------------------------
\x20 character 32
----------------------------------------------------------------------
[A-Z] any character of: 'A' to 'Z'
----------------------------------------------------------------------
\x20 character 32
----------------------------------------------------------------------
[A-Z] any character of: 'A' to 'Z'
----------------------------------------------------------------------
\x20 character 32
----------------------------------------------------------------------
[A-Z] any character of: 'A' to 'Z'
----------------------------------------------------------------------
\x20 character 32
----------------------------------------------------------------------
( group and capture to \3:
----------------------------------------------------------------------
[A-Z\x20]+ any character of: 'A' to 'Z', '\x20' (1
or more times (matching the most amount
possible))
----------------------------------------------------------------------
) end of \3
----------------------------------------------------------------------
\x20 character 32
----------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
----------------------------------------------------------------------
\x20 character 32
----------------------------------------------------------------------
( group and capture to \4:
----------------------------------------------------------------------
CME- 'CME-'
----------------------------------------------------------------------
[A-Z] any character of: 'A' to 'Z'
----------------------------------------------------------------------
) end of \4
----------------------------------------------------------------------
\s whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
\S+ non-whitespace (all but \n, \r, \t, \f,
and " ") (1 or more times (matching the
most amount possible))
----------------------------------------------------------------------
\s whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
( group and capture to \5:
----------------------------------------------------------------------
\w{2} word characters (a-z, A-Z, 0-9, _) (2
times)
----------------------------------------------------------------------
\x20 character 32
----------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
----------------------------------------------------------------------
) end of \5
----------------------------------------------------------------------
.+? any character except \n (1 or more times
(matching the least amount possible))
----------------------------------------------------------------------
( group and capture to \6:
----------------------------------------------------------------------
TTORD 'TTORD'
----------------------------------------------------------------------
\S+ non-whitespace (all but \n, \r, \t, \f,
and " ") (1 or more times (matching the
most amount possible))
----------------------------------------------------------------------
) end of \6
----------------------------------------------------------------------
.+? any character except \n (1 or more times
(matching the least amount possible))
----------------------------------------------------------------------
( group and capture to \7:
----------------------------------------------------------------------
(?: group, but do not capture (3 times):
----------------------------------------------------------------------
\d{1,3} digits (0-9) (between 1 and 3 times
(matching the most amount possible))
----------------------------------------------------------------------
\. '.'
----------------------------------------------------------------------
){3} end of grouping
----------------------------------------------------------------------
\d{1,3} digits (0-9) (between 1 and 3 times
(matching the most amount possible))
----------------------------------------------------------------------
) end of \7
----------------------------------------------------------------------
\x20 character 32
----------------------------------------------------------------------
exchange_order_id 'exchange_order_id'
----------------------------------------------------------------------
\x20 character 32
----------------------------------------------------------------------
( group and capture to \8:
----------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
) end of \8
----------------------------------------------------------------------
.+? any character except \n (1 or more times
(matching the least amount possible))
----------------------------------------------------------------------
Latency= 'Latency='
----------------------------------------------------------------------
( group and capture to \9:
----------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
----------------------------------------------------------------------
) end of \9
----------------------------------------------------------------------
) end of grouping
----------------------------------------------------------------------
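The `(?im-sx:` prefix in the explanation is an inline flag group: i and m switched on, s and x switched off. In Java the same two options can be written inline or passed to Pattern.compile. A small sketch with a deliberately simplified pattern (class name, method name, and sample text are made up for the example):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagsSketch {
    // Flags written inside the pattern: i = case-insensitive, m = multiline.
    static final Pattern INLINE = Pattern.compile("(?im)^latency=(\\d+)$");

    // The same flags passed to Pattern.compile instead.
    static final Pattern WITH_FLAGS = Pattern.compile(
            "^latency=(\\d+)$", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    // Returns the first captured number, or null if nothing matches.
    static String firstLatency(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "first line\nLatency=16\nlast line";
        System.out.println(firstLatency(INLINE, text));
        System.out.println(firstLatency(WITH_FLAGS, text));
    }
}
```

Without the m flag, ^ and $ only match at the start and end of the whole input, so the middle line would not be found.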
Java Code Example:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
class Module1 {
    public static void main(String[] args) {
        // Replace with the log text to scan.
        String sourcestring = "source string to match with pattern";
        Pattern re = Pattern.compile("(?=.+?OrderStatus\\(Routed=)((?:\\d\\d\\.){2}\\d{4})\\s+((?:\\d\\d:){2}\\d\\d\\.\\d+).+?OrderStatus\\(Routed=\\S+\\x20[A-Z]\\x20[A-Z]\\x20[A-Z]\\x20([A-Z\\x20]+)\\x20\\d+\\x20(CME-[A-Z])\\s\\S+\\s(\\w{2}\\x20\\d+).+?(TTORD\\S+).+?((?:\\d{1,3}\\.){3}\\d{1,3})\\x20exchange_order_id\\x20(\\w+).+?Latency=(\\d+)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        Matcher m = re.matcher(sourcestring);
        int mIdx = 0; // index of the current match
        while (m.find()) {
            // Group 0 is the whole match; groups 1..groupCount() are the captures.
            for (int groupIdx = 0; groupIdx <= m.groupCount(); groupIdx++) {
                System.out.println("[" + mIdx + "][" + groupIdx + "] = " + m.group(groupIdx));
            }
            mIdx++;
        }
    }
}
$matches Array:
(
[0] => Array
(
[0] => 14.11.2008 00:47:46.158 | ORDERSERVER/PROD | 3180 | INFO | 00000000 | JAD714 | OrderStatus(Routed=(sts:8202 C B O Normal OS 50 CME-F FUT GE 0309 0.0 9783.5 0.0 L X: 0 W: 50 AZHK70 TTORDHKHKAN2NSO A1 GTD 06:47:46.000 No. 914979 sndr 10.131.2.123 exchange_order_id 0000JM03 sok: 0G12PA001) Latency=0
[1] => 14.11.2008 06:38:23.963 | ORDERSERVER/PROD | 3180 | INFO | 00000000 | JAD714 | OrderStatus(Routed=(sts:8202 A B O Autospreader 10 CME-F SPR GE 0000 0.0 -12.25 0.00 L X: 0 W: 10 &BENC10497 TTORDNAIXBN1BBROCHAR A1 GTD 12:38:23.000 No. 915006 sndr 10.93.81.115 exchange_order_id 0000JM0U sok: 0G2KMH007) Latency=0
[2] => 17.11.2008 00:00:03.513 | ORDERSERVER/PROD | 3988 | INFO | 00000000 | JVV714 | OrderStatus(Routed=(sts:8202 C S O Autotrader 4 CME-C FUT 6B 0309 0.0 14751 0 L X: 0 W: 4 A15085450 TTORDERERCN1EMICHAEL A1 GTD 06:00:03.000 No. 14144932 sndr 67.202.68.182 exchange_order_id 0008F6AS sok: 09093C957) Latency=16
)
[1] => Array
(
[0] => 14.11.2008
[1] => 14.11.2008
[2] => 17.11.2008
)
[2] => Array
(
[0] => 00:47:46.158
[1] => 06:38:23.963
[2] => 00:00:03.513
)
[3] => Array
(
[0] => Normal OS
[1] => Autospreader
[2] => Autotrader
)
[4] => Array
(
[0] => CME-F
[1] => CME-F
[2] => CME-C
)
[5] => Array
(
[0] => GE 0309
[1] => GE 0000
[2] => 6B 0309
)
[6] => Array
(
[0] => TTORDHKHKAN2NSO
[1] => TTORDNAIXBN1BBROCHAR
[2] => TTORDERERCN1EMICHAEL
)
[7] => Array
(
[0] => 10.131.2.123
[1] => 10.93.81.115
[2] => 67.202.68.182
)
[8] => Array
(
[0] => 0000JM03
[1] => 0000JM0U
[2] => 0008F6AS
)
[9] => Array
(
[0] => 0
[1] => 0
[2] => 16
)
)
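Since the groups come back by position, a small lookup table can stand in for the named groups from the C# version. The sketch below reuses the match pattern from this post unchanged; the field labels (date, time, source, and so on) are only guesses at what each column means, and the class and method names are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedFieldsSketch {
    // Hypothetical labels for capture groups 1..9, in order.
    static final String[] NAMES = {
        "date", "time", "source", "exchange", "contract",
        "ttord", "ip", "exchangeOrderId", "latency"
    };

    // The match pattern from this post, escaped for a Java string literal.
    static final Pattern LINE = Pattern.compile(
        "(?=.+?OrderStatus\\(Routed=)((?:\\d\\d\\.){2}\\d{4})\\s+((?:\\d\\d:){2}\\d\\d\\.\\d+)"
        + ".+?OrderStatus\\(Routed=\\S+\\x20[A-Z]\\x20[A-Z]\\x20[A-Z]\\x20([A-Z\\x20]+)\\x20\\d+\\x20"
        + "(CME-[A-Z])\\s\\S+\\s(\\w{2}\\x20\\d+).+?(TTORD\\S+).+?"
        + "((?:\\d{1,3}\\.){3}\\d{1,3})\\x20exchange_order_id\\x20(\\w+).+?Latency=(\\d+)",
        Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    // Returns label -> captured text for the first match, or an empty map.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = LINE.matcher(line);
        if (m.find()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                fields.put(NAMES[i - 1], m.group(i));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Third sample line from the results above.
        String line = "17.11.2008 00:00:03.513 | ORDERSERVER/PROD | 3988 | INFO | 00000000 | JVV714 | OrderStatus(Routed=(sts:8202 C S O Autotrader 4 CME-C FUT 6B 0309 0.0 14751 0 L X: 0 W: 4 A15085450 TTORDERERCN1EMICHAEL A1 GTD 06:00:03.000 No. 14144932 sndr 67.202.68.182 exchange_order_id 0008F6AS sok: 09093C957) Latency=16";
        for (Map.Entry<String, String> e : parse(line).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

A LinkedHashMap keeps the fields in group order, so the printout lines up with the arrays above.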
Michael
"In theory, theory and practice are the same. In practice, they are not."
Albert Einstein