Named group concatenation

Last post 08-19-2008, 3:40 PM by ddrudik. 3 replies.
  •  08-19-2008, 3:03 AM 45419

    Named group concatenation

    I'm a little naive about regexes, so this might not be completely coherent.

    I have an input string containing keywords (<KEY>/) and values (mostly terminated by a period). Some keywords are optional. I have cookbooked a series of regexes using named groups (see below); run individually, each one finds its value and assigns it to the group successfully. However, I'm looking for a way to combine them into a single regex that runs against the input once in a .NET application and yields a collection of match groups, each holding its value. Neither concatenating the regexes nor joining them with the OR operator ("|") seems to work.

     Input Sample (linefeeds included):

    TO:      CRZ-01491287 20061118 14:39:45 1C6216C157
    FROM: A36MPQ41-04610686 20061118 14:39:44 1C6205C461









    TXT                                                                 
    LIC/NBC123. LIY/07. LIT/PC.                                                  
    NAM/AMBERG,AMANDA SUE.*RECORD DISSEMINATION RESTRICTED*                     
    SNM/56528 ROCK  POINT  ROAD. CTY/PACOMIA. STA/MN. ZIP/90457.            
    VIN/2T1CE22P63C014569. VYR/03. VMA/TOYT.                                     
    VMO/CAMRY SOLARA SE,COUPE                                                    
    EXM/AUG. DOB/19590515. STICKER:402208.

    Regexes:

    (?<License>(?<=(?:LIC/))\w{5,6}(?=(?:\s|\Z|\.)))
    (?<LicYr>(?<=(?:LIY/)).*?.(?=(?:\Z|\.)))
    (?<Name>(?<=(?:NAM/)).*?.(?=(?:\Z|\.)))
    (?<Address>(?<=(?:SNM/)).*?.(?=(?:\Z|\.)))
    (?<City>(?<=(?:CTY?/)).*?.(?=(?:\Z|\.)))
    (?<State>(?<=(?:STA?/)).*?.(?=(?:\Z|\.)))
    (?<Zip>(?<=(?:Zip?/)).*?.(?=(?:\Z|\.)))
    (?<Year>(?<=(?:VYR?/)).*?.(?=(?:\Z|\.)))
    (?<Make>(?<=(?:VMA?/)).*?.(?=(?:\Z|\.)))
    (?<Model>(?<=(?:VMO?/)).*?.(?=(?:\Z|\.|\s{2})))
    (?<Dob>(?<=(?:DOB?/)).*?.(?=(?:\Z|\.)))
    (?<Sex>(?<=(?:SEX?/)).*?.(?=(?:\Z|\.)))
    (?<Hgt>(?<=(?:HGT?/)).*?.(?=(?:\Z|\.)))
    (?<Wgt>(?<=(?:WGT?/)).*?.(?=(?:\Z|\.)))
    (?<Eye>(?<=(?:EYE?/)).*?.(?=(?:\Z|\.)))
    (?<Oln>(?<=(?:OLN?/)).*?.(?=(?:\Z|\.)))


  •  08-19-2008, 8:31 AM 45429 in reply to 45419

    Re: Named group concatenation

    It should work if you concatenate them with OR (|):

    (?x)

    (?<License>(?<=(?:LIC/))\w{5,6}(?=(?:\s|\Z|\.))) |
    (?<LicYr>(?<=(?:LIY/)).*?.(?=(?:\Z|\.))) |
    (?<Name>(?<=(?:NAM/)).*?.(?=(?:\Z|\.)))  |

    ..........
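
To illustrate what alternation does here, the following is a minimal sketch in Python rather than .NET, so the group syntax is `(?P<name>...)` instead of `(?<name>...)`. It uses a three-field subset of the pattern and a shortened sample string, both chosen only for illustration. Each alternative produces its own match, and within any one match only the group for that alternative holds a value.

```python
import re

# Three-field subset of the thread's pattern, ported to Python syntax.
# The subset and the shortened sample are assumptions for illustration.
pattern = re.compile(
    r"""
      (?P<License>(?<=LIC/)\w{5,6}(?=\s|\Z|\.))
    | (?P<LicYr>(?<=LIY/).*?.(?=\Z|\.))
    | (?P<Name>(?<=NAM/).*?.(?=\Z|\.))
    """,
    re.VERBOSE,
)

sample = (
    "LIC/NBC123. LIY/07. LIT/PC.\n"
    "NAM/AMBERG,AMANDA SUE.*RECORD DISSEMINATION RESTRICTED*"
)

# One match per field; groups that did not take part are None.
results = []
for m in pattern.finditer(sample):
    populated = {k: v for k, v in m.groupdict().items() if v is not None}
    results.append(populated)
    print(populated)
# {'License': 'NBC123'}
# {'LicYr': '07'}
# {'Name': 'AMBERG,AMANDA SUE'}
```

So the combined pattern does not return one match with every group filled in; it returns a sequence of matches that each carry a single field.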


    http://portal-vreme.ro
  •  08-19-2008, 2:42 PM 45444 in reply to 45429

    Re: Named group concatenation

    Thanks - this sort of works; however, I get a lot of matches in the match collection, of which only one is a real match. I can hunt for it, but it's fairly inefficient:

    using System;
    using System.IO;
    using System.Text.RegularExpressions;

    // The class and Main wrappers are placeholders; the original excerpt omitted them.
    class Program
    {
        static string path = "Storage Card";

        public static Regex rx = new Regex(
            "(?<License>(?<=(?:LIC/))\\w{5,6}(?=(?:\\s|\\Z|\\.)))|\r\n(?<Li" +
            "cYr>(?<=(?:LIY/)).*?.(?=(?:\\Z|\\.)))|\r\n(?<Name>(?<=(?:NAM/)" +
            ").*?.(?=(?:\\Z|\\.)))|\r\n(?<Address>(?<=(?:SNM/)).*?.(?=(?:\\Z" +
            "|\\.)))|\r\n(?<City>(?<=(?:CTY?/)).*?.(?=(?:\\Z|\\.)))|\r\n(?<St" +
            "ate>(?<=(?:STA?/)).*?.(?=(?:\\Z|\\.)))|\r\n(?<Zip>(?<=(?:Zip?/" +
            ")).*?.(?=(?:\\Z|\\.)))|\r\n(?<Make>(?<=(?:VMO?/)).*?.(?=(?:\\Z" +
            "|\\.|\\s{2})))|\r\n(?<Year>(?<=(?:VYR?/)).*?.(?=(?:\\Z|\\.)))|" +
            "\r\n(?<Dob>(?<=(?:DOB?/)).*?.(?=(?:\\Z|\\.)))|\r\n(?<Sex>(?<=(?:" +
            "SEX?/)).*?.(?=(?:\\Z|\\.)))|\r\n(?<Hgt>(?<=(?:HGT?/)).*?.(?=(?" +
            ":\\Z|\\.)))|\r\n(?<Wgt>(?<=(?:WGT?/)).*?.(?=(?:\\Z|\\.)))|\r\n(?" +
            "<Eye>(?<=(?:EYE?/)).*?.(?=(?:\\Z|\\.)))|\r\n(?<Oln>(?<=(?:OLN?" +
            "/)).*?.(?=(?:\\Z|\\.)))",
            RegexOptions.IgnoreCase
            | RegexOptions.CultureInvariant
            | RegexOptions.IgnorePatternWhitespace
            | RegexOptions.Compiled
        );

        static void Main()
        {
            foreach (string file in Directory.GetFiles(path))
            {
                // Read the whole file into a string
                StreamReader stream = File.OpenText(file);
                string text = stream.ReadToEnd();
                stream.Close();

                // Find matches
                MatchCollection matches = rx.Matches(text);

                string[] groupNames = rx.GetGroupNames();

                // Find the named group with a non-empty value in the match collection
                foreach (string name in groupNames)
                {
                    // Ignore the default group
                    if (name == "0")
                        continue;
                    foreach (Match match in matches)
                    {
                        if (match.Groups[name].Value != String.Empty)
                        {
                            string value = match.Groups[name].Value;
                            // Do something with the value here.....
                        }
                    }
                }
            }
        }
    }
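
One way to avoid hunting through every match for every group name is to walk the matches once and ask each match which group it filled. The sketch below shows that idea in Python, not .NET, using `Match.lastgroup`, which reports the name of the last capturing group that matched. The four-field pattern, the `extract_fields` helper, and the shortened sample are assumptions made for illustration.

```python
import re

# Subset of the thread's pattern in Python syntax; Sex is included to show
# that an optional keyword missing from the input simply yields no entry.
pattern = re.compile(
    r"""
      (?P<License>(?<=LIC/)\w{5,6}(?=\s|\Z|\.))
    | (?P<City>(?<=CTY/).*?.(?=\Z|\.))
    | (?P<Dob>(?<=DOB/).*?.(?=\Z|\.))
    | (?P<Sex>(?<=SEX/).*?.(?=\Z|\.))
    """,
    re.VERBOSE,
)

def extract_fields(text):
    """Collect one value per named group in a single pass over the matches."""
    fields = {}
    for m in pattern.finditer(text):
        # Each match comes from exactly one alternative, so lastgroup names it.
        # setdefault keeps the first value if a keyword appears more than once.
        fields.setdefault(m.lastgroup, m.group())
    return fields

sample = (
    "LIC/NBC123. LIY/07.\n"
    "SNM/56528 ROCK  POINT  ROAD. CTY/PACOMIA. STA/MN.\n"
    "EXM/AUG. DOB/19590515. STICKER:402208."
)

fields = extract_fields(sample)
print(fields)
# {'License': 'NBC123', 'City': 'PACOMIA', 'Dob': '19590515'}
```

This relies on the named groups containing no nested capturing groups, which holds for the patterns in this thread because their lookarounds use only non-capturing groups.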

  •  08-19-2008, 3:40 PM 45447 in reply to 45444

    Re: Named group concatenation

    Here's how I would do this in PHP (if my simplified rules match the data):

    <?php
    $string='TO:      CRZ-01491287 20061118 14:39:45 1C6216C157
    FROM: A36MPQ41-04610686 20061118 14:39:44 1C6205C461

     

     

     

     

    TXT                                                                 
    LIC/NBC123. LIY/07. LIT/PC.                                                  
    NAM/AMBERG,AMANDA SUE.*RECORD DISSEMINATION RESTRICTED*                     
    SNM/56528 ROCK  POINT  ROAD. CTY/PACOMIA. STA/MN. ZIP/90457.            
    VIN/2T1CE22P63C014569. VYR/03. VMA/TOYT.                                     
    VMO/CAMRY SOLARA SE,COUPE                                                    
    EXM/AUG. DOB/19590515. STICKER:402208. ';

    //create the $matches array
    preg_match_all('#(\w+)/([^.\r]+)#',$string,$matches);

    //create result array from capture groups
    $result = array_combine($matches[1],$matches[2]);

    //print result array
    echo "<pre>".print_r($result,true);
    ?>

    Here's the result:

    Array
    (
        [LIC] => NBC123
        [LIY] => 07
        [LIT] => PC
        [NAM] => AMBERG,AMANDA SUE
        [SNM] => 56528 ROCK  POINT  ROAD
        [CTY] => PACOMIA
        [STA] => MN
        [ZIP] => 90457
        [VIN] => 2T1CE22P63C014569
        [VYR] => 03
        [VMA] => TOYT
        [VMO] => CAMRY SOLARA SE,COUPE                                                     
        [EXM] => AUG
        [DOB] => 19590515
    )
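
For comparison, here is a rough Python rendering of the same key/value idea. It is a sketch, not a port of the PHP above: the character class also excludes `\n` (the PHP pattern excludes only `\r`), values are stripped of trailing whitespace, and the sample is shortened to a few lines.

```python
import re

# Shortened sample; the VMO line has no terminating period, only padding.
sample = (
    "LIC/NBC123. LIY/07. LIT/PC.\n"
    "NAM/AMBERG,AMANDA SUE.*RECORD DISSEMINATION RESTRICTED*\n"
    "VMO/CAMRY SOLARA SE,COUPE      \n"
    "EXM/AUG. DOB/19590515. STICKER:402208.\n"
)

# Capture KEY/value pairs: a word, a slash, then everything up to the next
# period or line break.
pairs = re.findall(r"(\w+)/([^.\r\n]+)", sample)

# Build a dictionary keyed by the keyword, trimming padding from values.
result = {key: value.strip() for key, value in pairs}
print(result)
# {'LIC': 'NBC123', 'LIY': '07', 'LIT': 'PC', 'NAM': 'AMBERG,AMANDA SUE',
#  'VMO': 'CAMRY SOLARA SE,COUPE', 'EXM': 'AUG', 'DOB': '19590515'}
```

As in the PHP result, `STICKER:402208` is not captured because it uses a colon rather than a slash.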
    
