
Count items - release character in edifact

Last post 05-17-2009, 8:18 PM by Aussie Susan. 2 replies.
  •  05-17-2009, 3:45 PM 53106

    Count items - release character in edifact

    In edifact messages there are segments that are usually separated by a '. For example: RFF+MG:18'NAD+177'. The question mark (?) is usually used as the "release character". That means that if the release character precedes a separator, the separator is treated as a normal character without any function. For example, in RFF+Smith?'s' the data is read as "Smith's". A literal ? is itself written as "??".

    Problem: when a message is read from left to right, a ' character ends a segment only when an even number of ? characters precedes it. For example, ??' has to be interpreted as a literal ? followed by a ' that ends the segment; ???' has to be interpreted as a literal ? followed by a literal ', so the segment does not end there.
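
    The rule above can be sketched as a plain left-to-right scan. This is only an illustration in Python (an assumption, since no language is named here); the function name split_segments and its parameters are hypothetical, and it handles only the segment terminator and the release character, not the rest of the edifact syntax.

```python
def split_segments(message, terminator="'", release="?"):
    """Split a message into segments, honouring the release character.

    Hypothetical helper: escaped pairs are kept as written (not unescaped).
    """
    segments = []
    current = []
    i = 0
    while i < len(message):
        ch = message[i]
        if ch == release and i + 1 < len(message):
            # The release character and the character after it belong
            # together, so the next character never ends the segment.
            current.append(ch)
            current.append(message[i + 1])
            i += 2
        elif ch == terminator:
            # An unreleased terminator closes the current segment.
            segments.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if current:
        # Trailing text without a terminator is returned as a last segment.
        segments.append("".join(current))
    return segments


print(split_segments("RFF+MG:18'NAD+177'"))  # ['RFF+MG:18', 'NAD+177']
print(split_segments("A??'B???'C'"))         # ['A??', "B???'C"]
```

    Because the scan consumes "?" together with the character that follows it, the odd/even question never has to be asked explicitly: after an even run of ? the ' is reached on its own, after an odd run it is swallowed as part of a pair.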

     Is it possible with a regex to split the segments in edifact according to the above rule without using further utilities?  Is there any way to detect whether there is an odd or even count of ? preceding the '?

     

    Thanks

    Kurt

  •  05-17-2009, 4:43 PM 53110 in reply to 53106

    Re: Count items - release character in edifact

    You failed to mention what language/application you plan to use this regex with, as requested in the posting guidelines.

    If I understand your problem correctly, which I'm not sure I completely do, the pattern

    \?\W

    would match the release character and the escaped character in your sample text.  I've assumed, based on your sample, that this escaping is only used for non-alphanumerics.

     


    Michael

    "In theory, theory and practice are the same. In practice, they are not."
    Albert Einstein
  •  05-17-2009, 8:18 PM 53118 in reply to 53106

    Re: Count items - release character in edifact

    Having been there and got the scars, please DON'T try to use a regex to parse edifact messages. In general, they are not "regular" in the sense used in "regular expression". If nothing else, the majority of regex variants will only return you the LAST instance of text matched in a match group where typically you want ALL instances.
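
    As a small illustration of the match group point, here is a sketch in Python (an assumption; the thread does not name a language). The sample string and both patterns are made up for the example.

```python
import re

sample = "RFF+MG+18"

# A capturing group inside a repetition keeps only its last iteration.
repeated = re.fullmatch(r"(?:\+?([^+]+))+", sample)
print(repeated.group(1))  # 18

# Getting every element needs a separate pass, for example findall.
print(re.findall(r"[^+]+", sample))  # ['RFF', 'MG', '18']
```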

    I have written a parser that accepts an edifact message and provides the details of each message - it is MUCH easier to use, manage and maintain.

    (The code I wrote uses an old and fairly non-standard language. I would strongly suggest that you use YACC or LEX etc. to create a suitable parser. You will be better off in the long run.)

    As for the question about detecting odd or even counts of a character, something like (untested)

    (xx)*(x)?

    will let you know if there is an odd or even number of "x" characters. The repeated first group consumes the characters in pairs, and the optional second group matches only if one character is left over, which makes the overall count odd. However, this also matches when there are 0 instances, which can be worked around if necessary. There are probably many other ways...
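
    Applied to the original question, the same idea might look like the sketch below, again in Python as an assumption. The name classify_quotes is hypothetical, and the lookbehind is an addition not in the pattern above: it stops a match from starting in the middle of a run of ?, so the whole run is counted.

```python
import re

# Pairs of "?" first, then an optional single "?", then the quote.
# (?<!\?) forces the match to begin at the start of the run of "?".
RUN_BEFORE_QUOTE = re.compile(r"(?<!\?)((?:\?\?)*)(\?)?'")


def classify_quotes(text):
    """Return (index_of_quote, is_terminator) for each ' in text."""
    results = []
    for m in RUN_BEFORE_QUOTE.finditer(text):
        odd = m.group(2) is not None  # leftover "?" means an odd count
        results.append((m.end() - 1, not odd))
    return results


print(classify_quotes("A??'B???'C'"))
# [(3, True), (8, False), (10, True)]
```

    Here the second group is what answers the odd/even question: if it took part in the match, the ' is released and does not end the segment.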

    Susan
