Extract text field inside quotation marks

Last post 03-25-2010, 10:25 AM by MultiMarine. 0 replies.
  •  03-25-2010, 10:25 AM 62318

    I use C# and .NET.

    I have a text string from which I need to extract several fields separated by one or more spaces.
    Any field may optionally be enclosed in quotation marks.

    Here is the line with the fields:

    #VER A 365 20080722 "This \"is\" some text" 20090614

    I have gotten this far:

    #VER A 365 20080722

    using this expression:

    ^\s*\#VER\s+("?)(?<type>\w+)\1\s+("?)(?<number>\d+)\2\s+("?)(?<date>\d{8})\3\s+

    The problem is the text part. This field allows spaces only if it is enclosed in quotation marks; otherwise spaces are not allowed.
    A quotation mark can also be part of the text if it is preceded by a backslash (\").

    Text field examples:

    "This \"is\" some text" -> This "is" some text
    "JustAWord"   -> JustAWord
    \"JustAWord\"   -> "JustAWord"

    The text allows all characters except [\x00-\x1F\x7F].
    So [^\x00-\x1F\x7F] can be used for text inside quotation marks and [^\x00-\x1F\x7F\s] for text without them.

    How do I extract this text?

    Any help is very much appreciated!
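
One possible way to handle the text field is sketched below; treat it as a starting point under stated assumptions, not a definitive answer. It extends the expression from the post with two alternatives that share the group name `text` (.NET allows duplicate group names): a quoted form that permits spaces, and a bare form that does not. Both accept `\"` as an escaped quotation mark, and the escapes are removed after matching with a plain string replace. The class name `VerLineParser`, the method name `TryExtractText`, and the requirement that the field end at whitespace or end of line are assumptions made for illustration. The backreferences `\1` to `\3` continue to work because .NET numbers unnamed groups before named ones.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class VerLineParser
{
    // The expression from the post, unchanged.
    private const string Head =
        @"^\s*\#VER\s+(""?)(?<type>\w+)\1\s+(""?)(?<number>\d+)\2\s+(""?)(?<date>\d{8})\3\s+";

    // Quoted form: \" is an escaped quote, spaces are allowed.
    // A backslash counts as ordinary text only when no quote follows it.
    private const string QuotedText =
        @"""(?<text>(?:\\""|[^\x00-\x1F\x7F""\\]|\\(?!""))*)""";

    // Bare form: same rules, but whitespace ends the field.
    private const string BareText =
        @"(?<text>(?:\\""|[^\x00-\x1F\x7F\s""\\]|\\(?!""))+)";

    // Assumption: the text field must be followed by whitespace or end of line.
    private static readonly Regex Line = new Regex(
        Head + "(?:" + QuotedText + "|" + BareText + @")(?=\s|$)");

    public static bool TryExtractText(string line, out string text)
    {
        Match m = Line.Match(line);
        if (!m.Success)
        {
            text = string.Empty;
            return false;
        }
        // Turn each escaped \" back into a plain quotation mark.
        text = m.Groups["text"].Value.Replace("\\\"", "\"");
        return true;
    }
}
```

Calling `VerLineParser.TryExtractText` on the sample line from the post should yield the text with the backslashes removed, and the `type`, `number` and `date` groups remain available on the same match if the method is extended to return them. Trailing fields such as the final date are not captured by this sketch.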
