I am using C# and .NET.
I have a text string from which I need to extract several fields, separated by one or more spaces.
Any field may optionally be enclosed in quotation marks.
Here is the line with the fields:
#VER A 365 20080722 "This \"is\" some text" 20090614
So far I can match this much:
#VER A 365 20080722
using this expression:
^\s*\#VER\s+("?)(?<type>\w+)\1\s+("?)(?<number>\d+)\2\s+("?)(?<date>\d{8})\3\s+
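For reference, this is roughly how the expression is applied to the sample line in C#. It is a minimal sketch; the class and method names (VerPrefixDemo, MatchPrefix) are only illustrative and not part of any existing code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VerPrefixDemo
{
    // The expression from the question, unchanged. In .NET, unnamed groups
    // are numbered before named ones, so \1, \2 and \3 refer to the three
    // optional-quote groups ("?).
    public const string Pattern =
        @"^\s*\#VER\s+(""?)(?<type>\w+)\1\s+(""?)(?<number>\d+)\2\s+(""?)(?<date>\d{8})\3\s+";

    // Hypothetical helper: matches only the leading fields of a line.
    public static Match MatchPrefix(string line)
    {
        return Regex.Match(line, Pattern);
    }

    public static void Main()
    {
        // Sample line from the question, written as a verbatim string.
        string line = @"#VER A 365 20080722 ""This \""is\"" some text"" 20090614";

        Match m = MatchPrefix(line);
        Console.WriteLine(m.Groups["type"].Value);    // A
        Console.WriteLine(m.Groups["number"].Value);  // 365
        Console.WriteLine(m.Groups["date"].Value);    // 20080722
    }
}
```

The match stops right before the text field, which is the part that is still missing.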
The problem is the text part. This field allows spaces if it is enclosed in quotation marks; otherwise spaces are not allowed.
The field also allows a quotation mark to be part of the text if it is preceded by a backslash (\").
Text field examples (raw field -> extracted text):
"This \"is\" some text" -> This "is" some text
"JustAWord" -> JustAWord
\"JustAWord\" -> "JustAWord"
The text allows all characters except [\x00-\x1F\x7F].
So [^\x00-\x1F\x7F] can be used for text inside quotation marks, and [^\x00-\x1F\x7F\s] for text without quotation marks.
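To make the rules above concrete, here is an untested direction rather than a known-good solution: a sketch of a pattern for the text field alone, built from the two character classes plus an alternative for the escaped quote. The names TextFieldDemo, TextField and ExtractText are hypothetical, and the sketch assumes that \" is the only escape sequence (how a literal backslash before a closing quote should behave is not specified).

```csharp
using System;
using System.Text.RegularExpressions;

public static class TextFieldDemo
{
    // Either a quoted field (spaces allowed) or an unquoted one (no whitespace).
    // In both branches \" is tried first, then any other allowed character
    // except a bare quotation mark. .NET permits the same group name in both
    // branches; only the branch that matched fills it.
    public const string TextField =
        @"(?:""(?<text>(?:\\""|[^\x00-\x1F\x7F""])*)""|(?<text>(?:\\""|[^\x00-\x1F\x7F\s""])+))";

    // Returns the unescaped text, or null if the field does not fit the rules.
    public static string ExtractText(string field)
    {
        Match m = Regex.Match(field, "^" + TextField + "$");
        if (!m.Success)
        {
            return null;
        }
        // Turn every \" into a plain quotation mark.
        return m.Groups["text"].Value.Replace("\\\"", "\"");
    }

    public static void Main()
    {
        Console.WriteLine(ExtractText(@"""This \""is\"" some text"""));
        Console.WriteLine(ExtractText(@"""JustAWord"""));
        Console.WriteLine(ExtractText(@"\""JustAWord\"""));
    }
}
```

If something like this holds up, the TextField fragment could presumably be appended to the expression I already have, followed by \s+ and the final date field.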
How do I extract this text?
Any help is very much appreciated!