Iterating and extracting text between two fixed strings

Last post 07-06-2009, 10:33 AM by ddrudik. 5 replies.
  •  07-02-2009, 11:24 AM 54507

    Iterating and extracting text between two fixed strings

    Hello guys,

    I'm new here and I need your help. I want to iterate over and extract the text between two fixed strings, for example:

    fstring

    text1...

    text2...

    fstring

    text3...

    text4...

    fstring

    Now I want to iterate over the occurrences of the fixed string "fstring" and get each group of lines in turn:

       1          2          and so on...

    text1...   text3...

    text2...   text4...

    Can you help me?

    Thanks

  •  07-02-2009, 11:53 AM 54510 in reply to 54507

    Re: Iterating and extracting text between two fixed strings

    What platform?
  •  07-02-2009, 6:05 PM 54523 in reply to 54510

    Re: Iterating and extracting text between two fixed strings

    .NET, C# 2005 (Framework 2.0)

     

    However, the most important thing for me is how to build the regex. I thought of something like fstring([A-zaz0-9_: \n\r]*)fstring, so that I can capture what I need in the parentheses. The problem is that it matches not only "fstring ... fstring" but also "fstring ... fstring ... fstring", so I get output like:

    text1...

    text2...

    fstring

    text3...

    text4...

    which is not what I want. I then added "?" to make the quantifier lazy:

                                           fstring([A-zaz0-9_: \n\r]*?)fstring

    but it doesn't work.
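
One likely reason the lazy version still falls short is that the closing "fstring" is consumed by each match, so it cannot also open the next one. A minimal sketch of the effect and of a lookahead-based alternative follows, written in Python for brevity. The sample text, its "\n" line endings, and the use of `.` with the dot-all flag in place of the character class are assumptions, not part of the original post. The `(?=...)` lookahead and lazy `*?` behave the same way in .NET, where the dot-all flag is called RegexOptions.Singleline. Note also that the posted character class contains no ".", so it cannot match lines such as "text1..." as written.

```python
import re

# Sample input modelled on the example in the question (assumed layout:
# one item per line, "\n" line endings).
text = "fstring\ntext1...\ntext2...\nfstring\ntext3...\ntext4...\nfstring\n"

# Lazy match that consumes the closing delimiter: the second "fstring" is
# used up by the first match, so the second block is never found.
consuming = re.findall(r"fstring\n(.*?)fstring", text, re.DOTALL)

# Lazy match with a lookahead: the closing "fstring" is checked but not
# consumed, so it can also open the next match.
lookahead = re.findall(r"fstring\n(.*?)(?=fstring)", text, re.DOTALL)

# Split each captured block into its individual lines.
blocks = [block.splitlines() for block in lookahead]

print(consuming)  # ['text1...\ntext2...\n']
print(blocks)     # [['text1...', 'text2...'], ['text3...', 'text4...']]
```

The lookahead version returns one block per pair of adjacent "fstring" markers, which is the grouping asked for in the first post.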

  •  07-02-2009, 7:46 PM 54526 in reply to 54523

    Re: Iterating and extracting text between two fixed strings

    Firstly, Doug's question about the platform should not be dismissed ("However the most important thing for me is to build a regex") as it is really central to our being able to answer your question. Different regex variants that run on different platforms each have different capabilities and ways to approach the same problem.

    Because you are using the .NET regex, you can use something like

    ^fstring\r\n((?!^fstring)([^\r\n]*)\r\n)+

    with the 'multiline' option set (and possibly the 'ignore case' option, if that is applicable). Depending on any processing you did on the text before applying the regex, the '\r's in the pattern may not be needed; they were needed in the .NET-based regex tester I used with your example text, which I simply cut and pasted.

    This assumes that the "fstring" text is always at the start of a line and that there is always at least one line after it. Further, it assumes that each line is to be captured in its entirety.

    Each match will begin with the "fstring" text and will match everything up to but not including the next "fstring". If you look at the captures of match group #2 you will find each of the lines that follow the matched "fstring".

    Susan 
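
For readers who want to try the approach from the post above, here is a rough sketch in Python. It is an adaptation, not Susan's exact pattern: Python keeps only the last iteration of a repeated capturing group and has no equivalent of .NET's per-group Captures collection, so the repeated part is wrapped in a single capturing group and the block is split into lines afterwards. The sample text and its "\r\n" line endings are assumptions.

```python
import re

# Assumed input with Windows line endings, as in the tester described above.
text = (
    "fstring\r\ntext1...\r\ntext2...\r\n"
    "fstring\r\ntext3...\r\ntext4...\r\n"
    "fstring\r\n"
)

# Same idea as the posted pattern: after a line consisting of "fstring",
# take one or more whole lines that do not themselves start with "fstring".
# Group 1 holds all lines of the block; the inner group is non-capturing.
pattern = re.compile(
    r"^fstring\r\n((?:(?!^fstring)[^\r\n]*\r\n)+)",
    re.MULTILINE,
)

# One list of lines per block.
blocks = [m.group(1).splitlines() for m in pattern.finditer(text)]

print(blocks)  # [['text1...', 'text2...'], ['text3...', 'text4...']]
```

The final "fstring" produces no match because no line follows it, which is consistent with the stated assumption that there is always at least one line after each marker.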

  •  07-03-2009, 5:26 AM 54531 in reply to 54526

    Re: Iterating and extracting text between two fixed strings

    Thanks, but it doesn't work...

    I solved the problem with this regex: ^fstring\n([A-Za-z0-9_]*)\n with the multiline option enabled. It also works without the multiline option if you omit the ^ character.

    Just replace "fstring" with anything you want and it should work.

    It seemed hard to me, but it turned out to be very simple. Thanks to all.

  •  07-06-2009, 10:33 AM 54759 in reply to 54531

    Re: Iterating and extracting text between two fixed strings

    Joseph wrote:

        Thanks, but it doesn't work...

        I solved the problem with this regex: ^fstring\n([A-Za-z0-9_]*)\n with the multiline option enabled. It also works without the multiline option if you omit the ^ character.

        Just replace "fstring" with anything you want and it should work.

        It seemed hard to me, but it turned out to be very simple. Thanks to all.

    Made-up examples are a problem for those answering questions here: your pattern fails to match the example you provided in a number of ways.
    Also, your question seemed to be about matching all the lines between "fstring" lines, but your pattern matches only the first line that follows, and only if that line contains no characters other than "A-Za-z0-9_".
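
Both points can be checked with a short sketch, written in Python for brevity. The sample text and its "\n" line endings are assumptions based on the example in the first post; the pattern is the one quoted above, and it uses no constructs that differ between Python and .NET.

```python
import re

# Sample input modelled on the example in the first post (assumed layout).
text = "fstring\ntext1...\ntext2...\nfstring\ntext3...\ntext4...\nfstring\n"

pattern = re.compile(r"^fstring\n([A-Za-z0-9_]*)\n", re.MULTILINE)

# Against the example as posted nothing matches: "." is outside the class,
# so the line after "fstring" can never be matched up to its newline.
as_posted = pattern.findall(text)

# With the dots removed, only the first line after each "fstring" is
# captured; the second line of every block is silently skipped.
without_dots = pattern.findall(text.replace(".", ""))

print(as_posted)     # []
print(without_dots)  # ['text1', 'text3']
```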

