Need help with java regex pattern

Last post 02-16-2010, 5:13 PM by Aussie Susan. 3 replies.
  •  02-15-2010, 8:16 PM 59732

    Need help with java regex pattern

    Hi guys,
     
    I'm still having trouble getting a regular expression that works properly for me. Take a look at the sample data below. I need a regular expression that matches from a heading such as ==Operating systems== through everything below it, including subsections whose headings use more than two = symbols and any lines that contain a single equals sign. Basically, I need to match all data starting at ==Title== and ending just before the next instance of ==Title2==, where the title in == Title == can be anything at all. Please help me figure this one out; I've been stuck on this for days. I am using Java, by the way.
    Well, my sample data won't paste in the right format, but you can see it at http://pastebin.com/m14152614
     If you don't want to look at that, it is in this format:
     ==Title==
    anything goes here
    ===subtitle===
    more data
    ====further subtitle====
    2+2=5 
    ==Next Title== 
    next body 
     
     

  •  02-15-2010, 8:57 PM 59733 in reply to 59732

    Re: Need help with java regex pattern

    Try:

    ==\s(.*?)\s==\r?\n(((?!^==\s).)*)

    with the 'singleline' and 'multiline' options set.
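For Java specifically, here is a minimal sketch of how the pattern and the two options might be applied. It assumes that 'singleline' corresponds to Java's `Pattern.DOTALL` and 'multiline' to `Pattern.MULTILINE`, and that the input uses "\n" or "\r\n" line endings with a space on each side of the title text. The class and method names (`SectionSplitter`, `split`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class SectionSplitter {
    // The pattern from the post, with backslashes doubled for a Java string literal.
    // DOTALL lets '.' match line terminators; MULTILINE lets '^' match at line starts.
    static final Pattern SECTION = Pattern.compile(
            "==\\s(.*?)\\s==\\r?\\n(((?!^==\\s).)*)",
            Pattern.DOTALL | Pattern.MULTILINE);

    // Returns one {title, body} pair per top-level "== Title ==" section.
    static List<String[]> split(String text) {
        List<String[]> sections = new ArrayList<>();
        Matcher m = SECTION.matcher(text);
        while (m.find()) {
            // group(1) is the title text, group(2) is everything up to the next title.
            sections.add(new String[] { m.group(1), m.group(2) });
        }
        return sections;
    }
}
```

Calling `SectionSplitter.split(text)` on the sample layout should give one entry per `== Title ==` heading, with the `===` and `====` subsections left inside the body of their parent section.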

    I assume that the version in "pastebin" (with spaces between the equal signs and the "title" text) is correct rather than in your note (where the spaces are absent).

    Depending on your platform, there may or may not be a carriage return before the newline character at the end of each line. If there is not, you can remove the '\r?' from the pattern (although it doesn't hurt to leave it in).

    Also this assumes that there is a single space between the equal signs and the words of the text, and that the end of the title is a space followed by 2 equal signs at the end of the line with a newline. I think I have interpreted your requirements correctly, but this should get you started anyway.

    This will stop the main matching when it finds 2 equal signs followed by a space at the start of a line even if it is not really the start of a new title (e.g. "== This is a silly line" will stop it even though there are no equal signs after the text). You can extend the termination lookahead if this is a problem.
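One possible way to extend the termination lookahead, sketched in Java: require the rest of the candidate line to end with a space and two equal signs before treating it as a new title. This is only an illustration, not a tested fix for the pastebin data. The extra `[^\r\n]*\s==$` part and the class name `StrictSectionSplitter` are assumptions added here; the rest of the pattern is unchanged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class StrictSectionSplitter {
    // Same pattern as before, except the lookahead now demands a complete title line:
    // "== ", then any characters on the same line, then " ==" at the end of that line.
    // [^\r\n] is used instead of '.' because DOTALL would let '.' run across lines.
    static final Pattern SECTION = Pattern.compile(
            "==\\s(.*?)\\s==\\r?\\n(((?!^==\\s[^\\r\\n]*\\s==$).)*)",
            Pattern.DOTALL | Pattern.MULTILINE);

    // Returns one {title, body} pair per top-level section.
    static List<String[]> split(String text) {
        List<String[]> sections = new ArrayList<>();
        Matcher m = SECTION.matcher(text);
        while (m.find()) {
            sections.add(new String[] { m.group(1), m.group(2) });
        }
        return sections;
    }
}
```

With this variant, a line such as "== This is a silly line" stays inside the body of the current section, because it has no closing " ==" at the end of the line.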

    Susan

  •  02-15-2010, 10:28 PM 59734 in reply to 59733

    Re: Need help with java regex pattern

    Thanks for the reply. I tried your suggestion, but it doesn't seem to match anything in the tester that I am using, http://www.gskinner.com/RegExr/. It returns 0 matches for the data I put in pastebin.
  •  02-16-2010, 5:13 PM 59782 in reply to 59734

    Re: Need help with java regex pattern

    When I cut and pasted your text from pastebin into that regex tester, the tester appeared to use just the "\r" character as the line terminator. The pattern:

    ==\s(.*?)\s==\r(((?!^==\s).)*)

    with the same options as specified before works on that site.

    The only change is the use of '\r' instead of '\n' to match (and consume) the end of the title line. This demonstrates quite nicely how you need to be aware of not only the regex variant you are using but also the platform the regex engine is running on and the way the text string is constructed.
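If the line terminator of the input is not known in advance, one option in Java is to accept any of the three common forms after the title. The sketch below is an assumption-laden illustration: it swaps the '\r' (or '\r?\n') for the alternation `(?:\r\n?|\n)`, and it relies on Java's `MULTILINE` mode treating a lone "\r" as a line terminator for '^'. The class name `AnyNewlineSectionSplitter` is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class AnyNewlineSectionSplitter {
    // (?:\r\n?|\n) accepts "\r\n", a lone "\r", or a lone "\n" after the title.
    // The terminator is matched but sits outside capture groups 1 and 2.
    static final Pattern SECTION = Pattern.compile(
            "==\\s(.*?)\\s==(?:\\r\\n?|\\n)(((?!^==\\s).)*)",
            Pattern.DOTALL | Pattern.MULTILINE);

    // Returns one {title, body} pair per top-level section.
    static List<String[]> split(String text) {
        List<String[]> sections = new ArrayList<>();
        Matcher m = SECTION.matcher(text);
        while (m.find()) {
            sections.add(new String[] { m.group(1), m.group(2) });
        }
        return sections;
    }
}
```

The body strings keep whatever line terminators the input used, so trim or normalize them afterwards if that matters.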

    Susan
