Trying to find/remove anything between (and including) keyword tags/text in HTML.

Last post 11-07-2008, 1:50 PM by NickBG. 2 replies.
  •  11-06-2008, 2:59 PM 48043

    Trying to find/remove anything between (and including) keyword tags/text in HTML.

    Goal:
    Remove the following HTML code from hundreds of pages...

         <tr>
        <td colspan="6"><hr />&nbsp;&nbsp;&nbsp;&nbsp;&raquo;<font color="#3300FF">random text will be here<br />
    <br />
    more random text may or may not be here</font></td>
      </tr>

    Method:

    I'm using a program called grepWin that lets me find and replace text, and it works very well, but I'm not as regex-literate as I'd like to be. Everything I've tried so far either fails to match or matches far too much.

    Basically, I want to tell it that any time it finds the following code...

         <tr>
        <td colspan="6"><hr />&nbsp;&nbsp;&nbsp;&nbsp;&raquo;<font color="#3300FF">

    ...to match it and everything after it until it runs into a

    </tr>

    Any help is greatly appreciated!

  •  11-06-2008, 8:41 PM 48048 in reply to 48043

    Re: Trying to find/remove anything between (and including) keyword tags/text in HTML.

    From the quick playing I've just done with grepWin, I think you can try:

    <tr>((?!</tr>).)*</tr>

    as this will find a "<tr>" tag and then match everything up to and including the next "</tr>" tag.

    You can also try

    <tr>.*?</tr>

    but (in my opinion) this is not quite as strong as my first suggestion.
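
    One possible reading of "stronger" can be sketched in Python. This is only an illustration: it assumes Python's `re` module with `re.DOTALL` so that `.` also matches line breaks, which may not correspond to grepWin's engine or settings, and the sample table and the extra `<font` text are made up for the demonstration. Used on their own, the two patterns above match the same rows. Once more text is added to the pattern, the lazy version can run past a "</tr>" into the next row, while the lookahead version cannot.

```python
import re

# Hypothetical sample: one ordinary row followed by one row to remove.
html = (
    "<table>\n"
    "<tr>\n<td>keep this row</td>\n</tr>\n"
    '<tr>\n<td colspan="6"><hr /><font color="#3300FF">random text<br />\n'
    "more text</font></td>\n</tr>\n"
    "</table>\n"
)

# The two patterns from the post, with DOTALL so '.' also matches newlines.
TEMPERED = re.compile(r"<tr>((?!</tr>).)*</tr>", re.DOTALL)
LAZY = re.compile(r"<tr>.*?</tr>", re.DOTALL)

tempered_rows = [m.group(0) for m in TEMPERED.finditer(html)]
lazy_rows = [m.group(0) for m in LAZY.finditer(html)]

# The same idea with extra text ("<font") required inside the row.
TEMPERED_FONT = re.compile(
    r"<tr>((?!</tr>).)*<font((?!</tr>).)*</tr>", re.DOTALL
)
LAZY_FONT = re.compile(r"<tr>.*?<font.*?</tr>", re.DOTALL)

# The lazy version starts at the first <tr> and stretches across its
# </tr> to reach "<font"; the lookahead version stays inside one row.
lazy_font_match = LAZY_FONT.search(html).group(0)
tempered_font_match = TEMPERED_FONT.search(html).group(0)

print("keep this row" in lazy_font_match)      # the lazy match swallowed row 1
print("keep this row" in tempered_font_match)  # the lookahead match did not
```

    In a find-and-replace run, the lazy variant in this sketch would also delete the row that should have been kept.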

    If you want to refine the search further (as your question seems to suggest), simply add the other text after the initial "<tr>" in the pattern (possibly substituting '\s+' where the line breaks occur).
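
    The refinement described above might look like the following sketch. It is written for Python's `re` module rather than grepWin, so the `re.DOTALL` flag, the helper name `strip_rows`, and the sample page are assumptions for illustration. The literal text comes from the original question, with `\s+` standing in for the line break after "<tr>".

```python
import re

# "<tr>", then the literal opening text from the question (with \s+ where
# the line break and indentation occur), then everything up to the first
# "</tr>". DOTALL lets '.' match the line breaks inside the row.
ROW = re.compile(
    r'<tr>\s+<td colspan="6"><hr />&nbsp;&nbsp;&nbsp;&nbsp;&raquo;'
    r'<font color="#3300FF">((?!</tr>).)*</tr>',
    re.DOTALL,
)


def strip_rows(html: str) -> str:
    """Return html with every matching row removed (hypothetical helper)."""
    return ROW.sub("", html)


# Made-up page: an unrelated row followed by the row to be removed.
page = (
    "<table>\n"
    "  <tr>\n"
    "    <td>unrelated row</td>\n"
    "  </tr>\n"
    "  <tr>\n"
    '    <td colspan="6"><hr />&nbsp;&nbsp;&nbsp;&nbsp;&raquo;'
    '<font color="#3300FF">random text will be here<br />\n'
    "<br />\n"
    "more random text may or may not be here</font></td>\n"
    "  </tr>\n"
    "</table>\n"
)

cleaned = strip_rows(page)
print(cleaned)
```

    Only the row that begins with the exact opening text is removed; other rows are left alone because the literal part of the pattern does not match them.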

    By the way, unless I've missed something, the help function of grepWin is terrible, and the 'F1' online help displayed from within the 'Search for' text field is misleading at best and more likely hopelessly inadequate! I have no idea which flavour of regex is used (i.e. which library: .NET, PCRE, ...). When I first looked at the display I thought that this would not be possible, but it was accepted and seems to run (at least in the 'Test Regex' window).

    Susan

  •  11-07-2008, 1:50 PM 48087 in reply to 48048

    Re: Trying to find/remove anything between (and including) keyword tags/text in HTML.

    This worked great, Susan. Thanks for your help.

    About grepWin: It's the first text replacement program I found that seemed to do what I wanted it to. I'm sure there's something better, but this (in combination with your code) works like a charm for me.

     Thanks again! :)
