
Remove HTML Attributes and multiple spaces

Last post 11-19-2008, 10:55 PM by Uball. 3 replies.
  •  11-19-2008, 10:25 AM 48508

    Remove HTML Attributes and multiple spaces

    When I copy an Excel table into Dreamweaver, it is automatically converted to an HTML table, but the source code looks something like this:

          <td width="200">    abcd     xyz   </td>

    The first time, I used find and replace with this pattern to fix it:

    find:<td\b[^>]*>(.*?)</td>
    replace:<td>$1</td>

    After that, the code looks like this:

         <td>    abcd     xyz   </td>

    The second time, I used this pattern:

    find: {2,}\b
    replace: (only 1 space)

    This time the result is:

         <td> abcd xyz </td>

    This is very close to what I need; only the space after > and before < remains.
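For readers who want to try the two passes outside Dreamweaver, here is a minimal sketch in Python's `re` module. The function name `two_step_clean` is made up for illustration, and `$1` is written as `\1` because that is Python's replacement syntax. Under Python's rules the `\b` in the second pattern needs a word character right after the spaces, so the run before `</td>` is not collapsed; Dreamweaver's engine may behave differently.

```python
import re

def two_step_clean(html: str) -> str:
    """Apply the two find/replace passes from the post, one after the other."""
    # Pass 1: drop every attribute from the opening <td ...> tag.
    without_attrs = re.sub(r"<td\b[^>]*>(.*?)</td>", r"<td>\1</td>", html)
    # Pass 2: collapse a run of 2+ spaces that is followed by a word character.
    return re.sub(r" {2,}\b", " ", without_attrs)

sample = '<td width="200">    abcd     xyz   </td>'
print(two_step_clean(sample))
```

Neither pass removes the whitespace next to the tags, which is the gap the question is about.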

    OK, here are my questions:

    1. Can I remove the unwanted spaces?
    2. Is it possible to do everything I described in one step?

    Thanks in advance!!!

     

     

  •  11-19-2008, 7:06 PM 48535 in reply to 48508

    Re: Remove HTML Attributes and multiple spaces

    I have no idea if this will work in Dreamweaver, but try:

    (<td\b)[^>]*(>)\s*|\s+(<)|(\s)\s+

    with a replacement string of

    $1$2$3$4

    With suitable translation into the correct syntax, this will turn the single example you have provided into what you have requested.

    Note that the order of the last two alternatives in the pattern is important.

    The first part flips your pattern on its head in that it captures the beginning and end of the opening 'td' tag, which are included in the output without the intervening parts. It also leaves the closing 'td' tag alone. The second alternative looks for one or more spaces followed by a '<', of which only the '<' is included in the resulting text. The last part looks for a space followed by multiple other spaces and includes only the first one. (BTW - 'space' here actually means all whitespace characters)
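The behaviour described above can be checked with a short Python sketch. This is only an illustration under Python's `re` rules, not a statement about Dreamweaver's engine; the helper names `keep_groups` and `clean_cell` are made up here. The `$1$2$3$4` replacement is emulated by joining the captured groups, with groups that did not take part in the match contributing nothing.

```python
import re

# The pattern from the post: opening td tag | whitespace before '<' | whitespace run.
PATTERN = re.compile(r"(<td\b)[^>]*(>)\s*|\s+(<)|(\s)\s+")

# Same pattern with the last two alternatives swapped, to show why order matters.
SWAPPED = re.compile(r"(<td\b)[^>]*(>)\s*|(\s)\s+|\s+(<)")

def keep_groups(match: re.Match) -> str:
    """Emulate "$1$2$3$4": concatenate the groups, treating unmatched ones as empty."""
    return "".join(group or "" for group in match.groups())

def clean_cell(html: str, pattern: re.Pattern = PATTERN) -> str:
    return pattern.sub(keep_groups, html)

sample = '<td width="200">    abcd     xyz   </td>'
print(clean_cell(sample))           # <td>abcd xyz</td>
print(clean_cell(sample, SWAPPED))  # <td>abcd xyz </td>
```

With the alternatives swapped, the whitespace run before `</td>` is claimed by the "collapse to one space" branch before the "whitespace followed by '<'" branch gets a chance, so one space is left behind.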

    Susan

  •  11-19-2008, 9:49 PM 48543 in reply to 48535

    Re: Remove HTML Attributes and multiple spaces

    Thanks for your reply, it works in Dreamweaver! There is just one problem when I click Find Next. I made some edits to your regex, like this:

    find:(<td\b)[^>]*(>)\s*|\s+(</)
    replace:$1$2$3
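A small Python sketch of this edited pattern, again only as an illustration under Python's `re` rules (the name `clean_cell_variant` is made up). Because the second alternative now requires `</`, whitespace in front of an opening tag, such as line indentation, is left alone; and because the third alternative was dropped, runs of spaces inside the cell are no longer collapsed.

```python
import re

# Edited pattern from the post: opening td tag | whitespace before a closing tag.
VARIANT = re.compile(r"(<td\b)[^>]*(>)\s*|\s+(</)")

def clean_cell_variant(html: str) -> str:
    # Emulates the "$1$2$3" replacement; unmatched groups contribute "".
    return VARIANT.sub(lambda m: "".join(g or "" for g in m.groups()), html)

print(clean_cell_variant('<td width="200">    abcd     xyz   </td>'))
# <td>abcd     xyz</td>
print(clean_cell_variant('    <td width="200"> a </td>'))
#     <td>a</td>
```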

    Thanks!

  •  11-19-2008, 10:55 PM 48546 in reply to 48535

    Re: Remove HTML Attributes and multiple spaces

    Susan,

    After trying many tests, I found your solution is the best. I just added \b to the third alternative so that it skips the indentation at the start of the line.

    (<td\b)[^>]*(>)\s*|\s+(<)|\b(\s)\s+
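The effect of the added `\b` can be seen in a short Python sketch (an illustration under Python's `re` rules only; the name `clean_final` is made up). The word boundary means the third alternative fires only when the whitespace run follows a word character, so indentation at the start of a line is no longer collapsed by it.

```python
import re

# Final pattern from the post, with \b in front of the third alternative.
FINAL = re.compile(r"(<td\b)[^>]*(>)\s*|\s+(<)|\b(\s)\s+")

def clean_final(text: str) -> str:
    # Emulates the "$1$2$3$4" replacement; unmatched groups contribute "".
    return FINAL.sub(lambda m: "".join(g or "" for g in m.groups()), text)

print(clean_final('<td width="200">    abcd     xyz   </td>'))
# <td>abcd xyz</td>
print(clean_final('    abcd     xyz'))
#     abcd xyz
```

In this sketch, indentation that is immediately followed by a `<` is still consumed by the second alternative, so the `\b` only protects indentation in front of text.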

    Many Thanks!!!
