
replace space with underscore in url

Last post 12-02-2008, 2:59 AM by prometheuzz. 8 replies.
  •  10-15-2008, 11:37 AM 47233

    replace space with underscore in url

    hello everyone,

    I have some links that contain bookmarks to other pages, and the bookmarks contain spaces, which is not allowed in XHTML 1.0. For example: http://somewebsite.com/index.htm#table of content

     

    I have been able to match the part after # in links like the following:

    <a href="http://somewebsite.com/index.htm#table of content">Table of content</a> There are many of them, so it would take me some time to fix them all manually. The regular expression I use is:

    #([^>]*)

    Now that I have matched the part after #, how do I search and replace within it?

     

    Any help is appreciated.

    Thanks

  •  10-15-2008, 12:29 PM 47236 in reply to 47233

    Re: replace space with underscore in url

    Jagarm:
    ...

    Now that I have matched the part after #, how do I search and replace within it?

    ...

    What do you mean by that? Can you provide some (some == plural!) examples?

  •  10-15-2008, 1:56 PM 47239 in reply to 47233

    Re: replace space with underscore in url

    Sounds like you want to use groups. http://regexadvice.com/blogs/mash/archive/2007/06/01/You_2700_ve-got-your-sub_2D00_matches-in-my-matches.aspx

    You didn't state what you want to replace it with so how you apply the replace is TBD. 


    Michael

    "In theory, theory and practice are the same. In practice, they are not."
    Albert Einstein
  •  10-16-2008, 11:32 AM 47278 in reply to 47236

    Re: replace space with underscore in url

    Let's say I have a link as follows: <a href="index.htm#second paragraph">Statement</a>

    Note that the part after the # contains a space, which is not allowed in XHTML. So I want to replace all the spaces with _.

    Everything is done now. I had to do it manually, since not all of them contained only spaces; some of them had hyphens too.
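For the fragment case described above, one way to "search and replace within" a captured group is a replacement callback. The following is only a minimal Python sketch under a few assumptions: the `href` value is wrapped in double quotes, the names `HREF_FRAGMENT`, `fix_fragment` and `underscore_fragments` are made up for illustration, and only spaces (not hyphens) are rewritten.

```python
import re

# Group 1: 'href="' plus everything up to and including the '#'.
# Group 2: the fragment (everything after '#' up to the closing quote).
# Group 3: the closing double quote.
HREF_FRAGMENT = re.compile(r'(href="[^"#]*#)([^"]*)(")')

def fix_fragment(match):
    """Rebuild the match, replacing spaces only inside the fragment group."""
    prefix, fragment, closing = match.groups()
    return prefix + fragment.replace(" ", "_") + closing

def underscore_fragments(html):
    """Apply the fragment fix to every matching href in the text."""
    return HREF_FRAGMENT.sub(fix_fragment, html)

print(underscore_fragments('<a href="index.htm#second paragraph">Statement</a>'))
```

Links without a `#` are left untouched, because the pattern requires one inside the quoted value.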

     

    Thanks for the replies though

  •  11-24-2008, 2:13 PM 48778 in reply to 47278

    Re: replace space with underscore in url

    I am looking for the same solution, e.g.:

    Find:

    blah, blah blah<a href="http://this is my space.com">My Space</a>blah, blah

    Replace as:

    blah, blah blah<a href="http://this_is_my_space.com">My Space</a>blah, blah

    using Dreamweaver or BBEdit, not a programming language.

    I read that you can't do it because it requires a recursive search, as there's no known number of space characters in the URL.

    I tested this in Expresso with just the <a> tags:

    (?<=<a href=")?([^(<a )"]\w+)(\s)

    Find:

    <a href="Using the Rectangle and Rectangle Primitive tools.mov">

    Result:

    <a href="Using_the_Rectangle_and_Rectangle_Primitive_tools.mov">

    but it also converted "blah blah" (the content outside the <a> tag) into "blah_blah":

    blah blah <a href="Using the Rectangle and Rectangle Primitive tools.mov">blah blah

    blah_blah_<a href="Using_the_Rectangle_and_Rectangle_Primitive_tools.mov">blah_blah

    Any help will be appreciated.

     

    Jay

  •  11-24-2008, 2:24 PM 48780 in reply to 48778

    Re: replace space with underscore in url

    Pattern:

    \x20(?=[^"]*"\s*>)

    replacement:

    _
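The thread is about editor find-and-replace, but the same pattern can be tried out in a script. This is a hedged sketch in Python; the function name `underscore_attr_spaces` is invented here, and the sample string is the one from the earlier post.

```python
import re

# A space (hex 20) is matched only if, looking ahead, there are zero or more
# non-quote characters, then a double quote, optional whitespace, and '>'.
PATTERN = re.compile(r'\x20(?=[^"]*"\s*>)')

def underscore_attr_spaces(html):
    """Replace spaces inside the last quoted attribute value of a tag."""
    return PATTERN.sub("_", html)

sample = 'blah blah <a href="Using the Rectangle and Rectangle Primitive tools.mov">blah blah'
print(underscore_attr_spaces(sample))
# blah blah <a href="Using_the_Rectangle_and_Rectangle_Primitive_tools.mov">blah blah
```

Note that the lookahead keys on the quote that closes the tag, so it only touches the last attribute. In `<a href="a b" class="x y">` it would rewrite the `class` value and leave the `href` alone.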

  •  12-01-2008, 5:39 PM 49037 in reply to 48780

    Re: replace space with underscore in url

    It works in DW! Thank you very much. Expresso explains it as follows:

     ///  A description of the regular expression:
    /// 
    ///  Hex 20
    ///  Match a suffix but exclude it from the capture. [[^"]*"\s*>]
    ///      [^"]*"\s*>
    ///          Any character that is NOT in this class: ["], any number of repetitions
    ///          "
    ///          Whitespace, any number of repetitions
    ///          >

    But I still don't quite understand how it works.

    For instance, what does \x20 do? How does the expression restrict the search to the text between " and "> as in <a href="This is my url."> and not search outside the <>? Very interesting!

  •  12-01-2008, 10:27 PM 49045 in reply to 49037

    Re: replace space with underscore in url

    After staring at the expression for a while, I finally got it.

    The trick is using \x20 (Hex literal for space) in the find vs. \s* (also white space) in the lookahead pattern.

    Will \s(?=[^"]*"\x20*>) do the same thing?

    It does!

     

    Thanks again.

  •  12-02-2008, 2:59 AM 49053 in reply to 49045

    Re: replace space with underscore in url

    jaychow:

    After staring at the expression for a while, I finally got it.

    The trick is using \x20 (Hex literal for space) in the find vs. \s* (also white space) in the lookahead pattern.

    In plain English, the regex

    \x20(?=[^"]*"\s*>)

    would read as: "match any single space character that is followed by exactly one double quote and then the tag's closing '>'. Between the space and that double quote there can be zero or more characters of any type except double quotes, and between the double quote and the '>' there can be zero or more white space characters".

    jaychow:

    Will \s(?=[^"]*"\x20*>) do the same thing?

    ...

    \s will also work, but it will cause all white space characters (space, tab, new line, ...) to be replaced, not just the space character.
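The difference between the two variants only shows up when the attribute value contains white space other than a plain space. A small Python sketch, with a made-up sample link containing a tab (the names `SPACE_ONLY` and `ANY_WHITESPACE` are just labels for this example):

```python
import re

# Original pattern: matches only the space character (hex 20).
SPACE_ONLY = re.compile(r'\x20(?=[^"]*"\s*>)')
# jaychow's variant: matches any white space character.
ANY_WHITESPACE = re.compile(r'\s(?=[^"]*"\x20*>)')

# Hypothetical link whose href contains a tab and a space.
sample = '<a href="my\tfirst page.htm">'

space_only_result = SPACE_ONLY.sub("_", sample)
any_whitespace_result = ANY_WHITESPACE.sub("_", sample)

print(repr(space_only_result))      # the tab is left alone
print(repr(any_whitespace_result))  # tab and space both become '_'
```

In both cases the space between `<a` and `href` is untouched, because the first double quote after it is not followed by `>`.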
