parsing out the domain in an href tag

  •  07-02-2008, 7:16 PM

    hello,

     

    i would like to parse a domain name out of the href attribute of an <a> tag. the href value could be anything, since i am parsing through xml that could hold any kind of url, depending on who is entering the data on the other end.

     

    what i am doing is parsing out the domain name from a string such as this:

    <a href="http://www.wwf.org" target="_blank">Learn more at www.wwf.org</a>

    the part i'm trying to pull out is the first 'wwf' (the one inside the href). it's important that it's the domain name in the actual href attribute, and not the display text, as the two could be different (i know, sounds weird, but it's how it is for this project).

     

    i am doing this in flash, using actionscript 3.

    this is the code block (with the regex) i'm using right now:

    ***

    var toParse:String = "<a href=\"http://www.wwf.org\" target=\"_blank\">Learn more at www.wwf.org</a>";

    var pattern1:RegExp = /"http:\/\/[A-Za-z0-9\.\-]"/;
    var index1:Number = toParse.search(pattern1);

    var pattern2:RegExp = /"\.{com|net|org|edu|gov|mil|co\.}+"/;
    var index2:Number = toParse.search(pattern2);

    var domainName:String = toParse.substring(index1, index2);  // this should be 'wwf'

    ***

    so i have 2 issues

    1.  pattern1 returns -1, so it isn't matching http://www (i want it to search for potentially anything that might come before the domain name though, just in case)

    2.  the regex in var pattern2 works great as long as the domain name i'm trying to isolate doesn't itself contain com, net, org, etc. if it does, the result (domainName) comes out wrong. e.g. if the domain is www.ninemillion.org, i get 'nin' returned instead of 'ninemillion', because the 'mil' inside 'ninemillion' is matched first.

     


    i tried looking through the regex library but i wasn't able to find anything that matched what i needed. i also tried going through other reference sites. i haven't worked much with regular expressions before so it feels a bit dizzying... any help is appreciated :)

     

    sarah 
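
    for reference, here is a minimal sketch of one way the two issues above might be approached. it is written in typescript rather than actionscript 3, on the assumption that the regex literals carry over since both follow ecmascript regex syntax. the function name domainLabelFromHref is hypothetical, and the suffix list is just the one from pattern2 (with 'co.' read as 'co.' plus a two-letter country code, which is an assumption). the idea is to capture the href value first so the display text is never looked at, then anchor the suffix at the end of the host so 'mil' inside 'ninemillion' can't match.

```typescript
// Hypothetical helper: returns the label just before a known suffix in the
// href of the first <a> tag found, or null if nothing usable is found.
function domainLabelFromHref(html: string): string | null {
  // 1. Capture only the href value (single- or double-quoted).
  //    The link's display text is never inspected.
  const hrefMatch = /href\s*=\s*["']([^"']*)["']/i.exec(html);
  if (hrefMatch === null) return null;

  // 2. Capture the host: skip an optional "scheme://" prefix,
  //    then stop at the first '/', ':', '?' or '#'.
  const hostMatch = /^(?:[a-z][a-z0-9+.-]*:\/\/)?([^\/:?#]+)/i.exec(hrefMatch[1]);
  if (hostMatch === null) return null;

  // 3. Capture the label directly before a suffix that must sit at the END
  //    of the host ($), so a suffix appearing mid-name is ignored.
  //    Assumption: the suffix list mirrors the one in the original pattern2.
  const labelMatch =
    /([a-z0-9-]+)\.(?:com|net|org|edu|gov|mil|co\.[a-z]{2})$/i.exec(hostMatch[1]);
  return labelMatch === null ? null : labelMatch[1];
}

const sample =
  '<a href="http://www.wwf.org" target="_blank">Learn more at www.wwf.org</a>';
console.log(domainLabelFromHref(sample)); // wwf

const tricky =
  '<a href="http://www.ninemillion.org">Learn more at www.wwf.org</a>';
console.log(domainLabelFromHref(tricky)); // ninemillion
```

    a fixed suffix list like this will miss any domain ending it doesn't name, so it is only a starting point. the three-step split is a design choice to keep each pattern small enough to check by eye.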
