
parsing out the domain in an href tag

Last post 07-03-2008, 8:07 PM by Aussie Susan. 1 replies.
  •  07-02-2008, 7:16 PM 43709

    parsing out the domain in an href tag

    hello,

     

    i would like to parse out a domain name from an a href tag. the href tag could be anything, as i am parsing through xml that could hold any kind of url in it, depending on who is entering the data on the other end.

     

    what i am doing is parsing out the domain name from a string such as this:

    <a href=\"http://www.wwf.org\" target=\"_blank\">Learn more at www.wwf.org</a> 

    the part i'm trying to strip out is the first wwf, the one inside the href value. it's important that it's the domain name in the actual href attribute, and not the display link, as the two could be different (i know, sounds weird, but it's how it is for this project).

     

    i am doing this in flash, using actionscript 3 (AS3)

    this is the (actionscript 3) code block (with the regex) i'm using right now:

    ***

    var toParse:String = "<a href=\"http://www.wwf.org\" target=\"_blank\">Learn more at www.wwf.org</a>";

    var pattern1:RegExp = /"http:\/\/[A-Za-z0-9\.\-]"/;
    var index1:Number = toParse.search(pattern2);

    var pattern2:RegExp = /"\.{com|net|org|edu|gov|mil|co\.}+"/;
    var index2:Number = toParse.search(pattern1);

    var domainName:String = toParse.substring(index1, index2);  // this should be 'wwf'

    ***

    so i have 2 issues

    1.  pattern1 returns -1, so it isn't matching http://www (i want it to search for potentially anything that might come before the domain name though, just in case)

    2.  the regex in var pattern2 works great if the domain name i'm trying to isolate doesn't have com, net, org, etc in its string. if it does however, the result (domainName) is screwed up. so, e.g. if the domain is www.ninemillion.org, i get 'nin' returned, instead of 'ninemillion', because 'mil' is matched with it.

     


    i tried looking through the regex library but i wasn't able to find anything that matched what i needed. i also tried going through other reference sites. i haven't worked much with regular expressions before so it feels a bit dizzying... any help is appreciated :)

     

    sarah 
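
One possible way to approach the two issues above, offered as a rough sketch rather than a tested ActionScript solution: capture the href value first, then the host, then the label that sits directly before a known suffix. Anchoring the suffix to the end of the host (with a leading dot and `$`) is what keeps 'mil' from matching inside 'ninemillion'. The example is written in TypeScript so it can be run as-is; the function name `extractDomainLabel` is hypothetical, the suffix list is taken from the post, and widening `co\.` to `co\.[a-z]{2}` is an assumption. The regex literals are plain ECMAScript syntax, so they should carry over to AS3's RegExp with little change, though that has not been checked here.

```typescript
// Hypothetical helper; the name and the suffix list are illustrative assumptions.
function extractDomainLabel(html: string): string | null {
  // 1. Capture only the href attribute value, so the link text is never consulted.
  const href = /href\s*=\s*["']([^"']+)["']/i.exec(html);
  if (href === null) return null;

  // 2. Skip an optional scheme ("http://", "https://", ...) and keep the host,
  //    stopping at the first "/", ":", "?" or "#".
  const host = /^(?:[a-z][a-z0-9+.\-]*:\/\/)?([^\/:?#]+)/i.exec(href[1]);
  if (host === null) return null;

  // 3. Take the label directly before a known suffix. The "\." before the
  //    group and the "$" after it force the suffix to be a whole trailing
  //    part of the host, not a substring of the name.
  const label = /([^.]+)\.(?:com|net|org|edu|gov|mil|co\.[a-z]{2})$/i.exec(host[1]);
  return label === null ? null : label[1];
}

const sample = '<a href="http://www.wwf.org" target="_blank">Learn more at www.wwf.org</a>';
console.log(extractDomainLabel(sample)); // wwf
console.log(extractDomainLabel('<a href="http://www.ninemillion.org">x</a>')); // ninemillion
```

Hosts whose suffix is not in the list return null in this sketch, so the list would need extending for other top-level domains.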

  •  07-03-2008, 8:07 PM 43742 in reply to 43709

    Re: parsing out the domain in an href tag

    I tried responding to this yesterday but the reply appears to require moderator approval and that appears to be still pending.

    Moderators????????

    Susan 
