
Apache ReWrite is killing me...

Last post 05-09-2008, 11:49 AM by Sergei Z. 5 replies.
  •  05-08-2008, 2:18 PM 42058

    Apache ReWrite is killing me...

    Hello all,

    First post, but I'll try not to annoy you too much. I'm currently trying to write out a nice regex for Apache's ReWrite engine so that I can use search engine friendly (SEF) URLs. The problem I'm coming across is that optional parameters (such as language) are pulling in an 'undefined' value.

    Here's an example:

     

    EXAMPLE: en/bestbuy/section-5/category-2/HP-Pavilion-a1022n

     

    All parameters EXCEPT the site-name are optional. This is because we'll assume English is the default language if none is provided, and of course if they're at the front page there will be no section, category or product identified. Note also that the language variable will always be 2 characters, no less, no more, and only a-z.

    For ReWrite purposes I need to capture the language (en), site name (bestbuy), section ID (5), category ID (2), and entire product ID (HP-Pavilion-a1022n) into the $1/$2/etc variables so I can place them in the appropriate place for the new URL.

     

    ATTEMPTS:

    Language/Site only:

    ^([en]{2}|[sp]{2})?/?(\w*)$

    Problem: Left $1 as undefined if en/sp was not present, otherwise it worked fine. This was alright at the time because if the variable is blank/undefined we can assume the default of English, but I'm not one to leave something working 95% correct.

    Entire URL string:

    ^([en]{2}|[sp]{2})?/?(\w*)/?section-(\d*)/?category-(\d*)/?(.*)$

    Problem: As you can see I was getting really frustrated and trying out really stupid crap. Works fine if all variables are in the URL string, but take out product and category and it died completely (wouldn't even pull in an 'undefined').
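Both symptoms can be reproduced outside Apache. The sketch below uses Python's re module as a stand-in for the rewrite engine (an assumption; mod_rewrite uses PCRE, which should treat these particular patterns the same way). It shows that an optional group which takes no part in the match comes back empty, and that the second pattern cannot match once the literal "section-" and "category-" text is missing.

```python
import re

# Matt's two attempts, copied verbatim from the post.
LANG_SITE = re.compile(r"^([en]{2}|[sp]{2})?/?(\w*)$")
FULL_URL = re.compile(
    r"^([en]{2}|[sp]{2})?/?(\w*)/?section-(\d*)/?category-(\d*)/?(.*)$"
)

# An optional group that did not participate in the match is reported as None.
lang, site = LANG_SITE.match("bestbuy").groups()
lang = lang or "en"  # fall back to the default language described in the post

# "section-" and "category-" are required literals in the second pattern,
# so a URL without them cannot match at all.
full = FULL_URL.match("en/bestbuy/section-5/category-2/HP-Pavilion-a1022n")
short = FULL_URL.match("en/bestbuy")

print(lang, site)
print(full.groups())
print(short)
```

Wrapping each optional segment, literal text included, in its own optional group is what lets the shorter URLs through.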

     

    Any help would be VERY greatly appreciated.

    - Matt 

  •  05-08-2008, 4:06 PM 42059 in reply to 42058

    Re: Apache ReWrite is killing me...

    ^((en|sp)/)?([^/]/)?(section-(\d+)/)?(category-(\d+)/)?([^/]+)$

     

  •  05-08-2008, 7:08 PM 42064 in reply to 42058

    Re: Apache ReWrite is killing me...

    Matt,

    I won't attempt to provide you with a solution because I don't know enough about URL re-writing, but I would recommend that you read Michael Ash's blog entry http://regexadvice.com/blogs/mash/archive/2008/01/31/A-touch-of-Character-Class.aspx, which explains a bit about character classes. You appear to be using them incorrectly, given your stated requirements: "[en]{2}" will match the text 'ee', 'en', 'ne' and 'nn', only one of which is probably valid in this situation.
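A quick way to see the difference, sketched in Python as a stand-in regex engine (an assumption, since the thread is about Apache): a character class picks one character from a set, so repeating it twice accepts every two-letter combination of those characters, while an alternation accepts only the listed words.

```python
import re
from itertools import product

char_class = re.compile(r"[en]{2}")   # any two characters drawn from {e, n}
alternation = re.compile(r"en|sp")    # exactly "en" or exactly "sp"

# Every two-letter string that can be built from 'e' and 'n'.
candidates = ["".join(pair) for pair in product("en", repeat=2)]

class_hits = [c for c in candidates if char_class.fullmatch(c)]
alt_hits = [c for c in candidates if alternation.fullmatch(c)]

print(class_hits)
print(alt_hits)
```

If any two lowercase letters should be accepted as a language code, as the original post describes, [a-z]{2} would express that directly.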

    Susan 

  •  05-09-2008, 9:57 AM 42077 in reply to 42059

    Re: Apache ReWrite is killing me...

    Dirtydaemon,

    Thanks for the assistance, but it didn't work. $1 is returning 'en/', $2 is returning 'en', $3 through $7 return undefined, and $8 returns 'bestbuy'.
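The reported captures can be reproduced with a short input such as en/bestbuy (the test string actually used is not given, so that input is an assumption). In the sketch below, Python's re module stands in for the rewrite engine. The third group, ([^/]/)?, only accepts a single non-slash character followed by a slash, so it never picks up a whole path segment. The FIXED pattern is a hypothetical one-character change, not something posted in the thread.

```python
import re

# Pattern exactly as posted in the first reply.
POSTED = re.compile(
    r"^((en|sp)/)?([^/]/)?(section-(\d+)/)?(category-(\d+)/)?([^/]+)$"
)
# Hypothetical tweak: [^/]+ lets the third group span a whole segment.
FIXED = re.compile(
    r"^((en|sp)/)?([^/]+/)?(section-(\d+)/)?(category-(\d+)/)?([^/]+)$"
)

FULL = "en/bestbuy/section-5/category-2/HP-Pavilion-a1022n"

short = POSTED.match("en/bestbuy")   # matches, site name lands in group 8
full_posted = POSTED.match(FULL)     # no match at all
full_fixed = FIXED.match(FULL)       # matches with the tweak

# [^/]/ on its own: one character plus a slash is fine, a longer word is not.
single = re.fullmatch(r"[^/]/", "b/")
whole = re.fullmatch(r"[^/]/", "bestbuy/")

print(short.groups())
print(full_posted)
print(full_fixed.groups())
```

Even with the tweak, the site name in group 3 keeps its trailing slash, and a URL with only language and site still puts the site name in the last group.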

     

    Susan,

    Thanks for the tip. I've learned regex solely through trial and error and the O'Reilly pocket reference. Read over it and will give this another go.

     

    Still looking for anyone to assist, though. Thanks!

    -Matt 

  •  05-09-2008, 10:54 AM 42083 in reply to 42064

    Re: Apache ReWrite is killing me...

    I've gotten a bit closer with this one. The language is always $2, site name is always $3, section ID is $7, category ID is $9 and product SKU is $10. The only problem is that I can't get rid of the trailing slash from $10 without messing up the rest of the string (it stops recognizing the category).

     

    Pattern:

    ((en|es)/)?(\w*)/?(((section-(\d*))?/?(category-(\d+))?/?)(.*)?)/? 

     

    Test string:

    en/bestbuy/section-5/category-2/HP-Pavilion-a1022n/

     

    Results: 

    1: (en/)
    2: (en)
    3: (bestbuy)
    4: (section-5/category-2/HP-Pavilion-a1022n/)
    5: (section-5/category-2/)
    6: (section-5)
    7: (5)
    8: (category-2)
    9: (2)
    10: (HP-Pavilion-a1022n/)
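The trailing slash ends up in group 10 because (.*)? is greedy and runs to the end of the string, leaving nothing for the final /? to match. One possible change, sketched here with Python's re module as a stand-in engine (an assumption), is to swap (.*)? for ([^/]*), which cannot cross a slash. TWEAKED is a hypothetical variant, not a pattern from the thread.

```python
import re

TEST = "en/bestbuy/section-5/category-2/HP-Pavilion-a1022n/"

# Pattern exactly as posted: the greedy (.*)? swallows the trailing slash.
POSTED = re.compile(
    r"((en|es)/)?(\w*)/?(((section-(\d*))?/?(category-(\d+))?/?)(.*)?)/?"
)

# Hypothetical tweak: [^/]* stops at a slash, so the final /? consumes it.
TWEAKED = re.compile(
    r"((en|es)/)?(\w*)/?(((section-(\d*))?/?(category-(\d+))?/?)([^/]*))/?"
)

posted = POSTED.match(TEST)
tweaked = TWEAKED.match(TEST)

print(posted.group(10))           # product ID with the trailing slash
print(tweaked.group(10))          # product ID without it
print(tweaked.group(2, 3, 7, 9))  # language, site, section, category
```

The trade-off is that a product ID containing a slash would be cut at the first slash.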
  •  05-09-2008, 11:49 AM 42084 in reply to 42083

    Re: Apache ReWrite is killing me...

    Using non-capturing groups in your regex might help:

    (?:(en|es)/)?(\w*)/?(?:(?:(?:section-(\d*))?/?(?:category-(\d+))?/?)(.*)?)/?

    Match: en/bestbuy/section-5/HP-Pavilion-a1022n
    $1: en
    $2: bestbuy
    $3: 5
    $4: (empty)
    $5: HP-Pavilion-a1022n

     

    Test it with other inputs.
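A small harness for trying other inputs might look like the sketch below. It runs the non-capturing pattern above, copied verbatim, through Python's re module as a stand-in for Apache's engine. The field names and the extra URLs are made up for illustration. With only five capturing groups, every value fits within the $1 to $9 back-references that mod_rewrite makes available.

```python
import re

# Non-capturing pattern, copied verbatim from the post above.
PATTERN = re.compile(
    r"(?:(en|es)/)?(\w*)/?(?:(?:(?:section-(\d*))?/?(?:category-(\d+))?/?)(.*)?)/?"
)

# Hypothetical labels for $1..$5.
FIELDS = ("lang", "site", "section", "category", "product")


def capture(url):
    """Return the five captures as a dict keyed by FIELDS."""
    return dict(zip(FIELDS, PATTERN.match(url).groups()))


# The first URL is the one from the post; the rest are hypothetical.
INPUTS = [
    "en/bestbuy/section-5/HP-Pavilion-a1022n",
    "en/bestbuy/section-5/category-2/HP-Pavilion-a1022n",
    "bestbuy/section-5/category-2/HP-Pavilion-a1022n",
    "en/bestbuy",
]

for url in INPUTS:
    print(url, capture(url))
```

Because every part of the pattern is optional and it has no ^ or $ anchors, it matches any input at all, so the captures need checking rather than just the fact of a match.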
