
weird linebreak problem in php preg_replace

  •  01-12-2010, 4:26 AM

    What I'm trying to do is remove all line breaks in and around certain HTML (or XML, if you will) elements (h[0-9], ul, li) in a string.

    The following works fine for removing leading and trailing line breaks, but not the ones inside the element:

    $text = preg_replace('/(?:\s*\n\s*)*(<(?:h[0-9]|ul|li)>)([-\w\d\s]*)(<\/(?:h[0-9]|ul|li)>)(?:\s*\n\s*)*/',"$1$2$3",$text);

    What I tried next was replacing ([-\w\d\s]*) with \n*([-\w\d\t]*)\n*, so that $2 would not contain the line breaks. (I'm only concerned with line breaks directly between a tag and its value, not deeper inside the value.)

    What happened is something I can't wrap my head around: for elements with no whitespace inside them, everything is as before (good), but elements with any whitespace in them had a line break added before and after the element.
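One possible reading of this symptom, offered as an assumption rather than a confirmed diagnosis: with \s swapped for \t, the class [-\w\d\t] no longer matches a plain space, so an element whose value contains a space is not matched at all and keeps its original line breaks. Compared with the first version, that can look as if line breaks were added. The sketch below reproduces this with a hypothetical sample string that is not from the post.

```php
<?php
// Hypothetical sample input (not from the original post):
// one heading without a space in its value, one with a space.
$text = "a\n<h1>\nTitle\n</h1>\nb\n<h2>\nTwo words\n</h2>\nc";

$tags = '(?:h[0-9]|ul|li)';

// The modified pattern described in the post: \n* on both sides of the value,
// and \t instead of \s inside the value's character class.
$pattern = '/(?:\s*\n\s*)*(<' . $tags . '>)\n*([-\w\d\t]*)\n*(<\/' . $tags . '>)(?:\s*\n\s*)*/';

$result = preg_replace($pattern, '$1$2$3', $text);

// The h1 element is collapsed onto one line; the h2 element is left untouched,
// because the space in "Two words" is not in [-\w\d\t].
echo $result, "\n";
```

With this input, the h1 comes out as <h1>Title</h1> with no surrounding line breaks, while the h2 block keeps every line break it started with.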

    When I test the regex on regextester.com, it seems to work fine, but the preg_replace call doesn't behave like the "simulation".
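For the stated goal (removing line breaks around these tags and directly between a tag and its value), a minimal sketch is to match one opening or closing tag at a time instead of the whole element, so the value itself never has to be matched. This is a swapped-in approach, not the pattern from the post. It assumes tags without attributes, as in the original pattern, and uses a hypothetical sample string.

```php
<?php
// Hypothetical sample input (not from the original post).
$text = "intro\n\n<h1>\nMy title\n</h1>\n\n<ul>\n<li>\nfirst item\n</li>\n</ul>\noutro";

// Match a single opening or closing tag together with any whitespace run
// containing a line break on either side, and keep only the tag itself.
// Whitespace that contains no line break (e.g. the space in "My title") is left alone.
$pattern = '/(?:\s*\n\s*)*(<\/?(?:h[0-9]|ul|li)>)(?:\s*\n\s*)*/';

$result = preg_replace($pattern, '$1', $text);

echo $result, "\n"; // intro<h1>My title</h1><ul><li>first item</li></ul>outro
```

Because the value is never part of the match, line breaks deeper inside it are not affected; only the ones touching a tag are removed.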
