Michael Ash's Regex Blog

Regex Musings

Additional CSS minifying regex patterns

NOTE: All the regexes on this page that I wrote assume IgnoreCase = true.

I was looking at the regexes used in the YUI Compressor to minify CSS and came up with a couple more that I think could help the process. The code and port I was looking at already trimmed unneeded zeros in the top, right, bottom, left values with simple string replaces, but it took three separate replaces to do it. It was pretty simple to come up with a single regex to handle all the cases.

(Pseudo code)


string.Replace(":0 0 0 0;",":0;")
string.Replace(":0 0 0;",":0;")
string.Replace(":0 0;",":0;")

becomes

Regex.Replace(input,":(\s*0)(\s+0){0,3}\s*;",":0;")
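
As a rough sketch, here is how the combined pattern might be exercised in Python's re module instead of .NET. The function name collapse_zeros and the sample CSS are made up for illustration; only the pattern and the ":0;" replacement come from the post.

```python
import re

# Pattern from the post: a colon, one zero, up to three more zeros, then ';'.
ZEROS = re.compile(r":(\s*0)(\s+0){0,3}\s*;", re.IGNORECASE)


def collapse_zeros(css: str) -> str:
    """Collapse ':0 0 0 0;', ':0 0 0;' and ':0 0;' to ':0;' in one pass."""
    return ZEROS.sub(":0;", css)


sample = "a{margin:0 0 0 0;padding:0 0;border-width:0 0 0;}"
print(collapse_zeros(sample))  # a{margin:0;padding:0;border-width:0;}
```

A value such as "margin:0 auto;" is left alone, because the pattern only accepts zeros between the colon and the semicolon.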

Pretty simple, but of course I thought, why stop there? So I came up with a regex for all numbers:

:\s*(0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex))))(\s+\1){1,3};

The replacement string is simply ":$1;"
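
A minimal Python sketch of the same idea, assuming the re module rather than .NET (so the replacement "$1" is spelled \1). The name collapse_repeats is hypothetical.

```python
import re

# The post's pattern: one number with a unit (or a bare 0) in group 1,
# followed by one to three repeats of that same value.
SAME = re.compile(
    r":\s*(0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex))))(\s+\1){1,3};",
    re.IGNORECASE,
)


def collapse_repeats(css: str) -> str:
    """Rewrite ':5px 5px 5px 5px;' as ':5px;' when every value is identical."""
    return SAME.sub(r":\1;", css)


print(collapse_repeats("a{margin:5px 5px 5px 5px;padding:1.5em 1.5em;}"))
# a{margin:5px;padding:1.5em;}
```

Values that differ, such as "margin:5px 6px;", do not match the back-reference and stay as they are.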

Once this was done, next on the list was handling the cases where the numbers are not all the same.


For those who don't know CSS, this part of the syntax basically says that, of the 4 possible values (top, right, bottom, left):

1) if only one value is specified, the other 3 are implied to be the same value (X = X X X X)
2) if two values are specified, the first implies the third and the second implies the fourth (X Y = X Y X Y)
3) if three values are specified, the second implies the fourth (X Y Z = X Y Z Y)
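
The three rules above can be sketched as a small Python helper that expands a shorthand back to all four values. The function name expand_box_values is made up for illustration and is not part of any minifier.

```python
def expand_box_values(values):
    """Expand 1 to 4 CSS box values into [top, right, bottom, left]."""
    if not 1 <= len(values) <= 4:
        raise ValueError("expected between one and four values")
    top = values[0]
    right = values[1] if len(values) > 1 else top    # rule 1
    bottom = values[2] if len(values) > 2 else top   # rules 1 and 2
    left = values[3] if len(values) > 3 else right   # rules 1, 2 and 3
    return [top, right, bottom, left]


print(expand_box_values(["X"]))            # ['X', 'X', 'X', 'X']
print(expand_box_values(["X", "Y"]))       # ['X', 'Y', 'X', 'Y']
print(expand_box_values(["X", "Y", "Z"]))  # ['X', 'Y', 'Z', 'Y']
```

Minifying runs these rules in reverse: a longer form can be shortened whenever expanding the shorter form gives back the same four values.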
 
Of course, when minifying you want the shorter syntax, so the following regexes convert the longer forms to the shorter ones.
The replacement string for all of them is the same as above: ":$1;"


# 4 parameters to 2 (x y x y to x y)

:\s*((0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex))))\s+(0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex)))))\s+\2\s+\3;


# 4 to 3 (x y z y to x y z) or 3 to 2 (x y x to x y)

:\s*((?:(?:0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex))))\s+)?(0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex))))\s+(?:0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex)))))\s+\2;

Though they may look unwieldy, the longer ones just repeat one of the sub-patterns. The only real difference is what is and isn't captured.

Along the same lines, I came up with similar patterns for border-style, outline-style, border-color and outline-color.
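
To make the repetition visible, here is a hedged Python sketch that builds both long patterns from one shared sub-pattern. The names NUM, FOUR_TO_TWO, DROP_LAST and shorten are assumptions for illustration. The number sub-pattern uses the \d*\.?\d+ form from the single-value regex, and the replacement includes the leading colon because the match consumes it.

```python
import re

# One number with a unit, or a bare 0. This is the repeated sub-pattern.
NUM = r"0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex)))"

# x y x y -> x y: group 1 is "x y", groups 2 and 3 are x and y.
FOUR_TO_TWO = re.compile(
    rf":\s*(({NUM})\s+({NUM}))\s+\2\s+\3;", re.IGNORECASE
)

# x y z y -> x y z, or x y x -> x y: group 2 is the value that repeats.
DROP_LAST = re.compile(
    rf":\s*((?:(?:{NUM})\s+)?({NUM})\s+(?:{NUM}))\s+\2;", re.IGNORECASE
)


def shorten(css: str) -> str:
    """Apply the 4-to-2 rule first, then the 4-to-3 / 3-to-2 rule."""
    css = FOUR_TO_TWO.sub(r":\1;", css)
    return DROP_LAST.sub(r":\1;", css)


print(shorten("a{margin:1px 2px 1px 2px;}"))   # a{margin:1px 2px;}
print(shorten("a{padding:1px 2px 3px 2px;}"))  # a{padding:1px 2px 3px;}
print(shorten("a{margin:1px 2px 1px;}"))       # a{margin:1px 2px;}
```

Running the 4-to-2 pattern first is a choice made for this sketch, so that "x y x y" is shortened in a single step.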

border-style/outline-style

The replacement string for all is “$1-style:$2;”


(outline|border)-style\s*:\s*(none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)(?:\s+\2){1,3};


(outline|border)-style\s*:\s*((none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)\s+(none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset))(?:\s+\3)(?:\s+\4);


(outline|border)-style\s*:\s*((?:(?:none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)\s+)?(none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)\s+(?:none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset))(?:\s+\3);
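
A sketch of the three style patterns in Python, built from one keyword list. The names STYLE, PREFIX and shorten_style are made up for illustration, and the sketch assumes the keyword alternation has no space after "outset".

```python
import re

STYLE = r"none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset"
PREFIX = r"(outline|border)-style\s*:\s*"  # group 1: the property prefix

# Doubled braces produce a literal {1,3} quantifier inside the f-string.
ALL_SAME = re.compile(rf"{PREFIX}({STYLE})(?:\s+\2){{1,3}};", re.IGNORECASE)
FOUR_TO_TWO = re.compile(
    rf"{PREFIX}(({STYLE})\s+({STYLE}))(?:\s+\3)(?:\s+\4);", re.IGNORECASE
)
DROP_LAST = re.compile(
    rf"{PREFIX}((?:(?:{STYLE})\s+)?({STYLE})\s+(?:{STYLE}))(?:\s+\3);",
    re.IGNORECASE,
)


def shorten_style(css: str) -> str:
    """Apply the three patterns in order with the '$1-style:$2;' replacement."""
    for pattern in (ALL_SAME, FOUR_TO_TWO, DROP_LAST):
        css = pattern.sub(r"\1-style:\2;", css)
    return css


print(shorten_style("a{border-style:solid solid solid;}"))
# a{border-style:solid;}
print(shorten_style("a{outline-style: dotted dashed dotted dashed;}"))
# a{outline-style:dotted dashed;}
```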


border-color/outline-color

The replacement string for all is “$1-color:$2;”


(outline|border)-color\s*:\s*((?:\#(?:[0-9A-F]{3}){1,2})|\S+)(?:\s+\2){1,3};


(outline|border)-color\s*:\s*(((?:\#(?:[0-9A-F]{3}){1,2})|\S+)\s+((?:\#(?:[0-9A-F]{3}){1,2})|\S+))(?:\s+\3)(?:\s+\4);


(outline|border)-color\s*:\s*((?:(?:(?:\#(?:[0-9A-F]{3}){1,2})|\S+)\s+)?((?:\#(?:[0-9A-F]{3}){1,2})|\S+)\s+(?:(?:\#(?:[0-9A-F]{3}){1,2})|\S+))(?:\s+\3);
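
The color patterns have the same shape, with a hex value or any run of non-whitespace as the repeated sub-pattern. Below is a hedged Python sketch; COLOR, PREFIX and shorten_color are hypothetical names. The \S+ fallback is greedy and relies on backtracking to give back the trailing semicolon.

```python
import re

# A 3 or 6 digit hex color, otherwise any run of non-whitespace (e.g. "red").
COLOR = r"(?:\#(?:[0-9A-F]{3}){1,2})|\S+"
PREFIX = r"(outline|border)-color\s*:\s*"  # group 1: the property prefix

PATTERNS = [
    # every value identical
    re.compile(PREFIX + r"(" + COLOR + r")(?:\s+\2){1,3};", re.IGNORECASE),
    # x y x y -> x y
    re.compile(
        PREFIX + r"((" + COLOR + r")\s+(" + COLOR + r"))(?:\s+\3)(?:\s+\4);",
        re.IGNORECASE,
    ),
    # x y z y -> x y z, or x y x -> x y
    re.compile(
        PREFIX + r"((?:(?:" + COLOR + r")\s+)?(" + COLOR + r")\s+(?:"
        + COLOR + r"))(?:\s+\3);",
        re.IGNORECASE,
    ),
]


def shorten_color(css: str) -> str:
    """Apply the three patterns in order with the '$1-color:$2;' replacement."""
    for pattern in PATTERNS:
        css = pattern.sub(r"\1-color:\2;", css)
    return css


print(shorten_color("a{border-color:#fff #000 #fff #000;}"))
# a{border-color:#fff #000;}
```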


I also came up with a couple more regexes to replace some code, but as a couple of these use look-behinds they are not as portable.

These work in .NET, which supports variable-length lookbehinds.

This pattern

\s+((?:[!{};>+()\],])|(?<={[^{}]*):(?=[^}]*}))

is used to find characters preceded by unneeded whitespace, taking care not to match colons that are used in pseudo-selectors or pseudo-classes. The replacement string would be $1.
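
Python's built-in re module rejects variable-length lookbehinds, so the colon branch of this pattern cannot be ported as written. As a partial sketch only, the first alternative (the character class) can be tried on its own. The name strip_space_before is made up, and this deliberately leaves whitespace before colons untouched.

```python
import re

# Only the first alternative of the post's pattern: whitespace followed by
# one of the listed punctuation characters. The colon branch is omitted
# because it needs a variable-length lookbehind.
SPACE_BEFORE = re.compile(r"\s+([!{};>+()\],])")


def strip_space_before(css: str) -> str:
    """Drop whitespace that sits directly before the listed characters."""
    return SPACE_BEFORE.sub(r"\1", css)


print(strip_space_before("a { color:red ; }"))  # a{ color:red;}
```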

This pattern

(?<![\x22\x27=]\s*)\#([0-9A-F])\1([0-9A-F])\2([0-9A-F])\3

is used for reducing hexadecimal values from #AABBCC to #ABC. Because the match includes the leading #, the replacement string would be #$1$2$3.

The last pattern I created finds RGB values of the format rgb(x, y, z), where x, y and z are integers in the range 0-255.
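
Since Python's re module does not accept the variable-length lookbehind, one possible workaround is to capture the optional quote or equals-sign prefix and skip the match in a callback when it is present. This is a sketch under that assumption, not the post's .NET code, and shorten_hex is a hypothetical name. The group numbers shift by one because of the extra prefix group.

```python
import re

# Group 1 stands in for the negative lookbehind: a quote or '=' plus optional
# whitespace. Groups 2 to 4 are the three doubled hex digits.
HEX6 = re.compile(
    r"([\x22\x27=]\s*)?\#([0-9A-F])\2([0-9A-F])\3([0-9A-F])\4",
    re.IGNORECASE,
)


def shorten_hex(css: str) -> str:
    """Turn #AABBCC into #ABC unless it follows a quote or an equals sign."""
    def repl(match):
        if match.group(1) is not None:
            return match.group(0)  # quoted or after '=': leave untouched
        return "#" + match.group(2) + match.group(3) + match.group(4)

    return HEX6.sub(repl, css)


print(shorten_hex("a{color:#AABBCC;}"))  # a{color:#ABC;}
print(shorten_hex("x='#FFFFFF'"))        # x='#FFFFFF'
```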

rgb\s*\x28((?:25[0-5])|(?:2[0-4]\d)|(?:[01]?\d?\d))\s*,\s*((?:25[0-5])|(?:2[0-4]\d)|(?:[01]?\d?\d))\s*,\s*((?:25[0-5])|(?:2[0-4]\d)|(?:[01]?\d?\d))\s*\x29

The pattern I used is just stricter than the existing one and puts each value in its own group so I can work with them in further processing. The match itself is passed to a MatchEvaluator to do the decimal-to-hex conversion.

I did a little alpha testing, but most of the CSS I've written doesn't have the stuff I'm testing for, so beta testing is definitely in order. Also, I don't really know CSS hacks, so I don't know if any of this will have an adverse effect on them. Other than the top/right/bottom/left parameters, I was mostly trying to duplicate the existing effects. I'm going to pass this info on to the YUI Compressor maintainer to see if they can make use of any of it.

Published Thursday, March 27, 2008 3:55 PM by mash

Comments

 

Geert said:

It is a fun regex project to work on CSS compression. However, there is one (big) problem: generated content should never be touched (http://reference.sitepoint.com/css/generatedcontent). It could contain text that looks like CSS but is just meant to be text. Although the probability of situations like that will be rather low, I still wanted to add it as a sidenote.
March 29, 2008 3:18 PM
 

mash said:

Geert, you know I actually thought about that too because I didn't think the code I was extending dealt with that.  I was going to ask the author that same question. And since I was simply extending existing code I don't think I'm causing a new problem.  I think the existing code may cause more problems in that regard than my extensions.

April 1, 2008 3:03 AM