Michael Ash's Regex Blog

Regex Musings

  • Looking again at the Lookahead bug

    A few years ago I wrote about a bug in Internet Explorer's regex engine that affected patterns with lookaheads. Well, the bug came back in the form of a question on RegexAdvice.com. It too involved a password regex, though not as complex as the earlier pattern that introduced me to this bug.

    The first pattern had three conditions that were being tested for with lookaheads.

    ^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,15}$

    With the current pattern only one lookahead was being used.

    ^(?=.*?\d)[a-z][a-z0-9]{5,7}$

    Both patterns had a min and max length. In the original attempts at both, the length was checked after the lookahead test(s). While this is perfectly fine in a non-VBScript/JScript world, this is where the bug kicks in for those regex engines. Actually it's probably the same regex engine for both languages, which is probably why it only affects IE: it's the only browser that uses VBScript and JScript natively. I don't recall whether I tested this bug server-side in my original testing; given my previous blog comments, most likely I only tested client-side. However, in the recent question it was failing server-side, so the bug is not actually in the browser but more likely in the DLL for those languages. Anyway, the previous blog article covered the behavior that was happening.

    Steven Levithan looked much closer at the problem in general and discussed it on his blog. He came to the conclusion that quantifiers with a minimum boundary of zero within the lookahead were the culprit. He provides a couple of simple examples. I think he's partially right, but I don't think the one-or-more quantifiers are excluded from the problem. I think his examples were a little too simple to be affected.

    OK, let's look at the regex pattern in the recent question.

    ^(?=.*?\d)[a-z][a-z0-9]{5,7}$

    The requirements were a 6 to 8 character alphanumeric (English) string that starts with an alpha character and contains at least one digit. Let's use “abc123” as the test string.

    Now the above pattern was supplied by one of the resident pros who frequent the message boards. Let's get past the fact that the pattern itself is correct and satisfies the requirements. I'm going to rewrite the pattern to

    ^(?=\D*\d)[a-z][a-z0-9]{5,7}$

    Now this is functionally the same; it's just that within the lookahead the pattern is greedy instead of lazy, which should make the point I'm trying to make easier to see (I hope) without having to deal with backtracking. This pattern suffers from the bug in VBScript/JScript. Steven suggested that the one-or-more quantifier (+) doesn't suffer from the problem, so change the * to a +. That doesn't affect the match, because the first test after the lookahead is for an alpha, so the lookahead placed as it is will always match at least one character with the \D* pattern. So now we have

    ^(?=\D+\d)[a-z][a-z0-9]{5,7}$

    which is also bitten by the bug.
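
    To make the "functionally the same" claim concrete, here is a minimal Python sketch that runs the three variants in an engine that does not have the bug (Python's re module). The sample strings other than “abc123” are made up for illustration; they are not from the original question.

```python
import re

# The three variants discussed above. In an engine without the lookahead bug
# they should accept and reject exactly the same strings.
PATTERNS = {
    "lazy": r"^(?=.*?\d)[a-z][a-z0-9]{5,7}$",
    "star": r"^(?=\D*\d)[a-z][a-z0-9]{5,7}$",
    "plus": r"^(?=\D+\d)[a-z][a-z0-9]{5,7}$",
}

# Illustrative inputs: valid, no digit, leading digit, too short, too long, valid.
SAMPLES = ["abc123", "abcdef", "1abc23", "ab1", "abcdefg12", "a1b2c3d4"]

def accepted(pattern):
    """Return the samples that the pattern matches."""
    return [s for s in SAMPLES if re.search(pattern, s)]

results = {name: accepted(p) for name, p in PATTERNS.items()}
for name, matches in results.items():
    print(name, matches)
```

    All three variants accept only “abc123” and “a1b2c3d4” here, which is the behavior the requirements call for and the baseline the buggy engines deviate from.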

    I first tried the same approach as in my original attempt at dealing with the bug, but no luck. I tried using the + quantifier; it still didn't work. Now, the person posting the question stated that without the lookahead the remaining pattern worked fine, apart from the at-least-one-digit test. So I began testing parts of the pattern, such as

    ^[a-z][a-z0-9]{5,7}$

    ^(?=\D*\d)[a-z]

    ^(?=\D+\d)[a-z]

    These are just a few of the attempts, and all of them work as expected. It was only when I put the whole pattern back together that it began failing. But after some more trial and error I think I've come across a pattern in the bug's behavior. Going back to my original examination of the problem, I discovered that the pattern ^(?=\D*\d)[a-z][a-z0-9]{2,7}$ matches the test string “abc123”. This doesn't quite fit my original assessment of what was happening, because in that pattern the part after the lookahead just tests for a certain number of any characters, whereas this pattern looks for a specific range of characters.

    If you up the min boundary on the last quantifier by one, to ^(?=\D*\d)[a-z][a-z0-9]{3,7}$, the pattern fails again. Now if you change the zero-to-infinity quantifier in the lookahead of the modified pattern to a one-to-infinity quantifier, so you get ^(?=\D+\d)[a-z][a-z0-9]{3,7}$, the pattern matches again. Bump up the min boundary of this new pattern to ^(?=\D+\d)[a-z][a-z0-9]{4,7}$ and, bugaboo! The pattern fails again.

    Now I don't have access to the source code, so this is just supposition, but here's what I believe is happening. I think the data examined by the lookahead is being stored in a stack structure. Just looking at the patterns that work and fail, it looks like once the lookahead is satisfied and a quantifier is encountered in the consuming portion of the pattern, the lookahead's match is reconsidered, with as many characters as the minimum boundary of the lookahead's quantifier popped off the stack of the lookahead's match. Let's look at the first pattern that worked.

    ^(?=\D*\d)[a-z][a-z0-9]{2,7}$

    OK, we'll start right after the lookahead is matched. The lookahead is supposed to be non-consuming, so the pointer should still be at the beginning of the string.

    ^(?=\D*\d) matches “abc1”. The rest of the pattern matches normally until we get to the quantifier in the consuming portion of the pattern, where we are looking for at least two alphanumeric characters. At that point the lookahead match is reconsidered. The lower bound being zero, nothing is removed from the stack, but the current character pointer now points to the character after the lookahead's match value of “abc1”. The quantified part of the pattern, [a-z0-9]{2,7}$, can be satisfied by “23”, so we get a match.

    Now if we do the same thing with ^(?=\D*\d)[a-z][a-z0-9]{3,7}$ and apply the same logic, the regex fails, because the quantifier in the consuming portion is looking for at least 3 characters and there aren't that many left if it tries to satisfy that part of the pattern after “abc1” in the test string.

    Now let's look at ^(?=\D+\d)[a-z][a-z0-9]{3,7}$ with the same logic. The only difference from the previous pattern is the lower boundary of the lookahead quantifier. It's now 1. So if we pop 1 character off the stack of the lookahead's match of “abc1” we get “abc”, leaving us with “123” to be matched by the consuming quantifier, which is just enough.

    Now take ^(?=\D+\d)[a-z][a-z0-9]{4,7}$ and apply the same logic. The pattern fails, because the consuming quantifier is looking for at least 4 characters after the stack popping of the lookahead's match.

    If you changed the lookahead quantifier to {2,}, the pattern would match. You can continue upping the quantifier in the consuming portion by one to make the pattern fail, then upping the one in the non-consuming part to get the match back, so the behavior seems pretty consistent with my theory: the consuming quantifier starts from the end of the lookahead match, moved back by the number of characters in the lookahead quantifier's minimum boundary. It also seems to explain the effect of my original encounter with the bug. It may also explain why the workaround of testing the length first with another lookahead avoids the bug, because the consuming quantifier in that case is usually just *. + seems to work too, which doesn't quite fit, but consuming quantifiers with minimum boundaries of zero or one don't seem to be affected in any case. In all of the above test cases those values were below the minimum threshold of every successful test.

    However, the test cases above have only one lookahead, and it doesn't backtrack. Who knows what influence backtracking or additional lookaheads would have, or how additional quantifiers in the consuming parts of the pattern would be affected. While the test values support my theory, I can't say for sure that things are happening exactly the way I've laid out. The actual mechanics may be different, but whatever is happening under the hood, clearly pointers are being corrupted and the regex engine is losing its place.
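
    The theory can be written down as a small model. The Python sketch below only illustrates the supposition; it says nothing about how the JScript/VBScript engine is really implemented, and the name predicted_by_theory and its parameters are invented here. It works out where the theory says the consuming quantifier would start in “abc123” and checks whether enough characters are left.

```python
import re

def predicted_by_theory(lookahead_min, consuming_min, text="abc123", consuming_max=7):
    """Predict match/fail under the hypothesis described in the post."""
    # What the lookahead sees from the start of the string, e.g. "abc1".
    lookahead = re.match(r"\D{%d,}\d" % lookahead_min, text)
    if lookahead is None:
        return False
    # Hypothesis: the pointer is left at the end of the lookahead's match,
    # moved back by the lookahead quantifier's minimum.
    start = lookahead.end() - lookahead_min
    tail = text[start:]
    # The consuming quantifier then has to be satisfied by what is left.
    quantified = r"[a-z0-9]{%d,%d}" % (consuming_min, consuming_max)
    return re.fullmatch(quantified, tail) is not None

# (lookahead minimum, consuming minimum, result reported in the post)
OBSERVED = [
    (0, 5, False),  # ^(?=\D*\d)[a-z][a-z0-9]{5,7}$
    (1, 5, False),  # ^(?=\D+\d)[a-z][a-z0-9]{5,7}$
    (0, 2, True),
    (0, 3, False),
    (1, 3, True),
    (1, 4, False),
    (2, 4, True),   # lookahead quantifier changed to {2,}
]

predictions = [predicted_by_theory(m, n) for m, n, _ in OBSERVED]
for (m, n, seen), guess in zip(OBSERVED, predictions):
    print(m, n, "observed:", seen, "model:", guess)
```

    For these seven cases the model agrees with what was observed, which is all it can show. It covers a single non-backtracking lookahead only, the same limitation as the tests themselves.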

    The thing that was so confusing about this was that it only kicks in when a quantifier in the consuming portion is encountered, but if there was something between the lookahead and the quantified portion it matched normally. So this is something that's really hard to make test cases for, because you get these ghost values popping up later on in the test than I expected. Not to mention the pattern itself is correct, so even with a tool that lets you step through the matching you don't get this behavior unless that tool is using VBScript's regex engine, and I haven't seen such a tool for that engine.

    If you are using lookaheads with JavaScript client-side then you are going to be susceptible to the bug in IE, because it will use the JScript engine. And while you should always validate server-side, if VBScript or JScript is your server-side language you are still at risk. So a platform like classic ASP, which uses both of those languages by default, is at risk both client-side and server-side. A platform like PHP will still suffer the bug client-side in IE, but should work correctly server-side, where it uses a different regex engine. The same risk applies to non-web clients using the JScript/VBScript DLL.

    The workaround for the strong password type of regex, when using lookaheads, is to include the upper boundary test(s) before the tests with no upper boundary, then use .* to consume characters. The bounded test should keep the pattern from running forever. However, depending on the complexity of your criteria this may not always be an option, but try it first anyway.
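
    As a rough illustration of that workaround, the Python sketch below compares the first password pattern with one possible rewrite that moves the length test into a leading lookahead and lets .* do the consuming. The rewritten pattern is my reading of the description above, and the sample passwords are made up. Python's re module does not have the bug, so this only shows that the two forms are equivalent in a correct engine; whether the rewrite sidesteps the bug still has to be checked in JScript/VBScript itself.

```python
import re

# Length is checked by the consuming part, after the lookaheads.
ORIGINAL = r"^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,15}$"
# Assumed rewrite: bounded length test first, as a lookahead; .* consumes.
REWRITTEN = r"^(?=.{8,15}$)(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).*$"

# Illustrative inputs: valid, no uppercase, too short, no digit,
# valid at the 15 character limit, too long.
SAMPLES = ["Passw0rd", "passw0rd", "Pw0rd", "Password",
           "LongerPassw0rd1", "WayTooLongPassw0rd"]

def accepted(pattern):
    """Return the samples that the pattern matches."""
    return [s for s in SAMPLES if re.search(pattern, s)]

original_ok = accepted(ORIGINAL)
rewritten_ok = accepted(REWRITTEN)
print(original_ok)
print(rewritten_ok)
```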

  • Validating Email Revisited

    First off, let me say I'm a bit over my head here. Not with the regex part, but with the host language of the regex engine.

    Many moons ago I posted a blog article stating why you could not write a regex that validated an e-mail address 100%. Well, this is still true. In that post I also stated that the pattern was so massive that it wasn't worth using. This is also still true; however, I was made aware of a flavor-specific syntax that reduces the regex from massive to very large.

    This regex is for the PCRE engine. http://www.myregextester.com/?r=337

    Though from what I've read this will work for PHP too. Now, I don't know Perl or PHP, or what minimum version of PCRE supports this syntax. That being the case, I also don't know how well it performs. I wrote the original version using the .Net syntax, and not only was the regex massive, which is one reason I never posted it, but the performance was terrible. Given that most people want to use this type of regex to validate a data entry field, the pattern was overkill. In fact, I recommend that you don't use this, except to learn from. The PCRE version may perform better, but I don't have the means or time to test, so use it at your own risk. For simple field validation even this is still overkill. For a large text file performance may suffer horribly. Most likely you aren't going to want to use this pattern, as it is too large for simple text and performs poorly on large text.

    When I see people asking for an email regex, I point out that perfect validation is not possible. And when I see so-called email validating regexes that are only about 50 characters long, it makes me chuckle. This pattern is probably the most compact version of an RFC 2822 address regex you'll find, and it is still huge. Ports to other regex engines not supporting the recursive syntax will easily be 4x as large as my .Net version was.
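
    To show the kind of gap I mean, here is a small Python sketch using a hypothetical short pattern of roughly that size. The pattern is made up for illustration and is not taken from any particular site. It turns away two addresses whose form the RFC 2822 grammar allows (a quoted-string local part and a domain literal) and lets through one that the grammar rules out (two dots in a row in the local part).

```python
import re

# Hypothetical short "email validator" of about 50 characters.
SHORT_EMAIL = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def accepts(address):
    """True when the short pattern matches the whole address."""
    return SHORT_EMAIL.match(address) is not None

checks = {
    "user@example.com": accepts("user@example.com"),
    '"john doe"@example.com': accepts('"john doe"@example.com'),  # quoted-string local part
    "user@[192.168.0.1]": accepts("user@[192.168.0.1]"),          # domain literal
    "a..b@example.com": accepts("a..b@example.com"),              # not a valid dot-atom
}
for address, ok in checks.items():
    print(address, ok)
```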

    The above pattern covers the RFC spec up to the addr-spec, which is pretty much what people are thinking about when they say email address.

    It's not too hard to take it up a few more levels in the spec using this syntax

    RFC 2822 mailbox : http://www.myregextester.com/?r=338

    but like I said, it likely won't perform well enough to be useful. The two patterns I've linked to are wrapped in anchors, so they just match against the whole string. Searching for a string within a larger body, without anchors, will probably degrade performance very fast. But if any of you PHP or Perl gurus want to stress test this beast, have fun. Maybe it's not as bad as I think it may be.


  • Update to CSS Minification

    This is a C# 2.0 enhancement of a C# port of YUI Compressor's CSS minification code.

    I got a little carried away with ideas for this; they were all regex-based, which really is what motivated me to work on it. However, after I thought I was done I learned that not everything worked. It did what I wanted it to do, but what I wanted wasn't the correct thing. I really should have just stopped with my original ideas.

    The last idea for my original changes was to take 2 or more individual subset properties and write them in the shorthand notation of the main property they were a subset of. Well, I got that working. But upon testing I learned something new about CSS: what I was doing could alter the behavior of the presentation. That was disappointing, because I put a lot of energy into getting the results I was after.

    So it looked as if all of that code was going to go to waste. But there was one scenario in which what I was trying to do was all right, so the code wasn't completely wasted: if all the subset properties are declared, then combining them is fine. I didn't bother changing the regexes I wrote for this, but I cleaned up some of the code. It would have worked as is, but some of the things being checked were now unnecessary.
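
    The "all or nothing" rule is easy to show on a small scale. The Python sketch below is not the C# code that follows; it is a stripped-down illustration for the four padding properties only, and the names collapse_padding and LONGHAND are invented for it. It rewrites a rule block to the shorthand only when all four sides are declared, and otherwise leaves the block alone, since a shorthand built from a partial set would reset the missing sides to their defaults.

```python
import re

SIDES = ("top", "right", "bottom", "left")
# One padding longhand declaration, e.g. "padding-top:1px;"
LONGHAND = re.compile(r"padding-(top|right|bottom|left)\s*:\s*([\w.%]+)\s*;?")

def collapse_padding(block):
    """Merge the four padding longhands of a rule block into one shorthand,
    but only when every side is declared."""
    found = dict(LONGHAND.findall(block))  # side -> value, last declaration wins
    if set(found) != set(SIDES):
        return block  # incomplete set: combining could change the presentation
    shorthand = "padding:" + " ".join(found[side] for side in SIDES) + ";"
    rest = LONGHAND.sub("", block)
    # Place the shorthand right after the opening brace.
    return rest.replace("{", "{" + shorthand, 1)

full = "{padding-top:1px;padding-right:2px;padding-bottom:3px;padding-left:4px;color:red;}"
partial = "{padding-top:1px;padding-left:4px;color:red;}"
print(collapse_padding(full))
print(collapse_padding(partial))
```

    The sketch ignores cases a real minifier has to think about, such as a padding shorthand already present in the same block, where moving declarations around could change which one wins.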

     

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Globalization;
    using System.Text;
    using System.Text.RegularExpressions;

    namespace CSSMinify
    {
        class CSSMinify
        {
            public static Hashtable shortColorNames = new Hashtable();
            public static Hashtable shortHexColors = new Hashtable();
            public static string Minify(string css)
            {
                return Minify(css, 0);
            }
            public static string Minify(string css, int columnWidth)
            {
                // BSD License http://developer.yahoo.net/yui/license.txt
                // New css tests and regexes by Michael Ash

                createHashTable();
                MatchEvaluator rgbDelegate = new MatchEvaluator(RGBMatchHandler);
                MatchEvaluator shortColorNameDelegate = new MatchEvaluator(ShortColorNameMatchHandler);
                MatchEvaluator shortColorHexDelegate = new MatchEvaluator(ShortColorHexMatchHandler);
                css = RemoveCommentBlocks(css);
                css = Regex.Replace(css, @"\s+", " "); //Normalize whitespace
                css = Regex.Replace(css, @"\x22\x5C\x22}\x5C\x22\x22", "___PSEUDOCLASSBMH___"); //hide Box model hack
                /* Remove the spaces before the things that should not have spaces before them.
                   But, be careful not to turn "p :link {...}" into "p:link{...}"
                */
                css = Regex.Replace(css, @"(?#no preceding space needed)\s+((?:[!{};>+()\],])|(?<={[^{}]*):(?=[^}]*}))", "$1");
                css = Regex.Replace(css, @"([!{}:;>+([,])\s+", "$1");  // Remove the spaces after the things that should not have spaces after them.
                css = Regex.Replace(css, @"([^;}])}", "$1;}");    // Add the semicolon where it's missing.
                css = Regex.Replace(css, @"(\d+)\.0+(p(?:[xct])|(?:[cem])m|%|in|ex)\b", "$1$2"); // Remove .0 from size units x.0em becomes xem
                css = Regex.Replace(css, @"([\s:])(0)(px|em|%|in|cm|mm|pc|pt|ex)\b", "$1$2"); // Remove unit from zero
                //New test
                //Font weights
                css = Regex.Replace(css, @"(?<=font-weight:)normal\b", "400");
                css = Regex.Replace(css, @"(?<=font-weight:)bold\b", "700");
                //Thought this was a good idea, but properties of a set that are not defined get element defaults, so this was resetting them. css = ShortHandProperty(css);
                css = ShortHandAllProperties(css);
                //css = Regex.Replace(css, @":(\s*0){2,4}\s*;", ":0;"); // if all parameters zero just use 1 parameter
                // if all 4 parameters the same unit make 1 parameter
                css = Regex.Replace(css, @"(?<!background-position\s*):\s*(inherit|auto|0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex))))(\s+\1){1,3};", ":$1;", RegexOptions.IgnoreCase);
                // if has 4 parameters and top unit = bottom unit and right unit = left unit make 2 parameters
                css = Regex.Replace(css, @":\s*((inherit|auto|0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex))))\s+(inherit|auto|0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex)))))\s+\2\s+\3;", ":$1;", RegexOptions.IgnoreCase);
                // if has 4 parameters and top unit != bottom unit and right unit = left unit make 3 parameters
                css = Regex.Replace(css, @":\s*((?:(?:inherit|auto|0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex))))\s+)?(inherit|auto|0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex))))\s+(?:0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex)))))\s+\2;", ":$1;", RegexOptions.IgnoreCase);
                //// if has 3 parameters and top unit = bottom unit make 2 parameters
                //css = Regex.Replace(css, @":\s*((0|(?:(?:\d?\.?\d(?:p(?:[xct])|(?:[cem])m|%|in|ex))))\s+(?:0|(?:(?:\d?\.?\d(?:p(?:[xct])|(?:[cem])m|%|in|ex)))))\s+\2;", ":$1;", RegexOptions.IgnoreCase);
                css = Regex.Replace(css, "background-position:0;", "background-position:0 0;");
                css = Regex.Replace(css, @"(:|\s)0+\.(\d+)", "$1.$2");
                //  Outline-style and border-style parameter reduction
                css = Regex.Replace(css, @"(outline|border)-style\s*:\s*(none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)(?:\s+\2){1,3};", "$1-style:$2;", RegexOptions.IgnoreCase);

                css = Regex.Replace(css, @"(outline|border)-style\s*:\s*((none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)\s+(none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset))(?:\s+\3)(?:\s+\4);", "$1-style:$2;", RegexOptions.IgnoreCase);

                css = Regex.Replace(css, @"(outline|border)-style\s*:\s*((?:(?:none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)\s+)?(none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)\s+(?:none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset))(?:\s+\3);", "$1-style:$2;", RegexOptions.IgnoreCase);

                css = Regex.Replace(css, @"(outline|border)-style\s*:\s*((none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)\s+(?:none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset))(?:\s+\3);", "$1-style:$2;", RegexOptions.IgnoreCase);

                //  Outline-color and Border-color parameter reduction
                css = Regex.Replace(css, @"(outline|border)-color\s*:\s*((?:\#(?:[0-9A-F]{3}){1,2})|\S+)(?:\s+\2){1,3};", "$1-color:$2;", RegexOptions.IgnoreCase);

                css = Regex.Replace(css, @"(outline|border)-color\s*:\s*(((?:\#(?:[0-9A-F]{3}){1,2})|\S+)\s+((?:\#(?:[0-9A-F]{3}){1,2})|\S+))(?:\s+\3)(?:\s+\4);", "$1-color:$2;", RegexOptions.IgnoreCase);

                css = Regex.Replace(css, @"(outline|border)-color\s*:\s*((?:(?:(?:\#(?:[0-9A-F]{3}){1,2})|\S+)\s+)?((?:\#(?:[0-9A-F]{3}){1,2})|\S+)\s+(?:(?:\#(?:[0-9A-F]{3}){1,2})|\S+))(?:\s+\3);", "$1-color:$2;", RegexOptions.IgnoreCase);

                // Shorten colors from rgb(51,102,153) to #336699
                // This makes it more likely that it'll get further compressed in the next step.
                css = Regex.Replace(css, @"rgb\s*\x28((?:25[0-5])|(?:2[0-4]\d)|(?:[01]?\d?\d))\s*,\s*((?:25[0-5])|(?:2[0-4]\d)|(?:[01]?\d?\d))\s*,\s*((?:25[0-5])|(?:2[0-4]\d)|(?:[01]?\d?\d))\s*\x29", rgbDelegate);
                css = Regex.Replace(css, @"(?<![\x22\x27=]\s*)\#(?:([0-9A-F])\1)(?:([0-9A-F])\2)(?:([0-9A-F])\3)", "#$1$2$3", RegexOptions.IgnoreCase);
                // Replace hex color code with named value if it is shorter
                css = Regex.Replace(css, @"(?<=color\s*:\s*.*)\#(?<hex>f00)\b", "red", RegexOptions.IgnoreCase);
                css = Regex.Replace(css, @"(?<=color\s*:\s*.*)\#(?<hex>[0-9a-f]{6})", shortColorNameDelegate, RegexOptions.IgnoreCase);
                css = Regex.Replace(css, @"(?<=color\s*:\s*)\b(Black|Fuchsia|LightSlateGr[ae]y|Magenta|White|Yellow)\b", shortColorHexDelegate, RegexOptions.IgnoreCase);

                // Remove empty rules.
                css = Regex.Replace(css, @"[^}]+{;}", "");
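                //Remove semicolon of last property
                css = Regex.Replace(css, ";(})", "$1");
                // Restore the box model hack that was hidden earlier
                css = css.Replace("___PSEUDOCLASSBMH___", "\"\\\"}\\\"\"");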
                if (columnWidth > 0)
                {
                    css = BreakLines(css, columnWidth);
                }
                return css;
            }
            private static string RemoveCommentBlocks(string input)
            {
                int startIndex = 0;
                int endIndex = 0;
                bool iemac = false;
                startIndex = input.IndexOf(@"/*", startIndex);
                while (startIndex >= 0)
                {
                    endIndex = input.IndexOf(@"*/", startIndex + 2);
                    if (endIndex >= startIndex + 2)
                    {
                        if (input[endIndex - 1] == '\\')
                        {
                            startIndex = endIndex + 2;
                            iemac = true;
                        }
                        else if (iemac)
                        {
                            startIndex = endIndex + 2;
                            iemac = false;
                        }
                        else
                        {
                            input = input.Remove(startIndex, endIndex + 2 - startIndex);
                        }
                    }
                    startIndex = input.IndexOf(@"/*", startIndex);
                }
                return input;
            }
            private static String RGBMatchHandler(Match m)
            {
                int val = 0;
                StringBuilder hexcolor = new StringBuilder("#");
                for (int index = 1; index <= 3; index += 1)
                {
                    val = Int32.Parse(m.Groups[index].Value);
                    hexcolor.Append(val.ToString("x2"));
                }
                return hexcolor.ToString();
            }
            private static string BreakLines(string css, int columnWidth)
            {
                int i = 0;
                int start = 0;
                StringBuilder sb = new StringBuilder(css);
                while (i < sb.Length)
                {
                    char c = sb[i++];
                    if (c == '}' && i - start > columnWidth)
                    {
                        sb.Insert(i, '\n');
                        start = i;
                    }
                }
                return sb.ToString();

            }
            private static string ReplaceNonEmpty(string inputText, string replacementText)
            {
                if (replacementText.Trim() != string.Empty)
                {
                    inputText = string.Format(" {0}", replacementText);
                }
                return inputText;
            }
            private static string ShortColorNameMatchHandler(Match m)
            {
                // This function replaces hex color values with named colors if the name is shorter than the hex code
                string returnValue = m.Value;
                string hex = m.Groups["hex"].Value.ToLower(); // table keys are stored in lowercase
                if (shortColorNames.ContainsKey(hex))
                {
                    returnValue = shortColorNames[hex].ToString();
                }
                return returnValue;
            }
            private static string ShortColorHexMatchHandler(Match m)
            {
                //This function replaces named values with their shorter hex equivalent
                return shortHexColors[m.Value.ToString().ToLower()].ToString();
            }
            private static void createHashTable()
            {
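                // Fill the tables only once; Hashtable.Add throws if a key is added twice.
                if (shortColorNames.Count > 0)
                {
                    return;
                }
                //Color names shorter than hex notation. Except for red.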
                shortColorNames.Add("F0FFFF".ToLower(), "Azure".ToLower());
                shortColorNames.Add("F5F5DC".ToLower(), "Beige".ToLower());
                shortColorNames.Add("FFE4C4".ToLower(), "Bisque".ToLower());
                shortColorNames.Add("A52A2A".ToLower(), "Brown".ToLower());
                shortColorNames.Add("FF7F50".ToLower(), "Coral".ToLower());
                shortColorNames.Add("FFD700".ToLower(), "Gold".ToLower());
                shortColorNames.Add("808080".ToLower(), "Grey".ToLower());
                shortColorNames.Add("008000".ToLower(), "Green".ToLower());
                shortColorNames.Add("4B0082".ToLower(), "Indigo".ToLower());
                shortColorNames.Add("FFFFF0".ToLower(), "Ivory".ToLower());
                shortColorNames.Add("F0E68C".ToLower(), "Khaki".ToLower());
                shortColorNames.Add("FAF0E6".ToLower(), "Linen".ToLower());
                shortColorNames.Add("800000".ToLower(), "Maroon".ToLower());
                shortColorNames.Add("000080".ToLower(), "Navy".ToLower());
                shortColorNames.Add("808000".ToLower(), "Olive".ToLower());
                shortColorNames.Add("FFA500".ToLower(), "Orange".ToLower());
                shortColorNames.Add("DA70D6".ToLower(), "Orchid".ToLower());
                shortColorNames.Add("CD853F".ToLower(), "Peru".ToLower());
                shortColorNames.Add("FFC0CB".ToLower(), "Pink".ToLower());
                shortColorNames.Add("DDA0DD".ToLower(), "Plum".ToLower());
                shortColorNames.Add("800080".ToLower(), "Purple".ToLower());
                shortColorNames.Add("FA8072".ToLower(), "Salmon".ToLower());
                shortColorNames.Add("A0522D".ToLower(), "Sienna".ToLower());
                shortColorNames.Add("C0C0C0".ToLower(), "Silver".ToLower());
                shortColorNames.Add("FFFAFA".ToLower(), "Snow".ToLower());
                shortColorNames.Add("D2B48C".ToLower(), "Tan".ToLower());
                shortColorNames.Add("008080".ToLower(), "Teal".ToLower());
                shortColorNames.Add("FF6347".ToLower(), "Tomato".ToLower());
                shortColorNames.Add("EE82EE".ToLower(), "Violet".ToLower());
                shortColorNames.Add("F5DEB3".ToLower(), "Wheat".ToLower());

                // Hex notation shorter than named value
                shortHexColors.Add("black", "#000");
                shortHexColors.Add("fuchsia", "#f0f");
                shortHexColors.Add("lightslategray", "#789"); // keys must be lowercase, the lookup lowercases the match
                shortHexColors.Add("lightslategrey", "#789");
                shortHexColors.Add("magenta", "#f0f");
                shortHexColors.Add("white", "#fff");
                shortHexColors.Add("yellow", "#ff0");
            }
            private static string ShortHandAllProperties(string css)
            {
                /*
                 * This function searches for blocks that specify all the individual properties of a property type
                 * and reduces them to a single property using shorthand notation
                 */
                Regex reCSSBlock = new Regex("{[^{}]*}");
                Regex reTRBL1 = new Regex(@"(?<fullProperty>(?:(?<property>padding)-(?<position>top|right|bottom|left)))\s*:\s*(?<unit>[\w.]+);?", RegexOptions.IgnoreCase);
                Regex reTRBL2 = new Regex(@"(?<fullProperty>(?:(?<property>margin)-(?<position>top|right|bottom|left)))\s*:\s*(?<unit>[\w.]+);?", RegexOptions.IgnoreCase);
                Regex reTRBL3 = new Regex(@"(?<fullProperty>(?<property>border)-(?<position>top|right|bottom|left)(?<property2>-(?:color)))\s*:\s*(?<unit>[#\w.]+);?", RegexOptions.IgnoreCase);
                Regex reTRBL4 = new Regex(@"(?<fullProperty>(?<property>border)-(?<position>top|right|bottom|left)(?<property2>-(?:style)))\s*:\s*(?<unit>none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset);?", RegexOptions.IgnoreCase);
                Regex reTRBL5 = new Regex(@"(?<fullProperty>(?<property>border)-(?<position>top|right|bottom|left)(?<property2>-(?:width)))\s*:\s*(?<unit>[\w.]+);?", RegexOptions.IgnoreCase);
                Regex reListStyle = new Regex(@"list-style-(?<style>type|image|position)\s*:\s*(?<unit>[^};]+);?", RegexOptions.IgnoreCase);
                Regex reFont = new Regex(@"font-(?:(?:(?<fontProperty>family\b)\s*:\s*(?<fontPropertyValue>(?:\b[a-zA-Z]+(-[a-zA-Z]+)?\b|\x22[^\x22]+\x22)(?:\s*,\s*(?:\b[a-zA-Z]+(-[a-zA-Z]+)?\b|\x22[^\x22]+\x22))*)\b)|
    (?:(?<fontProperty>style\b)\s*:\s*(?<fontPropertyValue>normal|italic|oblique|inherit))|
    (?:(?<fontProperty>variant\b)\s*:\s*(?<fontPropertyValue>normal|small-caps|inherit))|
    (?:(?<fontProperty>weight\b)\s*:\s*(?<fontPropertyValue>normal|bold|(?:bold|light)er|[1-9]00|inherit))|
    (?:(?<fontProperty>size\b)\s*:\s*(?<fontPropertyValue>(?:(?:xx?-)?(?:small|large))|medium|(?:\d*\.?\d+(?:%|(p(?:[xct])|(?:[cem])m|in|ex))\b)|inherit|\b0\b)))\s*;?", (RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace));
                Regex reBackGround = new Regex(@"background-(?:
    (?:(?<property>color)\s*:\s*(?<unit>transparent|inherit|(?:(?:\#(?:[0-9A-F]{3}){1,2})|\S+)))|
    (?:(?<property>image)\s*:\s*(?<unit>none|inherit|(?:url\s*\([^()]+\))))|
    (?:(?<property>repeat)\s*:\s*(?<unit>no-repeat|inherit|repeat(?:-[xy])?))|
    (?:(?<property>attachment)\s*:\s*(?<unit>scroll|inherit|fixed))|
    (?:(?<property>position)\s*:\s*(?<unit>((?<horizontal>left | center | right|(?:0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex)))))\s+(?<vertical>top | center | bottom |(?:0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex))))))|
        ((?<vertical>top | center | bottom )\s+(?<horizontal>left | center | right ))|
        ((?<horizontal>left | center | right )|(?<vertical>top | center | bottom ))))
    );?", (RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture));
                MatchCollection mcBlocks = reCSSBlock.Matches(css);
                foreach (Match mBlock in mcBlocks)
                {
                    string strBlock = mBlock.Value;
                    HasAllPositions(reTRBL1, ref strBlock);
                    HasAllPositions(reTRBL2, ref strBlock);
                    HasAllPositions(reTRBL3, ref strBlock);
                    HasAllPositions(reTRBL4, ref strBlock);
                    HasAllPositions(reTRBL5, ref strBlock);
                    HasAllListStyle(reListStyle, ref strBlock);
                    HasAllFontProperties(reFont, ref strBlock);
                    HasAllBackGroundProperties(reBackGround, ref strBlock);
                    css = css.Replace(mBlock.Value, strBlock);
                }
                return css;
            }
            private static void HasAllBackGroundProperties(Regex re, ref string CSSText)
            {
                {
                    MatchCollection mcProperySet = re.Matches(CSSText);
                    int z = 5;
                    if (mcProperySet.Count == z)
                    {

                        int y = 0;
                        for (int x = 0; x < z; x = x + 1)
                        {
                            switch (mcProperySet[x].Groups["property"].Value)
                            {
                                case "color":
                                    y = y + 1;
                                    break;
                                case "image":
                                    y = y + 2;
                                    break;
                                case "repeat":
                                    y = y + 4;
                                    break;
                                case "attachment":
                                    y = y + 8;
                                    break;
                                case "position":
                                    y = y + 16;
                                    break;
                            }
                        }
                        if (y == 31)
                        {
                            CSSText = ShortHandBackGroundReplaceV2(mcProperySet, re, CSSText);
                        }
                    }
                }
            }
            private static void HasAllFontProperties(Regex re, ref string CSSText)
            {
                {
                    MatchCollection mcProperySet = re.Matches(CSSText);
                    int z = 5;
                    if (mcProperySet.Count == z)
                    {

                        int y = 0;
                        for (int x = 0; x < z; x = x + 1)
                        {
                            switch (mcProperySet[x].Groups["fontProperty"].Value)
                            {
                                case "style":
                                    y = y + 1;
                                    break;
                                case "variant":
                                    y = y + 2;
                                    break;
                                case "weight":
                                    y = y + 4;
                                    break;
                                case "size":
                                    y = y + 8;
                                    break;
                                case "family":
                                    y = y + 16;
                                    break;
                            }
                        }
                        if (y == 31)
                        {
                            CSSText = ShortHandFontReplaceV2(mcProperySet, re, CSSText);
                        }
                    }
                }
            }
            private static void HasAllListStyle(Regex re, ref string CSSText)
            {
                {
                    int z = 3;
                    MatchCollection mcProperySet = re.Matches(CSSText);
                    if (mcProperySet.Count == z)
                    {

                        int y = 0;
                        for (int x = 0; x < z; x = x + 1)
                        {
                            switch (mcProperySet[x].Groups["style"].Value)
                            {
                                case "type":
                                    y = y + 1;
                                    break;
                                case "image":
                                    y = y + 2;
                                    break;
                                case "position":
                                    y = y + 4;
                                    break;

                            }
                        }
                        if (y == 7)
                        {
                            CSSText = ShortHandListReplaceV2(mcProperySet, re, CSSText);
                        }
                    }
                }
            }
            private static void HasAllPositions(Regex re, ref string CSSText)
            {
                {
                    MatchCollection mcProperySet = re.Matches(CSSText);
                    if (mcProperySet.Count == 4)
                    {

                        int y = 0;
                        for (int x = 0; x < 4; x = x + 1)
                        {
                            switch (mcProperySet[x].Groups["position"].Value)
                            {
                                case "top":
                                    y = y + 1;
                                    break;
                                case "right":
                                    y = y + 2;
                                    break;
                                case "bottom":
                                    y = y + 4;
                                    break;
                                case "left":
                                    y = y + 8;
                                    break;
                            }
                        }
                        if (y == 15)
                        {
                            CSSText = ShortHandReplaceV2(mcProperySet, re, CSSText);
                        }
                    }
                }
            }
            private static string ShortHandFontReplaceV2(MatchCollection mcProperySet, Regex re, string InputText)
            {
                /*
                 * This Function replaces the individual font properties with a single entry
                 * */
                string strFamily, strStyle, strVariant, strWeight, strSize;
                Regex reLineHeight = new Regex(@"line-height\s*:\s*((?:\d*\.?\d+(?:%|(p(?:[xct])|(?:[cem])m|in|ex)\b)?)|normal|inherit);?", RegexOptions.IgnoreCase);
                strFamily = string.Empty;
                strStyle = string.Empty;
                strVariant = string.Empty;
                strWeight = string.Empty;
                strSize = string.Empty;
                string strStyle_Variant_Weight = string.Empty;
                foreach (Match mProperty in mcProperySet)
                {
                    switch (mProperty.Groups["fontProperty"].Value)
                    {
                        case "family":
                            strFamily = string.Format(" {0}", mProperty.Groups["fontPropertyValue"].Value);
                            break;
                        case "size":
                            if (reLineHeight.IsMatch(InputText))
                            {
                                Match m = reLineHeight.Match(InputText);
                                if (m.Groups[1].Value != "normal")
                                {
                                    strSize = String.Format("/{0}", m.Groups[1].Value);
                                }
                                InputText = reLineHeight.Replace(InputText, string.Empty);
                            }
                            strSize = string.Format(" {0}{1}", mProperty.Groups["fontPropertyValue"].Value, strSize);
                            if (strSize == "medium")
                            {
                                strSize = string.Empty;
                            }
                            break;
                        case "style":
                        case "variant":
                        case "weight":
                            if (mProperty.Groups["fontPropertyValue"].Value != "normal")
                            {
                                strStyle_Variant_Weight += string.Format(" {0}", mProperty.Groups["fontPropertyValue"].Value);
                            } break;

                    }
                }

                string strShortcut;
                string strProperties = string.Format("{0}{1}{2}{3}{4};", strStyle_Variant_Weight, strVariant, strWeight, strSize, strFamily);
                strShortcut = string.Format("font:{0}", strProperties.Trim());
                string strNewBlock = re.Replace(InputText, "");
                strNewBlock = strNewBlock.Insert(1, strShortcut);
                return strNewBlock;
            }
            private static string ShortHandBackGroundReplaceV2(MatchCollection mcProperySet, Regex re, string InputText)
            {
                /*
                 * This Function replaces the individual background properties with a single entry
                 * */
                string strColor, strImage, strRepeat, strAttachment, strPosition;
                strColor = string.Empty;
                strImage = string.Empty;
                strRepeat = string.Empty;
                strAttachment = string.Empty;
                strPosition = string.Empty;
                foreach (Match mProperty in mcProperySet)
                {
                    switch (mProperty.Groups["property"].Value)
                    {
                        case "color":
                            if (mProperty.Groups["unit"].Value != "transparent")
                            {
                                strColor = string.Format(" {0}", mProperty.Groups["unit"].Value);
                            }
                            break;
                        case "image":
                            if (mProperty.Groups["unit"].Value != "none")
                            {
                                strImage = string.Format(" {0}", mProperty.Groups["unit"].Value);
                            }
                            break;
                        case "repeat":
                            if (mProperty.Groups["unit"].Value != "repeat")
                            {
                                strRepeat = string.Format(" {0}", mProperty.Groups["unit"].Value);
                            } break;
                        case "attachment":
                            if (mProperty.Groups["unit"].Value != "scroll")
                            {
                                strAttachment = string.Format(" {0}", mProperty.Groups["unit"].Value);
                            }
                            break;
                        case "position":
                            if (mProperty.Groups["unit"].Value != "0% 0%")
                            {
                                strPosition = string.Format(" {0}", mProperty.Groups["unit"].Value);
                            }
                            break;
                    }
                }

                string strShortcut;
                string strProperties = string.Format("{0}{1}{2}{3}{4};", strColor, strImage, strRepeat, strAttachment, strPosition);
                strShortcut = string.Format("background:{0}", strProperties.Trim());
                string strNewBlock = re.Replace(InputText, "");
                strNewBlock = strNewBlock.Insert(1, strShortcut);
                return strNewBlock;
            }
            private static string ShortHandReplaceV2(MatchCollection mcProperySet, Regex reTRBL1, string InputText)
            {
                // Replace method for regexes used in ShortHand property method for properties with top, right, bottom and left sub properties.
                string strTop, strRight, strBottom, strLeft;
                strTop = string.Empty;
                strRight = string.Empty;
                strBottom = string.Empty;
                strLeft = string.Empty;
                string strProperty;
                strProperty = string.Format("{0}{1}", mcProperySet[0].Groups["property"].Value, mcProperySet[0].Groups["property2"].Value);
                foreach (Match mProperty in mcProperySet)
                {
                    switch (mProperty.Groups["position"].Value)
                    {
                        case "top":
                            strTop = mProperty.Groups["unit"].Value;
                            break;
                        case "right":
                            strRight = mProperty.Groups["unit"].Value;
                            break;
                        case "bottom":
                            strBottom = mProperty.Groups["unit"].Value;
                            break;
                        case "left":
                            strLeft = mProperty.Groups["unit"].Value;
                            break;
                    }

                }

                string strShortcut = string.Format("{0}:{1} {2} {3} {4};", strProperty, strTop, strRight, strBottom, strLeft);
                string strNewBlock = reTRBL1.Replace(InputText, "");
                strNewBlock = strNewBlock.Insert(1, strShortcut);
                return strNewBlock;
            }
            private static string ShortHandListReplaceV2(MatchCollection mcProperySet, Regex re, string InputText)
            {
                /*
                 * This Function replaces the individual list properties with a single entry
                 * */
                string strType, strPosition, strImage;
                strType = string.Empty;
                strPosition = string.Empty;
                strImage = string.Empty;
                foreach (Match mProperty in mcProperySet)
                {
                    switch (mProperty.Groups["style"].Value)
                    {
                        case "type":
                            if (mProperty.Groups["unit"].Value != "disc")
                            {
                                strType = mProperty.Groups["unit"].Value;
                            }
                            break;
                        case "position":
                            if (mProperty.Groups["unit"].Value != "outside")
                            {
                                strPosition = string.Format(" {0}", mProperty.Groups["unit"].Value);
                            }
                            break;
                        case "image":
                            if (mProperty.Groups["unit"].Value != "none")
                            {
                                strImage = string.Format(" {0}", mProperty.Groups["unit"].Value);
                            }
                            break;
                    }

                }

                string strShortcut = string.Format("list-style:{0}{1}{2};", strType, strPosition, strImage);
                string strNewBlock = re.Replace(InputText, "");
                strNewBlock = strNewBlock.Insert(1, strShortcut);
                return strNewBlock;
            }
        }
    }
     

  • Follow up to Additional CSS minifying regex patterns

    OK, the regexes were discussed in the previous post; this is mostly just their application.

    This is a C# 2.0 enhancement of a C# port of YUI Compressor's CSS minification code.

    Since I was doing this in C#, I took full advantage of its regex engine, namely by using lookbehinds and delegates for some replaces.
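
    To illustrate the lookbehind part, here is a minimal sketch built around the colon rule used in Minify below. The class and method names are my own for illustration, and the pattern is trimmed down to just the colon case. It relies on .NET allowing a variable-length lookbehind, so the space before a colon is removed only when an opening brace precedes it with no closing brace in between.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookbehindSketch
{
    // Hypothetical helper: remove whitespace before ':' only inside a { ... } block.
    public static string TightenColons(string css)
    {
        // (?<=\{[^{}]*) looks back for an unclosed '{'; (?=[^}]*\}) looks ahead for its '}'.
        return Regex.Replace(css, @"\s+(?<=\{[^{}]*):(?=[^}]*\})", ":");
    }

    static void Main()
    {
        // The selector "p :link" is left alone; "color : red" becomes "color: red".
        Console.WriteLine(TightenColons("p :link { color : red }"));
    }
}
```

    The selector keeps its space because no opening brace precedes it, which is the "p :link" case the comment in the code warns about.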

    Almost all the regexes after the "New Test" comment are the new or modified regexes from the ported version. There are also one new and two modified expressions before that comment. One of those modifications is just a change in writing style; the other replaces some code, but (hopefully) not functionality, with a regex replace. The new regex replacements are, of course, the new compression enhancements.

    There are also a couple of new regexes not mentioned in the previous post that match and replace some of the color values with an equivalent but more concisely written value. The replace for the color "red" is a straight replace, but the other colors require some code evaluation and use delegates.
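
    As a rough sketch of the delegate approach, the following converts rgb() values to hex with a MatchEvaluator. The handler here is a hypothetical stand-in, not the RGBMatchHandler from the listing, and the pattern is simplified: it assumes each component is already a valid 0-255 value and does no range checking.

```csharp
using System;
using System.Text.RegularExpressions;

static class RgbSketch
{
    // Hypothetical handler: builds the replacement text from the three captured numbers.
    public static string RgbToHex(Match m)
    {
        int r = int.Parse(m.Groups[1].Value);
        int g = int.Parse(m.Groups[2].Value);
        int b = int.Parse(m.Groups[3].Value);
        return string.Format("#{0:x2}{1:x2}{2:x2}", r, g, b);
    }

    public static string Convert(string css)
    {
        // Simplified pattern (assumption): one to three digits per component.
        return Regex.Replace(css,
            @"rgb\s*\(\s*(\d{1,3})\s*,\s*(\d{1,3})\s*,\s*(\d{1,3})\s*\)",
            new MatchEvaluator(RgbToHex));
    }

    static void Main()
    {
        Console.WriteLine(Convert("a{color:rgb(51,102,153)}")); // a{color:#336699}
    }
}
```

    The delegate is called once per match, so anything that can't be written as a plain replacement string can be computed in code.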

    I've done some very limited testing, but as I mentioned in the previous post, most of the CSS I've written doesn't have some of the new things I was searching for. I could add them for a test (which I did), but that won't catch any problems they may cause in the actual CSS application, since I wasn't really using the test values. So the source code is now available for beta testing. Test early and often before committing to use it. I'm willing to fix any minor bugs for things I may have overlooked, but if a particular replace is problematic it's easy enough to comment out the offender and use the rest.

    As was mentioned in the comments of the previous post, any generated content that looks like CSS may get stepped on, so be aware of that.
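
    A small sketch of how that can happen, using the zero-unit pattern from Minify on its own (the class and method names are made up for the example). The regex has no idea it is inside a quoted string, so the text of a content property is shortened along with the real declaration.

```csharp
using System;
using System.Text.RegularExpressions;

static class GeneratedContentSketch
{
    // Same zero-unit pattern as in Minify: drops the unit after a bare zero.
    public static string StripZeroUnits(string css)
    {
        return Regex.Replace(css, @"([\s:])(0)(px|em|%|in|cm|mm|pc|pt|ex)\b", "$1$2");
    }

    static void Main()
    {
        // The declaration is shortened as intended, but so is the quoted string.
        Console.WriteLine(StripZeroUnits("p{margin:0px}p:after{content:\"margin: 0px\"}"));
    }
}
```

    Here the generated text changes from "margin: 0px" to "margin: 0", which is harmless for a real declaration but alters what the page displays.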

    Also, all licenses for previous versions still apply.

    UPDATE 2008-04-27

    After a little more testing I discovered that one of the replaces I was doing can alter how the CSS is processed, so I have just crossed out the functions and function call.
    I've come up with a safer, though less likely to occur, replacement.

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Globalization;
    using System.Text;
    using System.Text.RegularExpressions;
     namespace CSSMinify
    {
     class CSSMinify
     {
       public static Hashtable shortColorNames = new Hashtable();
       public static Hashtable shortHexColors = new Hashtable();
       public static string Minify(string css)
       {
         return Minify(css, 0);
       }
       public static string Minify(string css, int columnWidth)
       {
       // BSD License http://developer.yahoo.net/yui/license.txt
       // New css tests and regexes by Michael Ash
         createHashTable();
         MatchEvaluator rgbDelegate = new MatchEvaluator(RGBMatchHandler);
         MatchEvaluator shortColorNameDelegate = new MatchEvaluator(ShortColorNameMatchHandler);
         MatchEvaluator shortColorHexDelegate = new MatchEvaluator(ShortColorHexMatchHandler);
         css = RemoveCommentBlocks(css);
         css = Regex.Replace(css, @"\s+", " "); //Normalize whitespace
         css = Regex.Replace(css, @"\x22\x5C\x22}\x5C\x22\x22", "___PSEUDOCLASSBMH___"); //hide Box model hack
         /* Remove the spaces before the things that should not have spaces before them.
            But, be careful not to turn "p :link {...}" into "p:link{...}"
         */
         css = Regex.Replace(css, @"(?#no preceding space needed)\s+((?:[!{};>+()\],])|(?<={[^{}]*):(?=[^}]*}))", "$1");
         css = Regex.Replace(css, @"([!{}:;>+([,])\s+", "$1"); // Remove the spaces after the things that should not have spaces after them.
         css = Regex.Replace(css, @"([^;}])}", "$1;}"); // Add the semicolon where it's missing.
         css = Regex.Replace(css, @"(\d+)\.0+(p(?:[xct])|(?:[cem])m|%|in|ex)\b", "$1$2"); // Remove .0 from size units x.0em becomes xem
         css = Regex.Replace(css, @"([\s:])(0)(px|em|%|in|cm|mm|pc|pt|ex)\b", "$1$2"); // Remove unit from zero
         //New test
         css = ShortHandProperty(css);
         //css = Regex.Replace(css, @":(\s*0){2,4}\s*;", ":0;"); // if all parameters zero just use 1 parameter
         // if all 4 parameters the same unit make 1 parameter
         css = Regex.Replace(css, @":\s*(0|(?:(?:\d*\.?\d+(?:p(?:[xct])|(?:[cem])m|%|in|ex))))(\s+\1){1,3};", ":$1;", RegexOptions.IgnoreCase);
         // if has 4 parameters and top unit = bottom unit and right unit = left unit make 2 parameters
         css = Regex.Replace(css, @":\s*((0|(?:(?:\d?\.?\d(?:p(?:[xct])|(?:[cem])m|%|in|ex))))\s+(0|(?:(?:\d?\.?\d(?:p(?:[xct])|(?:[cem])m|%|in|ex)))))\s+\2\s+\3;", ":$1;", RegexOptions.IgnoreCase);
         // if has 4 parameters and top unit != bottom unit and right unit = left unit make 3 parameters
         css = Regex.Replace(css, @":\s*((?:(?:0|(?:(?:\d?\.?\d(?:p(?:[xct])|(?:[cem])m|%|in|ex))))\s+)?(0|(?:(?:\d?\.?\d(?:p(?:[xct])|(?:[cem])m|%|in|ex))))\s+(?:0|(?:(?:\d?\.?\d(?:p(?:[xct])|(?:[cem])m|%|in|ex)))))\s+\2;", ":$1;", RegexOptions.IgnoreCase);
         //// if has 3 parameters and top unit = bottom unit make 2 parameters
         //css = Regex.Replace(css, @":\s*((0|(?:(?:\d?\.?\d(?:p(?:[xct])|    (?:[cem])m|%|in|ex))))\s+(?:0|(?:(?:\d?\.?\d(?:p(?:[xct])|(?:[cem])m|%|in|ex)))))\s+\2;", ":$1;", RegexOptions.IgnoreCase);
         css = Regex.Replace(css,"background-position:0;", "background-position:0 0;");
         css = Regex.Replace(css,@"(:|\s)0+\.(\d+)", "$1.$2");
       // Outline-style and border-style parameter reduction
         css = Regex.Replace(css, @"(outline|border)-style\s*:\s*(none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)(?:\s+\2){1,3};", "$1-style:$2;", RegexOptions.IgnoreCase);
         css = Regex.Replace(css, @"(outline|border)-style\s*:\s*((none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)\s+(none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset))(?:\s+\3)(?:\s+\4);", "$1-style:$2;", RegexOptions.IgnoreCase);
         css = Regex.Replace(css, @"(outline|border)-style\s*:\s*((?:(?:none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)\s+)?(none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)\s+(?:none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset))(?:\s+\3);", "$1-style:$2;", RegexOptions.IgnoreCase);
         css = Regex.Replace(css, @"(outline|border)-style\s*:\s*((none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset)\s+(?:none|hidden|d(?:otted|ashed|ouble)|solid|groove|ridge|inset|outset))(?:\s+\3);", "$1-style:$2;", RegexOptions.IgnoreCase);
         // Outline-color and Border-color parameter reduction
         css = Regex.Replace(css, @"(outline|border)-color\s*:\s*((?:\#(?:[0-9A-F]{3}){1,2})|\S+)(?:\s+\2){1,3};", "$1-color:$2;", RegexOptions.IgnoreCase);
         css = Regex.Replace(css, @"(outline|border)-color\s*:\s*(((?:\#(?:[0-9A-F]{3}){1,2})|\S+)\s+((?:\#(?:[0-9A-F]{3}){1,2})|\S+))(?:\s+\3)(?:\s+\4);", "$1-color:$2;", RegexOptions.IgnoreCase);
         css = Regex.Replace(css, @"(outline|border)-color\s*:\s*((?:(?:(?:\#(?:[0-9A-F]{3}){1,2})|\S+)\s+)?((?:\#(?:[0-9A-F]{3}){1,2})|\S+)\s+(?:(?:\#(?:[0-9A-F]{3}){1,2})|\S+))(?:\s+\3);", "$1-color:$2;", RegexOptions.IgnoreCase);
     // Shorten colors from rgb(51,102,153) to #336699
     // This makes it more likely that it'll get further compressed in the next step.
         css = Regex.Replace(css,@"rgb\s*\x28((?:25[0-5])|(?:2[0-4]\d)|(?:[01]?\d?\d))\s*,\s*((?:25[0-5])|(?:2[0-4]\d)|(?:[01]?\d?\d))\s*,\s*((?:25[0-5])|(?:2[0-4]\d)|(?:[01]?\d?\d))\s*\x29", rgbDelegate);
         css = Regex.Replace(css, @"(?<![\x22\x27=]\s*)\#(?:([0-9A-F])\1)(?:([0-9A-F])\2)(?:([0-9A-F])\3)", "#$1$2$3", RegexOptions.IgnoreCase);
     // Replace hex color code with the named value if it is shorter
         css = Regex.Replace(css, @"(?<=color\s*:\s*.*)\#(?<hex>f00)\b", "red",RegexOptions.IgnoreCase);
         css = Regex.Replace(css, @"(?<=color\s*:\s*.*)\#(?<hex>[0-9a-f]{6})", shortColorNameDelegate, RegexOptions.IgnoreCase);
         css = Regex.Replace(css, @"(?<=color\s*:\s*)\b(Black|Fuchsia|LightSlateGr[ae]y|Magenta|White|Yellow)\b", shortColorHexDelegate,RegexOptions.IgnoreCase);
         // Remove empty rules.
         css = Regex.Replace(css,@"[^}]+{;}", "");
         //Remove semicolon of last property
         css = Regex.Replace(css, ";(})", "$1");
         if (columnWidth > 0)
         {
           css = BreakLines(css, columnWidth);
         }
         return css;
     }
     private static string RemoveCommentBlocks(string input)
     {
       int startIndex = 0;
       int endIndex = 0;
       bool iemac = false;
       startIndex = input.IndexOf(@"/*", startIndex);
       while (startIndex >= 0)
       {
         endIndex = input.IndexOf(@"*/", startIndex + 2);
         if (endIndex >= startIndex + 2)
         {
           if (input[endIndex - 1] == '\\')
           {
             startIndex = endIndex + 2;
             iemac = true;
           }
       &n