
Stripping Logging Statements in JavaScript

Last post 08-15-2008, 3:55 PM by seanoshea. 10 replies.
  •  08-14-2008, 5:11 PM 45359

    Stripping Logging Statements in JavaScript

    Hi there,

    I'm using Perl regular expressions to try to find logging statements in my compressed JavaScript code and strip them out of my code. Here's what my regular expression looks like at the moment:

             foreach my $line (@file_lines) {
                # strip the line of console.log statements
                $line =~ s/console\.[warn|log|error|con|info]+\((.*?)\)+;+/;/gi;
                print DATAOUT "$line";
            }

    Basically, I want to strip all console.log|warn|etc statements from my code and just replace them with a semi-colon. Here's a sample of the text I need to strip:

    console.warn("_onValidateFailure: ",_4);class.common.hide(this.loadingNode);class.common.enable(this.submitButton);this.createInFlight=false;topic.publish("/class/error",[{alignNode:this.,errorType:"_validate",title:this._prompts.unableVal,detail:_4.message}]);},_createAccount:function(){topic.publish("/class/error/close",["all"]);if(this._executeValidations()){app.apiService.required["Login"]=this.email1.value;var _5=app..getInsertedData();console.log("*** CREATING ACCOUNT WITH "+this.email1.value+" "+this.password1.value+" "+this.firstName.value+" "+this.lastName.value+" "+_5.id+" "+_5.RedKey);var _6=app.apiService.createAccount({Login:this.email1.value,Password:this.password1.value,FirstName:this.firstName.value,LastName:this.lastName.value,:_5.id,RegKey:_5.regKey,Add:1});_6.addCallback(this,"_onCreateAccount");_6.addErrback(this,"_onCreateAccountFailure");}},_onCreateAccount:function(_7){class.common.hide(this.loadingNode);this.createInFlight=false;topic.publish("/class/error/close",["all"]);topic.publish("/class/createAccount",[_7]);},_onCreateAccountFailure:function(_8){console.warn("_onCreateAccountFailure: ",_8);

    This regular expression has been working for me save for two situations.

    1. The following is valid JavaScript:

    console.log("This is a ) logging statement;")

    Note the lack of a semi-colon at the end of this statement, and the fact that a right parenthesis is valid inside a string argument to the console.log statement. I've tried altering my regular expression to look like this:

             foreach my $line (@file_lines) {
                # strip the line of console.log statements
                $line =~ s/console\.[warn|log|error|con|info]+\((.*?)\)+;?/;/gi;
                print DATAOUT "$line";
            }

    but that doesn't give a usable result, as it leaves 'logging statement;")' right in the middle of my output.

    2. The following is also valid JavaScript:

    (1 == 0)?console.log("What?????"):console.log("OK");

    However, I can't really strip these console statements, as a ternary conditional statement in JavaScript has to have more substance than just (1 == 0) ? ; : ;. So, for the most part, I just want to ignore console.log statements which are trapped in ternary conditional blocks. I've tried altering my regular expression to look like this:

             foreach my $line (@file_lines) {
                # strip the line of console.log statements
                $line =~ s/[^\?:]console\.[warn|log|error|con|info]+\((.*?)\)+;?/;/gi;
                print DATAOUT "$line";
            }

    I'm attempting to ignore any console.log statements which are prefixed with '?' or ':', but I can't seem to get that syntax correct either.

    Any help is greatly appreciated

    Sean

     

     

  •  08-14-2008, 7:17 PM 45360 in reply to 45359

    Re: Stripping Logging Statements in JavaScript

    In your sample text please show exactly what should be removed (strikethrough it or underline it etc.).
  •  08-14-2008, 7:29 PM 45362 in reply to 45360

    Re: Stripping Logging Statements in JavaScript

    Hi there ddrudik,

    Thanks for your quick reply. To clarify - my input text looks like this:

     console.warn("_onValidateFailure: ",_4);class.common.hide(this.loadingNode);class.common.enable(this.submitButton);this.createInFlight=false;topic.publish("/class/error",[{alignNode:this.,errorType:"_validate",title:this._prompts.unableVal,detail:_4.message}]);},_createAccount:function(){topic.publish("/class/error/close",["all"]);if(this._executeValidations()){app.apiService.required["Login"]=this.email1.value;var _5=app..getInsertedData();console.log("*** CREATING ACCOUNT WITH "+this.email1.value+" "+this.password1.value+" "+this.firstName.value+" "+this.lastName.value+" "+_5.id+" "+_5.RedKey);var _6=app.apiService.createAccount({Login:this.email1.value,Password:this.password1.value,FirstName:this.firstName.value,LastName:this.lastName.value,:_5.id,RegKey:_5.regKey,Add:1});_6.addCallback(this,"_onCreateAccount");_6.addErrback(this,"_onCreateAccountFailure");}},_onCreateAccount:function(_7){class.common.hide(this.loadingNode);this.createInFlight=false;topic.publish("/class/error/close",["all"]);topic.publish("/class/createAccount",[_7]);},_onCreateAccountFailure:function(_8){console.warn("_onCreateAccountFailure: ",_8);console.log("Here's a logging statement) without a semicolo;n at the end")1==0?console.warn("this should be left alone"):console.log("as should this");

    I'd like my output text to look like this:

     console.warn("_onValidateFailure: ",_4);class.common.hide(this.loadingNode);class.common.enable(this.submitButton);this.createInFlight=false;topic.publish("/class/error",[{alignNode:this.,errorType:"_validate",title:this._prompts.unableVal,detail:_4.message}]);},_createAccount:function(){topic.publish("/class/error/close",["all"]);if(this._executeValidations()){app.apiService.required["Login"]=this.email1.value;var _5=app..getInsertedData();console.log("*** CREATING ACCOUNT WITH "+this.email1.value+" "+this.password1.value+" "+this.firstName.value+" "+this.lastName.value+" "+_5.id+" "+_5.RedKey);var _6=app.apiService.createAccount({Login:this.email1.value,Password:this.password1.value,FirstName:this.firstName.value,LastName:this.lastName.value,:_5.id,RegKey:_5.regKey,Add:1});_6.addCallback(this,"_onCreateAccount");_6.addErrback(this,"_onCreateAccountFailure");}},_onCreateAccount:function(_7){class.common.hide(this.loadingNode);this.createInFlight=false;topic.publish("/class/error/close",["all"]);topic.publish("/class/createAccount",[_7]);},_onCreateAccountFailure:function(_8){console.warn("_onCreateAccountFailure: ",_8);console.log("Here's a logging statement) without a semicolo;n at the end")1==0?console.warn("this should be left alone"):console.log("as should this");

    Anything you see in the output above which is highlighted with a strikethrough should be replaced by a semi-colon. Note the two console.log statements at the end of the input and output. They illustrate the two questions I outlined in my first post. In both cases, I don't want to replace these console.log statements with semi-colons.

    Thanks again for your help

    Sean 

     

  •  08-14-2008, 7:52 PM 45363 in reply to 45362

    Re: Stripping Logging Statements in JavaScript

    As your pattern try:

    (?<![?:])console\.(warn|log|error|con|info)\(("[^"]*"|[^)])*\);

    with the same options as you are using.

    Note that you have been using a character class wrongly in your test patterns: "[warn|log|error|con|info]" is a character class that will match any of the characters specified within it, in this case including the '|' character. When you added the '+' quantifier, it will match what you are after but it will also match text such as "rog|||iw". I have replaced the square brackets with parentheses which are the construct you are looking for.
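
    As a rough illustration of the difference described above, here is a small sketch using Python's `re` module (an assumption made for the example; the thread itself uses Perl, but character classes and grouped alternation behave the same way in both). The sample strings are made up for the demonstration.

```python
import re

# Bracketed form from the original post: a character class, so any run of
# the listed characters (including '|') is accepted after "console.".
char_class = re.compile(r"console\.[warn|log|error|con|info]+\(")

# Grouped form from the reply: real alternation between whole method names.
alternation = re.compile(r"console\.(warn|log|error|con|info)\(")

samples = ["console.log(", "console.warn(", "console.rog|||iw("]
for text in samples:
    # Prints the sample, then whether each pattern matches it.
    print(text, bool(char_class.match(text)), bool(alternation.match(text)))
```

    The last sample is accepted by the character-class version but rejected by the alternation version.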

    I'm assuming that any occurrence of "console" that is preceded by either a colon or a question mark is to be excluded. I hope this is sufficient to detect the ternary conditional case.

    Susan

  •  08-14-2008, 8:52 PM 45366 in reply to 45363

    Re: Stripping Logging Statements in JavaScript

    Hi Susan,

    Thanks for your reply. I tried what you suggested, but it seems to alter my JavaScript in ways that cause syntax errors.

    Here's what I tried passing through my Perl code:

    $line =~ s/(?<![?:])console\.(warn|log|error|con|info)\(("[^"]*"|[^)])*\);/;/gi;

    Just stepping through each piece of the regular expression:

    (?<![?:]) - is this basically saying not to match any console.log statements which start with ':' or '?'? What are the ?<! characters doing here? Is that a different way of representing the NOT operator? I thought ^ was the way to represent NOT. I tried ([^?:]) at the start of my expression, but that doesn't seem to work either. Also, is the ? at the start of this section of the regular expression stating that this section is optional?

    console\.(warn|log|error|con|info)\( - match anything starting with console.log(, console.warn( etc. I didn't realize I was misusing the [] syntax in this section of my regular expression. Thanks for pointing that out.

    ("[^"]*"|[^)])*\); - Any sequence of characters which starts with a " and doesn't have a " or ) in the middle, and ends with a semi-colon? I have to admit, I'm a little lost with this section of the regular expression. Can you explain this for me?

    Thanks again

    Sean

     

  •  08-14-2008, 11:11 PM 45368 in reply to 45366

    Re: Stripping Logging Statements in JavaScript

    I get the matches below. To exclude the 4th match at the end, I required that the console statement not be preceded by a colon, although you would have to tell me if that's a proper assumption based on your source:

    Raw Match Pattern:
    (?<!:)console\.(?:warn|log|error|con|info)(\((?:(?>[^()]+)|(?1))*\));

    $matches Array:
    (
        [0] => Array
            (
                [0] => console.warn("_onValidateFailure: ",_4);
                [1] => console.log("*** CREATING ACCOUNT WITH "+this.email1.value+" "+this.password1.value+" "+this.firstName.value+" "+this.lastName.value+" "+_5.id+" "+_5.RedKey);
                [2] => console.warn("_onCreateAccountFailure: ",_8);
            )

        [1] => Array
            (
                [0] => ("_onValidateFailure: ",_4)
                [1] => ("*** CREATING ACCOUNT WITH "+this.email1.value+" "+this.password1.value+" "+this.firstName.value+" "+this.lastName.value+" "+_5.id+" "+_5.RedKey)
                [2] => ("_onCreateAccountFailure: ",_8)
            )

    )


  •  08-15-2008, 12:37 AM 45369 in reply to 45368

    Re: Stripping Logging Statements in JavaScript

    Hi there ddrudik,

    Thanks for taking the time to look at my issue. I tried plugging your suggestion into my build scripts, but I got this error when I tried to run it:

    sean-o-sheas-macbook-pro:dojo seanoshea$ perl build.pl
    Sequence (?1...) not recognized in regex; marked by <-- HERE in m/(?<!:)console\.(?:warn|log|error|con|info)(\((?:(?>[^()]+)|(?1 <-- HERE ))*\));/ at build.pl line 83.

    I'm guessing changing it to this is OK:

    (?<!:)console\.(?:warn|log|error|con|info)(\((?:(?>[^()]+)|(?{1}))*\));

    I ran the regular expression against my code and everything seems to run well. With respect to your question on the colon, I think you've got it spot on with your regular expression. Basically, the regular expression should not match against code like this:

    1==0?console.warn("this should be left alone"):console.log("as should this");

    That is, any console.* statements preceded by a '?' or a ':' should not be matched against.

     

    One thing I noticed was that console.* statements with ':' in them are not matched against. For example:

    console.warn(" -> setup.tester: step="+this.step);

    is not stripped out of the output, when I want it to be. I thought I'd need something like this:

    (?<!:)console\.(?:warn|log|error|con|info)(\((*[^()]+)|(?{1}))*\));

    What I'm trying (and probably failing) to express here in the regular expression is that anything goes as a parameter to the console.* statements. i.e.

    console\.(?:warn|log|error|con|info)(\("*")\);

    For example, console.* statements can start and end like this:

    console\.(?:warn|log|error|con|info)(\("All characters can be used as an input to a console statement")\); OR

    console\.(?:warn|log|error|con|info)(\('All characters can be used as an input to a console statement')\);

    Sean

  •  08-15-2008, 1:34 AM 45372 in reply to 45366

    Re: Stripping Logging Statements in JavaScript

    Sean,

    I'll take the parts one at a time.

    (?<![?:]) - You are correct that the '[?:]' part will match either a question mark or a colon.

    The '(?<!' part is a negative lookbehind. This means that the regex engine will begin at its current point in the source text and look backwards, trying to match the characters. In this case it is looking to see if the character just before its current position is a '?' or a ':'. If it IS one of these, then it will return a 'fail' because it is a negative lookbehind. If the previous character is NOT one of these, then it will return a 'success'. The trick with lookaheads and lookbehinds is that they take part in a match but they don't move the 'current point' in the text - they only look.

    With Perl, lookbehinds are supported but they must be of fixed length - in other words you can't have quantifiers like '+' and '*' in them. In this case the lookbehind is always a single character long and so this is acceptable.

    Therefore the total effect of this part of the pattern is to check that the preceding character is not a question mark or a colon.
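
    To make the lookbehind behaviour concrete, here is a minimal sketch in Python's `re` module (chosen only for illustration; the `(?<!...)` syntax is the same as in Perl). The sample text is invented for the example.

```python
import re

# Match "console.log" only when the character just before it
# is neither '?' nor ':'.
lookbehind = re.compile(r"(?<![?:])console\.log")

text = 'x=1;console.log("a");y?console.log("b"):console.log("c");'

# Only the first call qualifies; the other two follow '?' and ':'.
starts = [m.start() for m in lookbehind.finditer(text)]
print(starts)  # [4]

# The lookbehind only looks: the ';' before the match is not part of it.
first = lookbehind.search(text)
print(first.group())  # console.log

# At the very start of the text there is no previous character,
# so the negative lookbehind succeeds.
print(bool(lookbehind.match('console.log("z");')))  # True
```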

    As for the '^' representing 'not', this is true ONLY in a character class where it says that the character class represents all characters except those that are explicitly specified. Outside a character class, '^' means the start of a line or the whole text, depending on the setting of the 'multiline' option.

    Therefore we can put the above together and look at your replacement sub-pattern of '([^?:])'. What this will do is create a match group (the outer parentheses) that contains a character class that will match and capture a single character that is not a question mark or a colon. This will work (except when the following text is actually at the start of the text/line, where there is no character to match) but it will include the non-colon/question-mark character in your overall match. When you come to do the replacement, that character will be removed along with everything else, which is NOT what you want.
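
    The difference between the consuming character class and the lookbehind can be seen in a short sketch, again using Python's `re` module as a stand-in for Perl. The patterns are simplified to a single double-quoted argument, and the input strings are made up for the example.

```python
import re

# Consuming version: the character before "console" becomes part of the match.
consuming = re.compile(r'[^?:]console\.log\("[^"]*"\);')
# Lookbehind version: the previous character is inspected but not matched.
lookbehind = re.compile(r'(?<![?:])console\.log\("[^"]*"\);')

text = 'f(){console.log("hi");}'

# The '{' is swallowed by [^?:] and replaced along with the statement.
print(consuming.sub(";", text))   # f();}
# The '{' survives because the lookbehind does not consume it.
print(lookbehind.sub(";", text))  # f(){;}

# At the start of the text, [^?:] has nothing to match, so nothing is replaced.
at_start = 'console.log("hi");'
print(consuming.sub(";", at_start))   # console.log("hi");
print(lookbehind.sub(";", at_start))  # ;
```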

    I think you have got the next bit (the 'console\........' part) sorted OK.

    \(("[^"]*"|[^)])*\)

    We can start by stripping off the '\(' and '\)' from the beginning and the end - these simply match literal open and closing parentheses.

    What's left is a match group (the un-escaped parentheses) that contains two alternate sub-patterns.

    "[^"]*" - match a double-quote and then match as many characters as possible as long as they are not a double-quote and then match a double-quote. This is a fairly standard way of handling a quoted string and skipping over everything between the double-quotes - especially characters (such as a close parenthesis) that could cause problems in the circumstances.

    [^)] - match any character that is not a close parenthesis.

    In this situation you are looking for an open parenthesis (already dealt with) and then skipping everything until you get to the next close parenthesis, but ignoring any close parenthesis that may occur within a double-quoted string. As the regex engine starts this part of the pattern, it is positioned just after the open parenthesis. Let's say we find a double-quote: the first alternate will match and so we skip to the next double-quote that ends the string - treating everything in between as a single 'item'. We have then satisfied this alternate path and so we have satisfied the whole match group. Next we get to the '*' quantifier and so we start over again.

    Let's say the next character is neither a double-quote nor the close parenthesis (I think this occurred in part of your example text). The regex engine will be positioned just after the last double-quote found above and just before this 'other' character. The alternate patterns are tried in the order they are specified and so the engine will check to see if it is a double-quote - which it isn't, and so it will try the second alternative. The 2nd alternative is a character class that will match anything that is not a close parenthesis. In this case we will get a match and so the match group as a whole is again satisfied.

    While we are here, let's imagine that the alternate patterns are swapped around and see what would happen if the next character was a double-quote. The first alternative is now the check for anything that is not a close parenthesis and so the double-quote will match and the whole match group will succeed. However we have just matched the start of a quoted string and so we have missed the chance to detect it. This demonstrates the general rule that alternates should be specified in order from the most specific to the most general.

    OK, let's now imagine that we have matched everything up to and including the character before the close parenthesis and we are at the end of the match group. We are still working under the influence of the '*' quantifier and so we start again. Now the next source character will be the close parenthesis. We try to match it with the first of the alternatives (the double-quote) and this will not match, so we skip to the second alternative. This won't match either as it IS a close parenthesis and so the match group as a whole will fail.

    However, the '*' will match 0 or more instances of the preceding item (our match group) and so everything is still OK. If we have matched something, then we have a match of more than 0 and so the overall effect is 'success'. If the source text was "()" then we will have matched 0 times, but that is still OK and so we move on.

    We then match with the literal close parenthesis at the end of this part of the pattern and so all is well.
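
    The walkthrough above can be checked with a short sketch in Python's `re` module (used here only as an illustration; the sub-pattern syntax is identical in Perl). `quoted_first` is the sub-pattern as written, and `general_first` is the hypothetical swapped-around variant discussed above.

```python
import re

# Alternates in the order given: quoted string first, then "anything but ')'".
quoted_first = re.compile(r'\(("[^"]*"|[^)])*\)')
# Swapped order: the general alternate grabs the opening double-quote first,
# so the quoted-string alternate never gets a chance to skip the string.
general_first = re.compile(r'\(([^)]|"[^"]*")*\)')

text = 'console.log("This is a ) logging statement;")'

# The ')' inside the string is skipped as part of the quoted 'item'.
print(quoted_first.search(text).group())   # ("This is a ) logging statement;")
# With the order swapped, the match stops at the ')' inside the string.
print(general_first.search(text).group())  # ("This is a )

# Zero repetitions of the group are still a success.
print(quoted_first.search("f()").group())  # ()
```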

    As for the syntax errors, I don't know enough of Perl to help you. However, in my experience regex syntax errors normally show up in one of 2 forms: those from the language parser, when the pattern has not been presented in a form that it can process (e.g. you have to pass a string to the PCRE functions and you have to adjust the pattern if it contains double-quotes that the language parser will trip up on), and those where the resulting regex pattern is not understood by the regex engine itself. Normally the error messages will give you enough context to tell which is which.

    Susan

     

  •  08-15-2008, 7:14 AM 45379 in reply to 45369

    Re: Stripping Logging Statements in JavaScript

    From Wikipedia it seems that you need to use Perl 5.10 for that support (it fails on my 5.8 here, so I am assuming the wiki comment is correct).

    http://en.wikipedia.org/wiki/Talk:Perl_Compatible_Regular_Expressions#Recursive_expressions

    To see the given pattern at work in PHP PCRE:

    http://www.myregextester.com/?r=283
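
    Where the recursive `(?1)` construct is not available, the balanced-parenthesis part of the pattern can be approximated by hand. The sketch below is in Python, whose `re` module also lacks regex recursion; `match_balanced` is a hypothetical helper written for this illustration, not part of any library. Like the recursive pattern, it only counts parentheses and does not treat quoted strings specially.

```python
def match_balanced(text: str, start: int) -> int:
    """Return the index just past the ')' that balances the '(' at
    text[start], or -1 if text[start] is not '(' or no balance is found."""
    if start >= len(text) or text[start] != "(":
        return -1
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "(":
            depth += 1          # entering a nested group
        elif text[i] == ")":
            depth -= 1          # leaving a group
            if depth == 0:
                return i + 1    # the opening '(' is now balanced
    return -1                   # ran out of text before balancing


sample = "console.log(f(a)+g(b));x()"
open_at = sample.index("(")
close_at = match_balanced(sample, open_at)
print(sample[open_at:close_at])  # (f(a)+g(b))
```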


  •  08-15-2008, 3:54 PM 45389 in reply to 45379

    Re: Stripping Logging Statements in JavaScript

    Thanks for all your help with this issue ddrudik - I appreciate you taking time to parse through my question.
  •  08-15-2008, 3:55 PM 45390 in reply to 45372

    Re: Stripping Logging Statements in JavaScript

    Susan,

    Thanks for your informative reply. I didn't know about those negative lookbehinds - I should take a closer look at them.

    Cheers

    Sean
