
RegEx to identify coding standard issues

Last post 06-26-2012, 7:16 PM by Aussie Susan. 1 replies.
  •  06-26-2012, 5:58 AM 85575

    RegEx to identify coding standard issues

    Hi all,

    I'm writing a C# application that identifies coding standards issues in a bespoke language (Siebel eScript, if you're interested - it's based on ECMAScript).

    Two of the items I want to address are:

    1. Attempting to use 'return' within a finally block:

    ...
    catch(e)
    {
    logError(e);
    }
    finally
    {
    if (e.ToString() != "")
    {
    // Return in finally - this is bad
    return (false);
    }
    else
    {
    // Return in finally - this is bad
    return (true);
    }
    }
    return(true)
    }

    2. Initialising an object type without destroying it by setting it to null:

    ...
    var oObject1 = TheApplication().NewPropertySet();
    var oObject2 = TheApplication().GetBusObject("Account");
    var oObject3 = oObject2.GetBusComp("Contact");
    ...
    // Failed to set oObject1 to null
    oObject2 = null;
    oObject3 = null;

    I currently use some horrible string manipulation to address item 1 (using IndexOf and Substring) and I haven't even begun to fathom how to achieve item 2.

    I was wondering if there are regular expressions that I could use to identify scripts that violate the above rules? The input would be an entire function / script and I really only need a 'yes' or 'no' for each item, as to whether the script passed in violates the rule or not.

    Any thoughts greatly appreciated.

    Regards,

    mroshaw
  •  06-26-2012, 7:16 PM 85592 in reply to 85575

    Re: RegEx to identify coding standard issues

    My advice would be "don't even think of using a regex". This type of problem is really not in the regex domain because there are too many variables (no pun intended) to consider.

    As an example, take your requirement #1 to see if there is a "return" statement within a "finally" block. For a start you need to make sure that you pick up the "return" that is the statement and not the word in the comment (see the line above in your example) - this involves coding into the pattern the ability to skip over comments that can occur anywhere within the source text.
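
    To make the comment problem concrete, here is a minimal Python sketch (not the poster's C# code) that blanks out comments and string contents before looking for the keyword. The names blank_comments_and_strings and contains_return are made up for illustration, and the pattern assumes ECMAScript-style // and /* */ comments with single- or double-quoted strings. It does not handle regex literals or other constructs that eScript may allow.

```python
import re

# One alternation, tried left to right at each position: strings first, so that
# comment markers inside a string are not mistaken for comments.
_TOKEN = re.compile(
    r'''
    "(?:\\.|[^"\\\n])*"      # double-quoted string
  | '(?:\\.|[^'\\\n])*'      # single-quoted string
  | //[^\n]*                 # line comment
  | /\*.*?\*/                # block comment
    ''',
    re.VERBOSE | re.DOTALL,
)


def blank_comments_and_strings(source: str) -> str:
    """Overwrite comments and string contents with spaces.

    Newlines and string quotes are kept, so offsets and line numbers
    in the result still line up with the original text.
    """
    def blank(match):
        text = match.group(0)
        if text[0] in "\"'":
            # String literal: keep the quotes, blank what is between them.
            return text[0] + re.sub(r"[^\n]", " ", text[1:-1]) + text[-1]
        # Comment: blank everything except newlines.
        return re.sub(r"[^\n]", " ", text)

    return _TOKEN.sub(blank, source)


def contains_return(source: str) -> bool:
    """True if the keyword 'return' appears outside comments and strings."""
    cleaned = blank_comments_and_strings(source)
    return re.search(r"\breturn\b", cleaned) is not None
```

    Note that even this small step needs a callback to decide what to keep, which already goes beyond a single yes/no pattern match.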

    Once you have that solved, you need to be able to find the limits of the "finally" block. If you look at your example, that means finding the "}" that matches the "{" that occurs immediately (accounting for any intervening whitespace and comments) after the "finally" keyword (again being careful not to pick up some extraneous character string that is not the required keyword). To do that you need to be able to count the "{"s and the corresponding "}"s and make sure they match. Most regex variants cannot do this - it is referred to as the "matching parentheses problem" - and, as far as I know, only a few libraries such as .NET (balancing groups) and PCRE (recursive patterns) have the necessary (mutually incompatible) extensions to the pattern syntax to allow this [Perl 6 is supposed to have this ability as well, but I'm not sure of the current state of Perl 6 for production use].
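
    The brace counting itself is easy to do in ordinary code rather than in the pattern. The Python sketch below uses a plain loop instead of any regex extension. The function names finally_blocks and returns_in_finally are invented for this example, and it assumes the input has already had its comments and string literals blanked out, so that every brace and keyword it sees is real code. It would also flag a "return" that sits inside a nested function within the "finally" block, which may or may not be what you want.

```python
import re


def finally_blocks(code: str):
    """Yield the text between the braces of each `finally { ... }` block.

    Assumption: comments and string literals were removed beforehand.
    """
    for match in re.finditer(r"\bfinally\s*\{", code):
        depth = 1                      # we are just past the opening brace
        start = pos = match.end()
        while pos < len(code) and depth > 0:
            if code[pos] == "{":
                depth += 1
            elif code[pos] == "}":
                depth -= 1
            pos += 1
        if depth == 0:                 # found the matching closing brace
            yield code[start:pos - 1]
        # If depth never reaches 0 the braces are unbalanced; skip the block.


def returns_in_finally(code: str) -> bool:
    """True if any `finally` block contains the keyword `return`."""
    return any(re.search(r"\breturn\b", body) for body in finally_blocks(code))
```

    For example, returns_in_finally applied to "try { f(); } finally { if (x) { return (false); } }" gives True, while the same code with "o = null;" as the only statement in the "finally" block gives False.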

    Also, for your second problem, you will need to consider all of the possible code paths through the program. For example, if you have an "if/else" structure and the "object = null" statement occurs only in the "else" clause, does this meet your needs or not? The code path through the "if" block would never set the object to null.
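
    To illustrate why code paths matter, here is a deliberately naive Python sketch of the second check. It only compares the set of variables that are initialised from an object-creating call with the set of variables that are assigned null anywhere in the script. The CREATORS list contains just the three calls from the original post and is an assumption, as are the names never_released, DECLARED and RELEASED.

```python
import re

# Assumed list of calls whose results need releasing (taken from the post).
CREATORS = r"(?:NewPropertySet|GetBusObject|GetBusComp)"

# `var name = ... Creator(`  captures the variable name.
DECLARED = re.compile(r"\bvar\s+(\w+)\s*=\s*[^;]*\b" + CREATORS + r"\s*\(")
# `name = null;`  captures the variable name.
RELEASED = re.compile(r"\b(\w+)\s*=\s*null\s*;")


def never_released(code: str) -> set:
    """Names initialised from a creator call but never assigned null.

    This ignores control flow entirely: one `name = null;` anywhere
    in the script counts as a release on every path.
    """
    declared = set(DECLARED.findall(code))
    released = set(RELEASED.findall(code))
    return declared - released
```

    On the example in the original post this reports oObject1, but a script that sets the object to null only inside an "else" branch also passes, even though the other branch leaks it. Catching that case needs knowledge of the program's structure, not just its text.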

    If you really want to do this, I would suggest that you write a parser for your target language (you may even find that a grammar already exists for a tool such as Lex/Yacc that you can use directly) and then start processing the syntax tree that it outputs.

    Susan
