Got more questions? Find advice on: ASP | SQL | XML | Windows
in Search
Welcome to RegexAdvice Sign in | Join | Help

Split mysql queries by semi-colon

Last post 07-25-2012, 4:51 AM by killahbeez. 2 replies.
Sort Posts: Previous Next
  •  07-24-2012, 3:33 PM 85871

    Split mysql queries by semi-colon

    I'm trying to split a large chunk of text containing multiple mysql statements separated by a semi-colon (;). I can't figure out how to split only when the semi-colon is found NOT BETWEEN two single quotes.

     

    For example:

    DROP TABLE mytable;
    CREATE TABLE ... ();
    INSERT INTO TABLE ... VALUES ('the statement should not break with ; <-- that semi-colon');

     Any ideas? I've seen one regex that people seem to have claimed to work, but it's crashing under Fast CGI.

  •  07-24-2012, 9:44 PM 85876 in reply to 85871

    Re: Split mysql queries by semi-colon

    It would help if we knew the regex variant you are using and the pattern that you tried that doesn't work.

    Doing this type of split with a regex is rather difficult because of the fact that you need to concentrate on locating a specific character string to split on rather than on the characters you want to separate from the main text.

    For example:

    ('[^']*'|[^;])+

    will find the 3 complete statements in your text because it is targeted to the text between the semi-colons and can use the fact that, once you see the initial single-quote you know that you  can simply grab everything until you see the next single quote and it will all be part of the same statement.

    If you need to use the 'split' function, then you can't hide the embedded semi-colons in the same way. There are some "tricks" you can pull such as

    (('[^']*')|;)(?(2)(?!))

    which matches either a complete quoted string or a semi-colon and then rejects the match if the quoted string is matched - but this fails because as soon as the quoted-string is rejected, the action of the regex engine is to step forward 1 character (i.e. over the initial single-quote) and to try again - of course it will ultimately succeed when it finds the semi-colon in the quote which is not wanted.

    The other traditional way to FIND a quoted character is to use lookaheads and lookbehinds to verify that the located character has quotes (within the same statement) both before and after it. This works (provided the regex variant you are using can have variable length lookbehinds - and very few do) because the pattern will match only if BOTH the lookbehind AND the lookahead match. The "AND" operation is the natural one for regex patterns:

    (?<='[^';]*);(?=[^';]*')

    You can now apply De Morgan's laws to the expression and you end up with:

    (?<!'[^';]*);|;(?![^';]*')

    This works for many situations but there is a major flaw if you have two consecutive statements that have a quoted semi-colons - it will see the actual end of statement semicolon of the first statement as being followed by a single quote from the next statement.

    In fact just about everything I've through of comes unstuck in that situation.

    Given that there is a "simple" solution (using the "match" rather than the "split") I'm not sure there is much value is pursuing this unless your situation is that the "split" operation is the ONLY option.

    If it is, then I would suggest that you take your example but with the last line duplicated and then sit down with a pencil and paper and work out the "rules" that you would apply manually to identify all of the semi-colons you want to "split" on and none of the others. Once you have the rules that apply to all situations you may encounter, then tell us what they are and we may be able to convert those to a regex pattern for whatever regex variant you are using.

    Susan

  •  07-25-2012, 4:51 AM 85879 in reply to 85871

    Re: Split mysql queries by semi-colon

    Or split by

    (?<=;)(?=(?:([^']*+)'(?1)'(?1))+$|(?1)$)


    http://portal-vreme.ro
View as RSS news feed in XML