Got more questions? Find advice on: ASP | SQL | XML | Windows
in Search
Welcome to RegexAdvice Sign in | Join | Help

matching semicolons outside quotes in sql file

Last post 11-13-2007, 10:37 AM by docdawson. 6 replies.
Sort Posts: Previous Next
  •  11-09-2007, 3:09 AM 36369

    matching semicolons outside quotes in sql file

    Language: PHP
    Task: Split a SQL file after each semicolon which is not in quotes.
    My trials: match space outside quotes I thought would help me out, but there are some strings which fail on this regex:
    Modified Regex: ;(?=[^']*'[^']+';|[^']*$) (modifiers: gm)

    My string:
    $string = "
    -- generated at Wed, 07 Nov 2007 14:18:30 +0100
    SET NAMES  utf8;

    -- Create Table
    DROP TABLE IF EXISTS `foo`;
    CREATE TABLE `foo` (
      `fooID` int(10) unsigned NOT NULL auto_increment,
      `barID` int(10) unsigned NOT NULL default '0',
      PRIMARY KEY  (`fooID`)
    ) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

    -- Data
    INSERT INTO `foo` VALUES (';'. ';', '\';', ''';');
    ";

    For testing smaller string variations which already fail:

    $stringb = "INSERT INTO `foo` VALUES (';',';','','');";


    $stringc = "INSERT INTO `foo` VALUES ('',' d ', ';');
    INSERT INTO `foo` VALUES (';',' d ', ';');";

    The red semicolons should not match but they do. Maybe there are other string constellations which fail too. The main task is to ignore semicolons inside quotes.
    Is there a way to do this with a regex?

    Thanks

  •  11-09-2007, 8:59 AM 36372 in reply to 36369

    Re: matching semicolons outside quotes in sql file

    This page makes me believe that it is not possible with php: http://dotnetslackers.com/Regex/re-19977_Regex_This_regex_splits_comma_or_semicolon_separated_lists_of_optionally_quoted_strings_It_hand.aspx

    I`ve got to do it with a little more programming [:'(] 

    1. read line for line
    2. check if line is complete (I read "only" 4096 bytes and some lines could have much more)
        a) if line complete check for semicolon
              - got semicolon: command complete.
              - no semicolon: build command.
        b) if line not complete go back and check command char for char
              - .....

    If this is totally BS please let me know. I can`t stand SQL statements anymore.
             

    Thanks. 

  •  11-11-2007, 9:03 PM 36410 in reply to 36372

    Re: matching semicolons outside quotes in sql file

    If you areable to do some programming (rather than relying on a split) then what about a pattern of:

    (?:'[^']*')|(;)

    which will find quoted strings of all sorts in match group #1 (also grabbing any quoted semi-colons) and (otherwise exposed) semicolons in match group #2.

    Match the whole string and then work backwards through the set of matches, looking for those where match group #2 is NOT null. For each of these, you can use from the given location to insert a marker character, grab to the end of the text, whatever. If the SQL statements are all in a single line, then you can use this to insert a line terminator at the required point(s). Otherwise, try putting in some character (sequence) that is not otherwise used, and then use a 'split' on that character (sequence).

    Susan 

  •  11-12-2007, 2:57 AM 36413 in reply to 36410

    Re: matching semicolons outside quotes in sql file

    Ohhh thank you Susan, I`ll try that in a few seconds, It think thats the way it will work. I`m just too close-minded to come to such smart conclusions.

    Seconds later.... It`s a great pity that it won`t work with escaped quotes (first example string; sorry, I forget to name that problem explicit) but maybe I`ll find a thread where this was discussed. I remember such a thread....

     

    Edit: Escape sign could be the backslash (\) or the quote (') itself, if possible. If not I`ve got to use only the backslash, what makes my programm a little unflexible. Or I got to use two regex, one for sql files with \-escaped quotes and one for '-escaped quotes.

     

  •  11-12-2007, 7:01 AM 36419 in reply to 36413

    Re: matching semicolons outside quotes in sql file

    I tried and tried..

    My thoughts after your good regex:

    1. find a quote
    2. find a backslash
    3. if backslash take all if not all but quote

    (?:'(?(?=\\)\\.|[^'])*')|(;)

    This has the issue to find semicolons when string was not closed: "INSERT  ............. ' foobar', '.........;"

    (?:'(?(?=\\)\\.|[^'])*'?)|(;) 

    This one seems to work correct. I got to test it a little.

     

    EDIT: arrrrrrrg. doesn`t work in PHP: Compilation failed: nothing to repeat at offset 19
     

  •  11-12-2007, 5:49 PM 36435 in reply to 36419

    Re: matching semicolons outside quotes in sql file

    Try:

    (?:'(\\.|[^'])*')|(;)

    You don't need the lookahead.

    In my testing, this picked up that there are 7 semicolons that you are interested in in your sample text. I had originally 'corrected' what I thought was a typo in your original example with the escapted single-quote - silly me! 

    Susan 

    Edit: Just seen your comment that 2 consecutive single quotes is also an escaped quote: try (?:'(\\.|''|[^'])*')|(;)

  •  11-13-2007, 10:37 AM 36458 in reply to 36435

    Re: matching semicolons outside quotes in sql file

    Great. After a little modification it works (until now Zip it! ).

    ('(?:\\.|''|[^'])*(')?)|(;) 

    I just made the second quote optional because the string I got to check may end with a cutted quote, I forgot to name that problem in my first post. Sorry for that. So

    "INSERT INTO `foobar` VALUES ('string without ending quote; maybe  \'  foobarstuff ...."    should be possible too.

    The way you told me how to do this was a very good hint:
    I`m working with the offset values to get substrings from the main string each time i got a semicolon in the match array. Some special cases I have to watch out for but all in all this regex helped me to make much shorter code (which is still to test Sad ). Thanks again Susan for the big help, take at least this drink as a gift Beer .

     

View as RSS news feed in XML