Got more questions? Find advice on: ASP | SQL | XML | Windows
in Search
Welcome to RegexAdvice Sign in | Join | Help

Tokenize SQL backup

Last post 05-21-2009, 2:32 PM by Sergei Z. 11 replies.
Sort Posts: Previous Next
  •  05-21-2009, 7:12 AM 53211

    Tokenize SQL backup

    Hello!

    With your help I've stripped my SQL code from comments.

    So now I basically want to split file into SQL statement delimited by ;

    But I want the delimiter to be ignored inside quotes.

    For example:

    SELECT * FROM `my table` WHERE `name` = 'some; name';

    SELECT * FROM `my second table` WHERE `name` = 'some other name';

    I want it to get only 2 statemets with 'some; name' intact.

    The other problem is that quetes could be single and double.

     Can anyone help me with this?

     

    Leonti

  •  05-21-2009, 10:49 AM 53215 in reply to 53211

    Re: Tokenize SQL backup

    go to our PlayGround Regex Tester: http://www.myregextester.com/index.php [courtesy of ddrudik]

    and do the following: use .NET flag; then use

    SourceText: SELECT * FROM `my table` WHERE `name` = 'some; ;name';

    Raw Match Pattern:
    (?<='[^']*);(?=[^']*')

    Raw Replace Pattern:
    <SemiColon>

    Source text after replacement:
    SELECT * FROM `my table` WHERE `name` = 'some<SemiColon> <SemiColon>name';

    then: split your SQL by Semicolon
    then: replace all <SemiColon> with *;*

  •  05-21-2009, 1:21 PM 53225 in reply to 53215

    Re: Tokenize SQL backup

    Thanks, it works as intended, but how do I convert it from .NET to usual regex?

    I use regEx from wxWidgets and it doesn't accept this notation.

     

    Leonti

     

  •  05-21-2009, 1:30 PM 53226 in reply to 53225

    Re: Tokenize SQL backup

    u have 2 choices:

    1. find a person who knows wxWidgets regex flavor ASAP and ask him/her.

    2. read /study wxWidgets regex flavor; should not be a biggy; pay attention to *lookaheads* when u study the docs: if the flavor does not support them , then the offered pattern would not work.

  •  05-21-2009, 1:33 PM 53227 in reply to 53226

    Re: Tokenize SQL backup

    It apparently supports TCL/POSIX BRE/POSIX ERE, none of which supports lookbehinds.

    Can you provide a larger sample of an actual SQL backup file that your are attempting to parse?


  •  05-21-2009, 1:42 PM 53228 in reply to 53227

    Re: Tokenize SQL backup

    Here is the sample:

    INSERT INTO `orders_dishes` (`id`, `order_id`, `qty`, `name`, `price`, `tax`, `tax_id`, `comment`, `kitchen`, `ready`) VALUES
    (471, 79, 1, 'beer; brat', 7, 0.19, 9, 'Tasty beer brat!', 1, 0),
    (472, 79, 1, 'miontain; man brat', 7, 0.19, 9, '', 1, 0),
    (473, 79, 1, 'Pepsi', 1.95, 0.19, 9, '', 1, 0),
    (474, 79, 1, 'Coca-cola', 1.95, 0.19, 9, '', 1, 0),
    (475, 79, 1, 'tea', 1.95, 0.19, 9, '', 1, 0);

      Sergei's approach is working - it finds semicolons inside the quotes, and does not touches them outside - exactlt what I need.

    If only it didn't use .NET notation.

     

    Leonti

     

    P.S. regex is still black magic to me so any help is really appreciated

  •  05-21-2009, 1:47 PM 53229 in reply to 53228

    Re: Tokenize SQL backup

    As I understand your platform can use Tcl Advanced Regular Expressions, in that case you might consider:

    (?m);\s*$

    Although, all I know about your platform is what little I read here:

    http://www.regular-expressions.info/wxwidgets.html

    Sergei's approach unfortunately doesn't seem supported on your more limited platform.


  •  05-21-2009, 1:58 PM 53230 in reply to 53229

    Re: Tokenize SQL backup

    try this Matching pattern:

    ('[^']*'|'|[^;])+;

    it sh give you an array of full SQL statements: tested in ddrudik's tester above:

    Raw Match Pattern:
    ('[^']*'|'|[^;])+;

    $matches Array:
    (
        [0] => INSERT INTO `orders_dishes` (`id`, `order_id`, `qty`, `name`, `price`, `tax`, `tax_id`, `comment`, `kitchen`, `ready`) VALUES
    (471, 79, 1, 'beer; brat', 7, 0.19, 9, 'Tasty beer brat!', 1, 0),
    (472, 79, 1, 'miontain; man brat', 7, 0.19, 9, '', 1, 0),
    (473, 79, 1, 'Pepsi', 1.95, 0.19, 9, '', 1, 0),
    (474, 79, 1, 'Coca-cola', 1.95, 0.19, 9, '', 1, 0),
    (475, 79, 1, 'tea', 1.95, 0.19, 9, '', 1, 0);
        [1] => )
    )

  •  05-21-2009, 2:00 PM 53231 in reply to 53230

    Re: Tokenize SQL backup

    or this [more concise] one:

    ('[^']*'|[^;])+;

  •  05-21-2009, 2:20 PM 53232 in reply to 53231

    Re: Tokenize SQL backup

    Halleluyah!

    Both expressions work!

    Thanks a lot!

     Leonti

  •  05-21-2009, 2:27 PM 53233 in reply to 53232

    Re: Tokenize SQL backup

    u need to take care of double quotes i guess
  •  05-21-2009, 2:32 PM 53234 in reply to 53233

    Re: Tokenize SQL backup

    w/ double quotes:

    ('[^']*'|"[^"]*"|[^;])+;

View as RSS news feed in XML