
Regular Exp exclude a string match

Last post 10-31-2009, 12:16 PM by ftg. 4 replies.
  •  10-23-2009, 2:33 PM 56930

    Regular Exp exclude a string match

    Hi,

    I'm working with regular expressions in VBScript and trying to create a pattern. Suppose my string is this:

    switch
    CALC_TYPE:is_ad_intup_flow  :: POD_BCD:is_ad_intup_bcd ;
    CALC_TYPE:is_ad  :: INPUT_DTRV:date_ad_flow ;
    CALC_TYPE:is_def_term  :: POD_BCD:bcd_def_term ;
    ((CALC_TYPE:is_ce_cp AND INPUT_SELC:slct_dfn_379528 )AND CALC_TYPE:is_not_pd ) :: INPUT_DTRV:dt_bcd ;
    CALC_TYPE:is_bc_call  :: bcd_bc_call ;
    CALC_TYPE:is_pspest_psquot  :: input_dot_plus_1_day ;
    CALC_TYPE:is_renroll  :: POD_BCD:new_expect_bcd ;
    CALC_TYPE:is_r1_dp  :: r1_bcd ;
    CALC_TYPE:is_qp_qa_ch  :: INPUT_DTRV:dt_bcd ;
    ec_in_bcd_elc_high_dt  :: max_bcd_cd_intup ;
    in_bcd_elc_high_dt  :: in_bcd_dflt ;
    default  :: POD_BCD:input_bcd ;
    endswitch

    I want to match viable worksheets like CALC_TYPE:is_ad_intup_flow, CALC_TYPE:is_ad, INPUT_DTRV:date_ad_flow, etc., except worksheets that start with POD_BCD (POD_BCD:is_ad_intup_b). Viable worksheet names have to start with an uppercase letter, which can be followed by any combination of [A-Z0-9_].

    Hope I wasn't too confusing. Any help would be great.

     Thanks

     

  •  10-26-2009, 11:20 PM 56962 in reply to 56930

    Re: Regular Exp exclude a string match

    For my part, I AM confused - sorry.

    I'm assuming that you are wanting to process this file line by line and keep only those lines that are "valid".

    What I'm not clear about is how each line is made up. Are you saying that you want to reject any line that has the string "POD_BCD" anywhere in it? (If so, a string-handling function will do this without the overhead of a regex in most programming languages.)

    If it must be in a specific place within the line, I'm not sure how to identify where that specific place is (after the double colons and space???).

    Perhaps if you showed us a "before" (which I think you have) and an "after" of the test, it might help. Also, when you mention things like "worksheets", it might help to tell us how to find a worksheet name that is viable.

    Finally, I'm not sure what the importance of the structure of the worksheet name is in your question.

    Susan

  •  10-27-2009, 12:03 PM 56970 in reply to 56962

    Re: Regular Exp exclude a string match

    Hi Susan,

     Thanks much for the reply.

    Currently I'm using this pattern to test against my string:

    .Pattern = "[A-Z]([A-Z0-9_])*:[a-z]([a-zA-Z0-9_])*"

    When executed to extract the matches it gives me output like this:

    CALC_TYPE:is_ad_intup_flow
    POD_BCD:is_ad_intup_bcd
    CALC_TYPE:is_ad
    INPUT_DTRV:date_ad_flow
    CALC_TYPE:is_def_term
    POD_BCD:bcd_def_term
    CALC_TYPE:is_ce_cp
    INPUT_SELC:slct_dfn_379528
    CALC_TYPE:is_not_pd
    INPUT_DTRV:dt_bcd
    CALC_TYPE:is_bc_call
    CALC_TYPE:is_pspest_psquot
    CALC_TYPE:is_renroll
    POD_BCD:new_expect_bcd
    CALC_TYPE:is_r1_dp
    CALC_TYPE:is_qp_qa_ch
    INPUT_DTRV:dt_bcd
    POD_BCD:input_bcd

    I was just trying to figure out if there was any way to exclude matches beginning with POD_BCD from the list. Hope this is clear.

    Thanks again

     

  •  10-27-2009, 6:11 PM 56991 in reply to 56970

    Re: Regular Exp exclude a string match

    Try:

    \b(?!POD_BCD)[A-Z]([A-Z0-9_])*:[a-z]([a-zA-Z0-9_])*

    What this does is to check that there is something before the name that is not part of the name (I'll explain shortly!) then checks that what follows is NOT "POD_BCD" and finally uses your pattern to match whatever is there.
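    As a rough check of the suggested pattern, here is a minimal sketch in Python rather than VBScript (an assumption made so it can be run anywhere; the word boundary, negative lookahead, and character classes used here should behave the same way in both engines for this ASCII input). The pattern is copied as posted; `SAMPLE` is an abbreviated copy of the text from the question, and `viable_names` is a hypothetical helper name.

```python
import re

# Abbreviated copy of the sample text from the original question.
SAMPLE = """switch
CALC_TYPE:is_ad_intup_flow  :: POD_BCD:is_ad_intup_bcd ;
CALC_TYPE:is_ad  :: INPUT_DTRV:date_ad_flow ;
((CALC_TYPE:is_ce_cp AND INPUT_SELC:slct_dfn_379528 )AND CALC_TYPE:is_not_pd ) :: INPUT_DTRV:dt_bcd ;
default  :: POD_BCD:input_bcd ;
endswitch"""

# The pattern exactly as posted: word boundary, then a negative lookahead
# that rejects names starting with POD_BCD, then the original pattern.
PATTERN = re.compile(r"\b(?!POD_BCD)[A-Z]([A-Z0-9_])*:[a-z]([a-zA-Z0-9_])*")


def viable_names(text):
    """Return every full match of PATTERN in text, in order of appearance."""
    # group(0) is the whole match; the capturing groups are ignored.
    return [m.group(0) for m in PATTERN.finditer(text)]


for name in viable_names(SAMPLE):
    print(name)
```

    On this sample the POD_BCD entries are skipped and the other names come back in the order they appear.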

    The simple thing is to put the negative lookahead at the start of your pattern. Unfortunately, this is defeated by the natural behaviour of the regex engine, which is to advance the starting point of a match if the pattern as a whole fails. If you leave off the '\b', the match keeps failing until it gets to "POD_BCD:is_ad_intup_bcd" (for example). At this point the negative lookahead will fail, and so the "P" will be skipped. Now the negative lookahead will succeed, and so will the rest of the pattern, thereby returning a match of "OD_BCD:is_ad_intup_bcd".
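    The failure described above can be reproduced with a short sketch, again in Python as a stand-in for VBScript (an assumption; the engine behaviour in question, retrying the match one character further along, is common to both). `WITHOUT_ANCHOR` and `leaked` are hypothetical names.

```python
import re

# Same pattern as the suggested one, but without the leading \b.
WITHOUT_ANCHOR = re.compile(r"(?!POD_BCD)[A-Z]([A-Z0-9_])*:[a-z]([a-zA-Z0-9_])*")

line = "CALC_TYPE:is_ad_intup_flow  :: POD_BCD:is_ad_intup_bcd ;"

# The lookahead fails at the "P", so the engine retries at the "O",
# where the lookahead passes and the rest of the pattern matches.
leaked = [m.group(0) for m in WITHOUT_ANCHOR.finditer(line)]
print(leaked)
# → ['CALC_TYPE:is_ad_intup_flow', 'OD_BCD:is_ad_intup_bcd']
```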

    Therefore you need a way to force the first part of the name to be treated as a whole. Atomic groups are an option, but without something to anchor the start of the match they are not effective in this case. Another alternative is to use a negative lookbehind after the first part of the name has been matched (atomically) and the ":" found, but VBScript does not support lookbehinds.

    Therefore the '\b' acts as an anchor at the start of the name that does not allow a partial match of the first part that starts from the 2nd letter.

    I must admit that I'm normally a bit wary of the '\b' anchor as it can work in strange ways in some circumstances (what the regex considers a "word" character is not always what we might think) but in this case it should be alright.
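    To illustrate the caution about '\b', here is a small sketch in Python (a stand-in for VBScript; it assumes both engines treat letters, digits, and the underscore as "word" characters for ASCII input). The two input strings are made up for illustration and do not come from the original data.

```python
import re

PATTERN = re.compile(r"\b(?!POD_BCD)[A-Z]([A-Z0-9_])*:[a-z]([a-zA-Z0-9_])*")


def matches(text):
    """Return the full text of every match of PATTERN in text."""
    return [m.group(0) for m in PATTERN.finditer(text)]


# "_" is a word character, so there is no boundary before the "P":
# the whole name is treated as one word that starts with "MY".
print(matches("MY_POD_BCD:x"))   # → ['MY_POD_BCD:x']

# "-" is not a word character, so a boundary sits right before the "P"
# and the lookahead rejects the name that starts there.
print(matches("MY-POD_BCD:x"))   # → []
```

    Whether a name is excluded therefore depends on which characters can appear immediately before it, which is worth checking against the real input.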

    I hope this makes sense.

    Susan

  •  10-31-2009, 12:16 PM 57066 in reply to 56991

    Re: Regular Exp exclude a string match

    Susan,

     Thanks very much for the solution and the explanation. It works great!!!!

     Thanks again
