
Heading Text from Text File

Last post 05-18-2011, 11:17 PM by Aussie Susan. 1 replies.
  •  05-18-2011, 2:49 AM 82138

    Heading Text from Text File


    Hi sir/friends,

    The text file content is as follows:

    Experience performing various complex ETL transformations in a variety of tools from Informatica to bare PLSQL/TSQL. Good exposure to data interface design with disparate source systems & automation.

    Experience:         

    XXX Company Design, develop and maintain the analysis reports requested by the marketing team and higher management for business forecasting and trend analysis.
    Analyze the transactional and analytic data and determine the individual relationships and components for reporting.

    Explanation:

    I need the header text (the "Experience" heading from the text file).

    My current regex picks the yellow content, but I need "Experience" from the red content.

    string strPattern = "Experience";

    Regex regex = new Regex("^[^a-zA-Z0-9]*" + Regex.Escape(strPattern) + ".*$",
        RegexOptions.IgnoreCase | RegexOptions.Multiline);

    I need Heading Text only like Experi

    Can you provide a regex that matches the red content (Experience)?

    Thank you in advance.


    With Regards,
    Maha
  •  05-18-2011, 11:17 PM 82147 in reply to 82138

    Re: Heading Text from Text File

    You need to be clear in your own mind how you can specify which "experience" you want. Try sitting down with a pencil and paper and writing down the "rules" you would tell someone else as to how to pick out the right one. I'm guessing here but from your example text these rules might be:

    - the word must start at the beginning of a line
    - the word must be "experience" (ignoring the case of the letters)
    - there must be a colon immediately after the word

    You then go through a number of examples of text and see if these rules always find just the words you are looking for - if not, then go back and refine them until they do.

    Assuming that the rules above are necessary and sufficient (that term means that each one must be there - necessary - and there are no others needed - sufficient) then you can create a pattern from them - say:

    ^experience:

    with the "ignore case" and "multiline" options set (as you have done).
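
    As a rough sketch of how that pattern might be used from C#: the class name, the FindHeadings helper and the shortened sample text are made up for illustration, and the sample assumes "\n" line endings with the heading starting in the first column.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeadingDemo
{
    // Hypothetical helper: returns every match of "^experience:" in the text,
    // using the "ignore case" and "multiline" options.
    public static MatchCollection FindHeadings(string text)
    {
        return Regex.Matches(text, "^experience:",
            RegexOptions.IgnoreCase | RegexOptions.Multiline);
    }

    public static void Main()
    {
        // Shortened stand-in for the text file in the question.
        string text =
            "Experience performing various complex ETL transformations.\n" +
            "\n" +
            "Experience:\n" +
            "XXX Company Design, develop and maintain the analysis reports.\n";

        foreach (Match m in FindHeadings(text))
        {
            Console.WriteLine("Matched '{0}'", m.Value);
        }
    }
}
```

    On this sample only the heading line is reported: the first line also starts with "Experience", but it is followed by a space rather than a colon, so the third rule excludes it.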

    You will also note that this specifies ONLY the part that needs to be matched, plus whatever else is around it to make sure that it is in the correct context - in this case that it should begin at the start of the line and be followed by a colon. In your pattern you include other things that are causing the regex engine to match more than you expect. For a start, the '[^a-zA-Z0-9]*' will let it match all sorts of special characters before the target word - this may be required but I can't see why based only on the example text you have provided.

    Also, the '.*$' tells the regex engine to include everything from the end of the word to the end of the current line. (By the way, when I tried your pattern against this text, it actually matched to the end of "automation." because that is how the "line" is presented - again I can only guess that the line appeared to the regex engine to be split after "from" as you have indicated.)
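
    To make the difference concrete, here is a small comparison sketch in C#. The class and method names are invented for illustration, the sample text is a shortened stand-in for the real file, and "\n" line endings are assumed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternComparison
{
    private const RegexOptions Options =
        RegexOptions.IgnoreCase | RegexOptions.Multiline;

    // The pattern from the question: optional non-alphanumerics,
    // the word, then everything up to the end of the line.
    public static string[] MatchOriginal(string text)
    {
        string pattern = "^[^a-zA-Z0-9]*" + Regex.Escape("Experience") + ".*$";
        return Collect(Regex.Matches(text, pattern, Options));
    }

    // The narrower pattern: just the word at line start, followed by a colon.
    public static string[] MatchNarrow(string text)
    {
        return Collect(Regex.Matches(text, "^experience:", Options));
    }

    // Copies the matched strings into an array for easy printing.
    private static string[] Collect(MatchCollection matches)
    {
        var values = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            values[i] = matches[i].Value;
        }
        return values;
    }

    public static void Main()
    {
        string text =
            "Experience performing various complex ETL transformations.\n" +
            "Experience:\n" +
            "XXX Company Design, develop and maintain the analysis reports.\n";

        Console.WriteLine("Original: " + string.Join(" | ", MatchOriginal(text)));
        Console.WriteLine("Narrow:   " + string.Join(" | ", MatchNarrow(text)));
    }
}
```

    On this sample the original pattern returns two matches, the whole first line and the "Experience:" heading, while the narrow pattern returns only the heading.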

    A basic rule in writing regex patterns is that you don't include something if you don't want to use it in a match - in this case, if you don't want the rest of the line then don't include something in the pattern that will do just that. Of course, if you DO want the rest of the line, then that is fine but, again based on your example text and what you said you wanted to match, this is not the case here.
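
    If even the colon is unwanted in the result, one possible variation (an assumption on my part, not something asked for above) is to keep the colon as context only by putting it in a lookahead. The helper name and sample text below are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeadingWordOnly
{
    // Hypothetical helper: returns the heading word without its colon,
    // or null when no such heading is present.
    public static string FindHeadingWord(string text)
    {
        // (?=:) requires a colon right after the word but leaves it out of the match.
        Match m = Regex.Match(text, "^experience(?=:)",
            RegexOptions.IgnoreCase | RegexOptions.Multiline);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string text =
            "Experience performing ETL transformations.\n" +
            "Experience:\n" +
            "XXX Company\n";

        Console.WriteLine(FindHeadingWord(text)); // prints "Experience"
    }
}
```

    The colon still has to be there for the match to succeed; it is simply not part of the returned text.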

    Susan
