
yet another file path regular expression

Last post 11-16-2007, 12:21 PM by ddrudik. 5 replies.
  •  11-15-2007, 1:17 PM 36571

    yet another file path regular expression

    Hi,

    I'm using C# 2.0 on Visual Studio 2005.

    I need a regular expression that splits a string containing multiple filenames, file paths, directories and whatnot into separate entries. File paths often contain spaces, so quotes are used to delimit them. Some characters are invalid in a file path, but for this particular application there's no need to check for them.

    For example, given the input:
    "c:\mes documents\tmp\cator\*.castor" "c:\www\" "file.castor"

    there should be 3 matches:

    c:\mes documents\tmp\cator\*.castor
    c:\www\
    file.castor

    The input will always be a single line.

    I'm currently using the regular expression "\"([^\"]*)\"" and it partially works: I get all 3 matches, but each match has quotes before and after it.
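A note on the quotes: in .NET, `Match.Value` is the whole matched text, delimiters included, while the parentheses in this pattern capture only the text between the quotes, available as `Groups[1]`. The sketch below shows that difference; the class and method names (`QuotedPathDemo`, `ExtractQuoted`) are made up for illustration and are not from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedPathDemo
{
    // Hypothetical helper: returns the text between each pair of quotes.
    public static List<string> ExtractQuoted(string input)
    {
        List<string> results = new List<string>();
        // Same pattern as in the post: a quote, any run of non-quote characters, a quote.
        Regex regex = new Regex("\"([^\"]*)\"");
        foreach (Match m in regex.Matches(input))
        {
            // m.Value would include the surrounding quotes;
            // m.Groups[1].Value is only what the parentheses captured.
            results.Add(m.Groups[1].Value);
        }
        return results;
    }
}
```

This only covers the fully quoted case; unquoted entries are ignored by this pattern.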

    In addition, I'd like the regular expression to capture paths that have no quotes; in that case, the paths would be separated by spaces. Path validation is not part of the problem: we can assume that the entered paths are valid, and if they aren't, that's something I'll handle programmatically ("bad user, bad user!").

    Could you help me with such a regular expression?

  •  11-15-2007, 2:30 PM 36576 in reply to 36571

    Re: yet another file path regular expression

    Please provide an example of your input text that would appear without quotes and specify if such entries may contain spaces in the paths entered.


  •  11-16-2007, 4:42 AM 36603 in reply to 36576

    Re: yet another file path regular expression

    No, a path without quotes would never contain spaces.

    This first example should give the same result as before; I removed the quotes where possible:

    "c:\mes documents\tmp\cator\*.castor" c:\www\ file.castor

    This example should give 4 matches. The user has made a mistake by forgetting the quotes, but that's not a problem I want to solve:

    c:\mes documents\tmp\cator\*.castor c:\www\ file.castor

    Matches:

    c:\mes
    documents\tmp\cator\*.castor
    c:\www\
    file.castor 

     

    On reflection, the fact that the strings are file paths isn't even part of the problem. The paths may be absolute or relative, and they may contain wildcards; there are two of them, * (asterisk) and ? (question mark).

  •  11-16-2007, 8:45 AM 36616 in reply to 36603

    Re: yet another file path regular expression

    You could use something like:

    ".*?"|\S+

    Then strip leading/ending quotation marks from any matches.
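As a variation on the suggestion above (not the pattern as posted), the stripping step can be avoided by capturing the content in groups: one group for the text inside quotes, another for an unquoted run of non-whitespace. The sketch below assumes single-line input as described in the thread; the names `PathSplitterSketch` and `Split` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PathSplitterSketch
{
    // Hypothetical helper: splits the input into quoted or space-separated entries.
    public static List<string> Split(string input)
    {
        List<string> results = new List<string>();
        // Group 1: content of a quoted entry; group 2: an unquoted run of non-whitespace.
        Regex regex = new Regex("\"([^\"]*)\"|(\\S+)");
        foreach (Match m in regex.Matches(input))
        {
            // Exactly one of the two groups takes part in any given match.
            results.Add(m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value);
        }
        return results;
    }
}
```

With the thread's second example (only the first path quoted) this gives three entries, and with the fully unquoted example it gives four, since the space inside "mes documents" then acts as a separator.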


  •  11-16-2007, 11:38 AM 36620 in reply to 36616

    Re: yet another file path regular expression

    I did as you suggested and it worked. Thank you very much.

    For reference, here's the final code I used:

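    // Requires: using System.Collections; using System.Text.RegularExpressions;
    public Queue SplitIntoFilePaths (string filepaths) {
    	// Match either a quoted entry or a run of non-whitespace characters.
    	Regex regex = new Regex( "\".*?\"|\\S+" );
    	MatchCollection matches = regex.Matches( filepaths );
    	Queue q = new Queue( matches.Count );
    	foreach ( Match m in matches ) {
    		string val = m.Value;
    		// Strip the surrounding quotes; the length check guards against a lone quote character.
    		if ( val.Length >= 2 && val.StartsWith( "\"" ) && val.EndsWith( "\"" ) )
    			val = val.Substring( 1, val.Length - 2 );
    		q.Enqueue( val );
    	}
    	return q;
    }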
    
  •  11-16-2007, 12:21 PM 36622 in reply to 36620

    Re: yet another file path regular expression

    Glad it worked out.  Thanks.