
RegEx to find misspelled words

Last post 11-05-2009, 11:18 AM by mash. 1 replies.
  •  11-05-2009, 6:05 AM 57159

    RegEx to find misspelled words

    I need a RegEx that will find a specific word within a long string. The issue is that the word may be misspelled, and I need to find it anyway. I would like to tolerate a certain percentage of errors when looking for the word. For example:

    The complete string: Hello, this is my comp/et sting to look at
    The word to search for: complete

    Let’s say that I wish to accept a maximum of two wrong letters in the above string; then the RegEx should match the word complete. However, if I only accept one wrong letter, it shouldn’t find it. Ideally the RegEx would also be able to handle whitespace and missing letters. For example:

    The complete string: Hello, this is my com plee sting to look at
    The word to search for: complete

    This should match the word as well, even though there is a space between ‘m’ and ‘p’ and the letter ‘t’ is missing. Is this possible at all with RegEx, or should I be looking at an alternative way to solve it?

    Thanks, Tommy

  •  11-05-2009, 11:18 AM 57164 in reply to 57159

    Re: RegEx to find misspelled words

    IMHO this is not a task suited for regular expressions. Regexes do not understand the context of your data. A regex simply matches a pattern within a string. It does not understand any language, nor how words in that language should be spelled. You have no pattern to match.

    Second, regexes are used for matching. If you have a search term, finding things that don't quite match that term is not something a regex was designed to do.
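
    One non-regex way to approach this is approximate substring matching by edit distance. The following is only a minimal sketch, not a definitive implementation: the function name `fuzzy_find` and its parameters are made up for illustration, and it assumes that "wrong letters" means single-character substitutions, insertions, and deletions, compared case-insensitively.

```python
def fuzzy_find(word, text, max_errors):
    """Return (end_index, errors) for the best approximate match of word
    inside text, or None if every candidate needs more than max_errors edits."""
    word = word.lower()
    text = text.lower()
    # prev[i] = fewest edits needed to turn word[:i] into some substring
    # of text that ends at the previous text position.
    prev = list(range(len(word) + 1))
    best = None
    for end, text_char in enumerate(text, start=1):
        # A match may start anywhere in text, so the empty prefix costs 0.
        cur = [0]
        for i, word_char in enumerate(word, start=1):
            substitution = 0 if word_char == text_char else 1
            cur.append(min(
                prev[i] + 1,                 # extra character in text
                cur[i - 1] + 1,              # character of word missing from text
                prev[i - 1] + substitution,  # same or replaced character
            ))
        # Keep the earliest position that reaches the lowest error count.
        if cur[-1] <= max_errors and (best is None or cur[-1] < best[1]):
            best = (end, cur[-1])
        prev = cur
    return best


first = "Hello, this is my comp/et sting to look at"
second = "Hello, this is my com plee sting to look at"
print(fuzzy_find("complete", first, 2))   # found, with 2 errors
print(fuzzy_find("complete", first, 1))   # None
print(fuzzy_find("complete", second, 2))  # found, with 2 errors
```

    With a limit of two errors both example strings from the question are found, and with a limit of one neither is. A percentage tolerance could be derived from the word length, for example by rounding down `len(word) * 0.25`, but that threshold is an assumption and would need tuning.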

    Plus there is a logic flaw in your idea. What happens when you search for a word that can have other valid words within it?

    What if your string was "Can you do or say something."

    and the search word was "donor"?

    There is nothing misspelled, but according to your logic it should find "do or".
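
    To put a number on that false positive, here is a small sketch of the standard Levenshtein edit distance (the helper name `edit_distance` is just for illustration). Under that measure "do or" is only one substitution away from "donor", so any tolerance of one error or more would accept it.

```python
def edit_distance(a, b):
    """Levenshtein distance: the fewest single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    # prev[j] = distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        cur = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,                     # delete char_a
                cur[j - 1] + 1,                  # insert char_b
                prev[j - 1] + (char_a != char_b) # keep or substitute
            ))
        prev = cur
    return prev[-1]


print(edit_distance("do or", "donor"))       # 1
print(edit_distance("comp/et", "complete"))  # 2
```

    So the correctly spelled "do or" scores as a closer match to "donor" than the genuinely misspelled "comp/et" does to "complete", which is the flaw in any purely character-based tolerance.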


    Michael

    "In theory, theory and practice are the same. In practice, they are not."
    Albert Einstein