- For the Wikipedia guideline, see Wikipedia:Disambiguation.
In computational linguistics, word sense disambiguation (WSD) is the problem of determining in which sense a word having a number of distinct senses is used in a given sentence. For example, consider the word bass, two distinct senses of which are: Terrace House, home to WSDs Department of Computer Graphic Design. ...
Computational linguistics is an interdisciplinary field dealing with the statistical and logical modeling of natural language from a computational perspective. ...
A word sense is one of the meanings of a word. ...
For other uses, see Word (disambiguation). ...
In linguistics, a sentence is a unit of language, characterized in most languages by the presence of a finite verb. ...
- a type of fish
- tones of low frequency
and the sentences: - I went fishing for some sea bass
- The bass part of the song is very moving
To a human it is obvious the first sentence is using the word bass in sense 1 above, and in the second sentence it is being used in sense 2. But although this seems obvious to a human, developing algorithms to replicate this human ability is a difficult task. In mathematics, computing, linguistics, and related disciplines, an algorithm is a finite list of well-defined instructions for accomplishing some task that, given an initial state, will terminate in a defined end-state. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Difficulties One problem with word s ense disambiguation is deciding what the senses are. In cases like the word bass above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in such cases division of words into senses becomes much more difficult. Different dictionaries will provide different divisions of words into senses. One solution some researchers have used is to choose a particular dictionary, and just use its set of senses. Generally, however, research results using broad distinctions in senses have been much better than those using narrow, so most researchers ignore the fine-grained distinctions in their work. This article is about metaphor in literature and rhetoric. ...
In rhetoric, metonymy is the substitution of one word for another word with which it is associated. ...
Another problem is inter-judge variance. WSD systems are normally tested by having their results on a task compared against those of a human. However, humans do not agree on the task at hand — give a list of senses and sentences, and humans will not always agree on which word belongs in which sense. A computer cannot be expected to give better performance on such a task than a human (indeed, since the human serves as the standard, the computer being better than the human is incoherent), so the human performance serves as an upper bound. Human performance, however, is much better on coarse-grained than fine-grained distinctions, so this again is why research on coarse-grained distinctions is most useful. In probability theory and statistics, the variance of a random variable (or somewhat more precisely, of a probability distribution) is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. ...
Approaches As in all natural language processing, there are two main approaches to WSD — deep approaches and shallow approaches. Natural language processing (NLP) is a subfield of artificial intelligence and linguistics. ...
Deep approaches presume access to a comprehensive body of world knowledge. Knowledge such as "you can go fishing for a type of fish, but not for low frequency sounds" and "songs have low frequency sounds as parts, but not types of fish" is then used to determine in which sense the word is used. These approaches are not very successful in practice, mainly because such a body of knowledge does not exist in computer-readable format outside of very limited domains. But if such knowledge did exist, they would be much more accurate than the shallow approaches. However, there is a long tradition in Computational Linguistics of trying such approaches in terms of coded knowledge, and in some cases it is hard to say clearly whetrher the knowledge involved is linguistic or world knowledge. The first attempt was that by Margaret Masterman and her colleagues at Cambridge Language Research Unit in England in the 1950s. This used as data a punched-card version of Roget's Thesaurus and its numbered "heads" as indicators of topics and looked for their repetitions in text, using a set intersection algorithm: it was not very successful (and is described in some detail in (Wilks, Y. et al., 1996) but had strong relationships to later work, especially Yarowsky's machine learning optimisation of a thesaurus method in the 1990s (see below). Commonsense reasoning is the branch of Artificial intelligence concerned with replicating human thinking. ...
Shallow approaches don't try to understand the text. They just consider the surrounding words, using information like "if bass has words sea or fishing nearby, it probably is in the fish sense; if bass has the words music or song nearby, it is probably in the music sense." These rules can be automatically derived by the computer, using a training corpus of words tagged with their word senses. This approach, while theoretically not as powerful as deep approaches, gives superior results in practice, due to computers' limited world knowledge. It can, though, be confused by sentences like The dogs bark at the tree, which contains the word bark near both tree and dogs. These approaches normally work by defining a window of N content words around each word to be disambiguated in the corpus, and statistically analyzing those N surrounding words. Two shallow approaches used to train and then disambiguate are Naïve Bayes classifiers and decision trees. In recent research, kernel based methods such as support vector machines have shown superior performance in supervised learning. But over the last few years, there hasn't been any major improvement in performance of any of these methods. A naive Bayes classifier (also known as Idiots Bayes) is a simple probabilistic classifier based on applying Bayes theorem with strong (naive) independence assumptions. ...
In operations research, specifically in decision analysis, a decision tree is a decision support tool that uses a graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. ...
Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. ...
Supervised learning is a machine learning technique for creating a function from training data. ...
It is instructive to compare the word sense disambiguation problem with the problem of part-of-speech tagging. Both involve disambiguating or tagging with words, be it with senses or parts of speech. However, algorithms used for one do not tend to work well for the other, mainly because the part of speech of a word is primarily determined by the immediately adjacent one to three words, whereas the sense of a word may be determined by words further away. The success rate for part-of-speech tagging algorithms is at present much higher than that for WSD, state-of-the art being around 95% accuracy or better, as compared to less than 75% accuracy in word sense disambiguation with supervised learning. These figures are typical for English, and may be very different from those for other languages. Part-of-speech tagging is the process of marking up the words in a text with their corresponding parts of speech. ...
Another aspect of word sense disambiguation that differentiates it from part-of-speech tagging is the availability of training data. While it is relatively easy to assign parts of speech to text, training people to tag senses is far more difficult [1]. While users can memorize all of the possible parts of speech a word can take, it impossible for individuals to memorize all of the senses a word can take. Thus, many word sense disambiguation algorithms use semi-supervised learning, which allows both labeled and unlabeled data. The Yarowsky algorithm was an early example of such an algorithm. In computer science, semi-supervised learning is a class of machine learning techniques that make use of both labeled and unlabeled data for training - typically a small amount of labeled data with a large amount of unlabeled data. ...
Notes - ^ Fellbaum, C. 1997. Analysis of a handtagging task. Proceedings of ANLP-97 Workshop on Tagging Text with Lexical Semantics: Why, What, and How? Washington D.C., USA.
Wilks, Y., Slator, B., Guthrie, L. (1996) Electric Words: dictionaries, computers and meanings. Cambridge, MA: MIT Press. Christiane Fellbaum, born in Braunschweig, has lived since 1969 in the United States. ...
External links |