Likelihood

In statistics, a likelihood function is a conditional probability function considered as a function of its second argument with its first argument held fixed, thus:

$$ b \mapsto P(A \mid B = b), $$

and also any other function proportional to such a function. That is, the likelihood function for B is the equivalence class of functions

$$ L(b \mid A) = \alpha \, P(A \mid B = b) $$

for any constant of proportionality α > 0. Thus the numerical value L(b|A) is immaterial; all that matters are ratios of the form L(b_2|A)/L(b_1|A), since these are invariant with respect to the constant of proportionality.
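
The invariance of likelihood ratios can be checked numerically. The following is a minimal sketch, not part of the original article: the urn setup, the numbers in P_A_GIVEN_B, and the function names are illustrative assumptions.

```python
import math

# Hypothetical conditional probabilities P(A | B = b): A is "a red ball is
# drawn" and b names the urn it was drawn from (illustrative numbers only).
P_A_GIVEN_B = {"urn1": 0.2, "urn2": 0.5, "urn3": 0.9}


def likelihood(b, alpha=1.0):
    """One member of the equivalence class: L(b | A) = alpha * P(A | B = b)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return alpha * P_A_GIVEN_B[b]


def likelihood_ratio(b2, b1, alpha=1.0):
    """Ratio L(b2 | A) / L(b1 | A); alpha cancels between the two terms."""
    return likelihood(b2, alpha) / likelihood(b1, alpha)


ratio_plain = likelihood_ratio("urn3", "urn1")
ratio_scaled = likelihood_ratio("urn3", "urn1", alpha=7.0)

# The individual values differ by the factor alpha, but the ratios agree
# (up to floating-point rounding).
print(likelihood("urn3"), likelihood("urn3", alpha=7.0))
print(ratio_plain, ratio_scaled, math.isclose(ratio_plain, ratio_scaled))
```

Only the ratio carries information about how strongly the observation favours one value of b over another, which is why the choice of α can be left open.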


Likelihood as a solitary term is shorthand for likelihood function. In colloquial language, "likelihood" is one of several informal synonyms for "probability", but throughout this article we use only the technical definition.


In a sense, likelihood works backwards from probability: given B, we use the conditional probability P(A | B) to reason about A, and, given A, we use the likelihood function L(B | A), that is, P(A | B) viewed as a function of B, to reason about B. This mode of reasoning is formalized in Bayes' theorem; note the appearance of a likelihood function for B given A in:

$$ P(B \mid A) = \frac{P(A \mid B) \, P(B)}{P(A)}, $$

since, as functions of B, both P(A|B) and P(A|B)/P(A) are likelihood functions for B given A.
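
As a rough illustration of how a likelihood function enters Bayes' theorem, the sketch below computes a posterior over a discrete B. The three urns, the uniform prior, the probabilities, and the helper name posterior are all assumptions made for this example, not part of the original article.

```python
import math

# Hypothetical setup: B is one of three urns, A is "a red ball is drawn".
prior = {"urn1": 1 / 3, "urn2": 1 / 3, "urn3": 1 / 3}      # P(B = b)
p_a_given_b = {"urn1": 0.2, "urn2": 0.5, "urn3": 0.9}      # P(A | B = b)


def posterior(likelihood, prior):
    """Return P(B = b | A), computed as likelihood * prior, then normalized."""
    unnormalized = {b: likelihood[b] * prior[b] for b in prior}
    total = sum(unnormalized.values())
    return {b: weight / total for b, weight in unnormalized.items()}


post = posterior(p_a_given_b, prior)

# Any positive multiple of P(A | B = b) is also a likelihood function for B,
# and it leads to the same posterior because the factor cancels on normalizing.
scaled = {b: 42.0 * value for b, value in p_a_given_b.items()}
post_scaled = posterior(scaled, prior)

for b in prior:
    print(b, round(post[b], 4), round(post_scaled[b], 4))
```

Normalizing at the end plays the role of dividing by P(A), so the code never needs the constant of proportionality explicitly.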


For more about making inferences via likelihood functions, see also the method of maximum likelihood, and likelihood-ratio testing.


Historical remarks

Both the use of "likelihood" in the sense explained here and the distinction between likelihood and probability were introduced by R.A. Fisher in his paper "On the mathematical foundations of theoretical statistics" (1922). In that paper, Fisher also uses the term "method of maximum likelihood". Fisher argues against inverse probability as a basis for statistical inferences, and instead proposes inferences based on likelihood functions.


Likelihood function of a parametrized model

Among many applications, we consider here one of broad theoretical and practical importance. Given a parametrized family of probability density functions

$$ x \mapsto f(x \mid \theta), $$

where θ is the parameter (in the case of discrete distributions, the probability density functions are probability "mass" functions), the likelihood function is

$$ L(\theta \mid x) = f(x \mid \theta), $$

where x is the observed outcome of an experiment. In other words, when f(x | θ) is viewed as a function of x with θ fixed, it is a probability density function, and when viewed as a function of θ with x fixed, it is a likelihood function.
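
The two readings of f(x | θ) can be sketched in a few lines. The binomial family with n = 10 trials, the observed value x = 3, and the grid of candidate θ values are illustrative assumptions, and the grid search is only a crude stand-in for the method of maximum likelihood.

```python
from math import comb, isclose


def f(x, theta, n=10):
    """Binomial probability mass function f(x | theta) for n trials."""
    return comb(n, x) * theta ** x * (1 - theta) ** (n - x)


# View 1: theta fixed, x varies -> a probability mass function (sums to 1).
theta_fixed = 0.3
pmf = [f(x, theta_fixed) for x in range(11)]
total_probability = sum(pmf)

# View 2: x fixed at the observed outcome, theta varies -> a likelihood function.
x_observed = 3
grid = [i / 100 for i in range(101)]            # candidate values of theta
likelihood_values = [f(x_observed, theta) for theta in grid]

# Grid point with the largest likelihood (a crude maximum-likelihood search).
theta_best = max(grid, key=lambda theta: f(x_observed, theta))

print(isclose(total_probability, 1.0), theta_best)
```

The same function f is called in both views; only the argument that is allowed to vary changes.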


Note: This is not the same as the probability that those parameters are the right ones, given the observed sample. Attempting to interpret the likelihood of a hypothesis given observed evidence as the probability of the hypothesis is a common error, with potentially disastrous real-world consequences in medicine, engineering or jurisprudence. See prosecutor's fallacy for an example of this.


Example

For example, if I toss a coin with a probability p_H of landing heads up ('H'), the probability of getting two heads in two trials ('HH') is p_H^2. If p_H = 0.5, then the probability of seeing two heads is 0.25.


In symbols, we can say the above as

$$ P(\text{HH} \mid p_H = 0.5) = 0.25. $$

Another way of saying this is to reverse it and say that "the likelihood of p_H = 0.5 given the observation 'HH' is 0.25", i.e.,

$$ L(p_H = 0.5 \mid \text{HH}) = P(\text{HH} \mid p_H = 0.5) = 0.25. $$

But this is not the same as saying that the probability of p_H = 0.5 given the observation is 0.25.


To take an extreme case, on this basis we can say "the likelihood of p_H = 1 given the observation 'HH' is 1". But it is clearly not the case that the probability of p_H = 1 given the observation is 1: the event 'HH' can occur for any p_H > 0 (and often does, in reality, for p_H roughly 0.5).


The likelihood function does not in general follow all the axioms of probability: for example, the integral of a likelihood function is not in general 1. This is because integration of the likelihood function L is performed over all possible values of the model parameters (in this case, p_H), while integration (here, summation) of a probability density function f is performed over the values of the random variables (which in this case are the four pairs 'TT', 'TH', 'HT' and 'HH'). In this example, the integral of the likelihood L(p_H | HH) = p_H^2 over the interval [0, 1] in p_H is 1/3, demonstrating again that the likelihood function cannot be interpreted as a probability density function for p_H. On the other hand, given any particular value of p_H, e.g. p_H = 0.5, the sum of the probabilities over the domain of the random variables is 1.
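
The numbers in the coin example can be reproduced with a short sketch. The function names and the midpoint-rule integration with 10,000 steps are assumptions chosen for illustration; any reasonable quadrature would do.

```python
from math import isclose


def prob_outcome(outcome, p_heads):
    """P(outcome | p_heads) for a sequence of independent tosses, e.g. 'HT'."""
    prob = 1.0
    for toss in outcome:
        prob *= p_heads if toss == "H" else 1.0 - p_heads
    return prob


def likelihood_hh(p_heads):
    """L(p_heads | 'HH') = P('HH' | p_heads) = p_heads ** 2."""
    return prob_outcome("HH", p_heads)


# Fixed parameter, sum over the four outcomes: a probability distribution.
total_over_outcomes = sum(prob_outcome(o, 0.5) for o in ("TT", "TH", "HT", "HH"))

# Fixed observation 'HH', integrate over the parameter with a midpoint rule.
steps = 10_000
width = 1.0 / steps
integral_over_p = sum(likelihood_hh((i + 0.5) * width) for i in range(steps)) * width

print(likelihood_hh(0.5), likelihood_hh(1.0))   # 0.25 and 1.0
print(total_over_outcomes, integral_over_p)
```

The sum over outcomes comes out as 1, while the integral over the heads probability is close to 1/3, matching the values stated above.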



References

  • Ronald A. Fisher. "On the mathematical foundations of theoretical statistics". Philosophical Transactions of the Royal Society, A, 222:309-368 (1922). ("Likelihood" is discussed in section 6.)

The Wikipedia article included on this page is licensed under the GFDL.