Estimator bias

In statistics, the difference between an estimator's expected value and the true value of the parameter being estimated is called the bias. An estimator or decision rule having nonzero bias is said to be biased.


Although the term bias sounds pejorative, it is not necessarily used in that way in statistics. Biased estimators may have desirable properties, such as smaller mean squared error than any unbiased estimator, depending on the situation.

Definition

Suppose we are trying to estimate the parameter θ using an estimator $\hat\theta$ (that is, some function of the observed data). Then the bias of $\hat\theta$ is defined to be

$$\operatorname{E}(\hat\theta) - \theta.$$

In words, this is "the expected value of the estimator minus the true value θ". This may be rewritten as

$$\operatorname{E}(\hat\theta - \theta),$$

which reads "the expected value of the difference between the estimator and the true value" (θ is a constant, so the expected value of θ is θ).


Examples

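Suppose $X_1, \dots, X_n$ are independent and identically distributed random variables with expectation μ and variance $\sigma^2$. Let

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

be the "sample average", and let

$$S^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$$

be a "sample variance". Then $S^2$ is a "biased estimator" of $\sigma^2$ because

$$\operatorname{E}(S^2) = \frac{n-1}{n}\,\sigma^2 \neq \sigma^2.$$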
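The bias of the divide-by-n sample variance can be checked exactly on a small example. The sketch below is an illustration only: it assumes a population that is a fair six-sided die (an assumption, not part of the article), enumerates every equally likely sample of size n, and averages both the divide-by-n and the divide-by-(n − 1) variance estimates. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def variance_estimates(sample):
    """Return (divide-by-n, divide-by-(n-1)) variance estimates for one sample."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    sum_sq = sum((x - mean) ** 2 for x in sample)
    return sum_sq / n, sum_sq / (n - 1)


def expected_estimates(values, n):
    """Exact expectations of both estimators over all equally likely samples."""
    samples = list(product(values, repeat=n))
    total_n = Fraction(0)
    total_n_minus_1 = Fraction(0)
    for s in samples:
        by_n, by_n_minus_1 = variance_estimates(s)
        total_n += by_n
        total_n_minus_1 += by_n_minus_1
    return total_n / len(samples), total_n_minus_1 / len(samples)


# Assumed population: a fair six-sided die.
values = range(1, 7)
mu = Fraction(sum(values), len(values))
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)

n = 3
e_by_n, e_by_n_minus_1 = expected_estimates(values, n)
print("true variance:        ", sigma2)
print("E[divide-by-n]:       ", e_by_n)
print("E[divide-by-(n - 1)]: ", e_by_n_minus_1)
print("(n - 1)/n * variance: ", Fraction(n - 1, n) * sigma2)
```

Exact rational arithmetic is used so that the comparison with (n − 1)/n times the true variance is not blurred by rounding.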

Note that when a transformation is applied to an unbiased estimator, the result is not necessarily itself an unbiased estimate of its corresponding population statistic. That is, for a non-linear function f and an unbiased estimator U of a parameter p, f(U) is usually not an unbiased estimator of f(p). For example, the square root of the unbiased estimator of the population variance is not an unbiased estimator of the population standard deviation.
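As a rough illustration of the square-root example, the following sketch enumerates all samples of size n from an assumed fair six-sided die (the population and the function name are illustrative assumptions) and averages the square root of the divide-by-(n − 1) variance estimate. The result can then be compared with the true standard deviation.

```python
import math
from itertools import product


def expected_sample_sd(values, n):
    """Mean of sqrt(divide-by-(n-1) variance) over all equally likely samples."""
    samples = list(product(values, repeat=n))
    total = 0.0
    for s in samples:
        mean = sum(s) / n
        unbiased_var = sum((x - mean) ** 2 for x in s) / (n - 1)
        total += math.sqrt(unbiased_var)
    return total / len(samples)


# Assumed population: a fair six-sided die.
values = range(1, 7)
mu = sum(values) / len(values)
sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))

print("true standard deviation:     ", sigma)
print("E[sqrt(unbiased variance)]:  ", expected_sample_sd(values, 2))
```

Because the square root is concave, the averaged square root falls below the true standard deviation here, even though the variance estimate underneath it is unbiased.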


Bias is not the only consideration when choosing a statistic, however. Bias refers to the central tendency of the sampling distribution of a statistic, but the variance of the sampling distribution can also be an important consideration. Specifically, statistics with smaller sampling variances will yield greater statistical power. For example, while $S^2$ above is biased and the traditional sample calculation

$$S^2_{\text{sample}} = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$$

is not, $S^2$ has a lower estimation variability than $S^2_{\text{sample}}$. The denominator dividing the sum of squares is larger in the calculation of $S^2$, so its values are scaled down by the factor (n − 1)/n relative to $S^2_{\text{sample}}$, and their spread shrinks accordingly. In practice, this shows that for some applications (where the amount of bias can be equated between groups/conditions) a biased estimator can prove to be a more powerful, and therefore more useful, statistic.
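The difference in estimation variability can be made concrete with a small exact calculation. This sketch again assumes a fair six-sided die as the population (an illustrative assumption) and computes the sampling variance of the divide-by-n and divide-by-(n − 1) estimators over all samples of size n. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def sampling_variances(values, n):
    """Exact sampling variances of the divide-by-n and divide-by-(n-1) estimators."""
    samples = list(product(values, repeat=n))
    count = len(samples)
    by_n, by_n_minus_1 = [], []
    for s in samples:
        mean = Fraction(sum(s), n)
        sum_sq = sum((x - mean) ** 2 for x in s)
        by_n.append(sum_sq / n)
        by_n_minus_1.append(sum_sq / (n - 1))

    def variance(xs):
        # Variance of the estimator's values across all equally likely samples.
        m = sum(xs) / count
        return sum((x - m) ** 2 for x in xs) / count

    return variance(by_n), variance(by_n_minus_1)


n = 3
var_by_n, var_by_n_minus_1 = sampling_variances(range(1, 7), n)
print("Var[divide-by-n]:       ", var_by_n)
print("Var[divide-by-(n - 1)]: ", var_by_n_minus_1)
print("ratio:                  ", var_by_n / var_by_n_minus_1)
```

Since one estimator is (n − 1)/n times the other, the ratio of their sampling variances is ((n − 1)/n) squared, whatever the population.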


A far more extreme case of a biased estimator being better than any unbiased estimator is well known. Suppose X has a Poisson distribution with expectation λ. It is desired to estimate

$$e^{-2\lambda}.$$

The only function of the data constituting an unbiased estimator is

$$(-1)^X.$$

If the observed value of X is 100, then the estimate is 1, although the true value of the quantity being estimated is obviously very likely to be near 0, which is the opposite extreme. And if X is observed to be 101, then the estimate is even more absurd: it is −1, although the quantity being estimated obviously must be positive. The (biased) maximum-likelihood estimator

$$e^{-2X}$$

is better than this unbiased estimator in the sense that its mean squared error

$$e^{-4\lambda} - 2e^{\lambda(1/e^2 - 3)} + e^{\lambda(1/e^4 - 1)}$$

is smaller. Compare the unbiased estimator's MSE of

$$1 - e^{-4\lambda}.$$

The MSE is a function of the true value λ. The bias of the maximum-likelihood estimator, in the sense defined above (expected value minus true value), is

$$e^{\lambda(1/e^2 - 1)} - e^{-2\lambda}.$$
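The Poisson comparison can be checked numerically. The sketch below sums over the Poisson probabilities directly (truncated at a number of terms assumed large enough for the chosen λ) to obtain the bias and mean squared error of the estimators $(-1)^X$ and $e^{-2X}$ for the target $e^{-2\lambda}$. The function names and the value λ = 2 are illustrative assumptions.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def bias_and_mse(estimator, lam, terms=200):
    """Bias and MSE of estimator(X) for the target exp(-2 * lam).

    The infinite sums are truncated after `terms` terms, which is assumed
    to be ample for moderate values of lam.
    """
    target = math.exp(-2 * lam)
    mean = sum(estimator(k) * poisson_pmf(k, lam) for k in range(terms))
    mse = sum((estimator(k) - target) ** 2 * poisson_pmf(k, lam)
              for k in range(terms))
    return mean - target, mse


lam = 2.0  # illustrative choice of the true parameter
bias_unbiased, mse_unbiased = bias_and_mse(lambda k: (-1) ** k, lam)
bias_mle, mse_mle = bias_and_mse(lambda k: math.exp(-2 * k), lam)

print("unbiased estimator: bias", bias_unbiased, "MSE", mse_unbiased)
print("max-likelihood:     bias", bias_mle, "MSE", mse_mle)
```

For this λ the maximum-likelihood estimator has a visibly nonzero bias but a much smaller mean squared error than the unbiased estimator.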

The bias of maximum-likelihood estimators can be substantial. Consider a case where n tickets numbered from 1 through to n are placed in a box and one is selected at random, giving a value X. If n is unknown, then the maximum-likelihood estimator of n is X, even though the expectation of X is only (n+1)/2; we can only be certain that n is at least X and is probably more. In this case, the natural unbiased estimator is 2X − 1.
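The ticket example is small enough to verify by direct enumeration. In this sketch (the function name and the choice n = 10 are illustrative), each ticket value 1, ..., n is drawn with probability 1/n, and the exact expectations of the estimators X and 2X − 1 are computed.

```python
from fractions import Fraction


def expected_value(estimator, n):
    """Exact expectation of estimator(X) when X is uniform on 1..n."""
    return sum(Fraction(estimator(x), n) for x in range(1, n + 1))


n = 10  # illustrative number of tickets
e_mle = expected_value(lambda x: x, n)               # maximum-likelihood estimator X
e_unbiased = expected_value(lambda x: 2 * x - 1, n)  # estimator 2X - 1

print("E[X]      =", e_mle, "so bias =", e_mle - n)
print("E[2X - 1] =", e_unbiased, "so bias =", e_unbiased - n)
```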


See also

  • Omitted-variable bias
  • Cognitive bias

External link

  • An Illuminating Counterexample


 
 

The Wikipedia article included on this page is licensed under the GFDL.