Chebyshev's inequality

In probability theory, Chebyshev's inequality (also known as Tchebysheff's inequality, Chebyshev's theorem, or the Bienaymé-Chebyshev inequality), named after Pafnuty Chebyshev, states that in any data sample or probability distribution, nearly all the values are close to the mean value, and provides a quantitative description of "nearly all" and "close to". For example, no more than 1/4 of the values are more than 2 standard deviations away from the mean, no more than 1/9 are more than 3 standard deviations away, no more than 1/25 are more than 5 standard deviations away, and so on.


General statement

The inequality can be stated quite generally using measure theory; the statement in the language of probability theory then follows as a particular case, for a space of measure 1.


Measure-theoretic statement

Let (X,Σ,μ) be a measure space, and let f be an extended real-valued measurable function defined on X. Then for any real number t > 0,

\mu(\{x\in X \,:\, |f(x)|\geq t\}) \leq {1\over t^2} \int_X f^2 \, d\mu.

More generally, if g is a nonnegative extended real-valued measurable function, nondecreasing on the range of f, then

\mu(\{x\in X \,:\, f(x)\geq t\}) \leq {1\over g(t)} \int_X g\circ f \, d\mu.

The previous statement then follows by defining g(t) as

g(t)=\begin{cases}t^2&\mbox{if }t\geq0\\0&\mbox{otherwise,}\end{cases}

and taking |f| instead of f.
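The substitution can be spelled out as a short worked step (added here for illustration, not part of the original text). Since t > 0 we have g(t) = t^2, and since |f| is nonnegative, g(|f(x)|) = f(x)^2 for every x, so the general inequality applied to |f| reads:

```latex
\mu(\{x\in X \,:\, |f(x)|\geq t\})
  \leq \frac{1}{g(t)} \int_X g\circ |f| \, d\mu
  = \frac{1}{t^2} \int_X f^2 \, d\mu .
```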


Probabilistic statement

Let X be a random variable with expected value μ and finite variance σ². Then for any real number k > 0,

\Pr(\left|X-\mu\right|\geq k\sigma)\leq\frac{1}{k^2}.

Only the cases k > 1 provide useful information: for k ≤ 1 the right-hand side is at least 1, so the inequality holds trivially.
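As a minimal sketch of how the bound behaves on data, the following Python snippet compares the fraction of sample values lying at least k standard deviations from the mean with 1/k². The helper names `chebyshev_bound` and `tail_fraction` and the exponential sample are assumptions made for illustration only. The sample's own mean and population standard deviation are used, so the inequality applies to the sample's empirical distribution.

```python
import random
import statistics


def chebyshev_bound(k):
    """Upper bound 1/k^2 on Pr(|X - mu| >= k*sigma), capped at the trivial value 1."""
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1.0 / k**2)


def tail_fraction(values, k):
    """Fraction of values at least k population standard deviations from the mean."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return sum(abs(x - mu) >= k * sigma for x in values) / len(values)


# Illustrative data: a skewed sample (an assumption made for this sketch).
rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(10_000)]

for k in (1.5, 2, 3, 5):
    print(k, tail_fraction(sample, k), "<=", chebyshev_bound(k))
```

For k ≤ 1 the capped bound is 1, which reflects the fact that the inequality says nothing in that range.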


As an example, using k=√2 shows that at least half of the values lie in the interval (μ − √2 σ, μ + √2 σ).


Typically, the theorem provides rather loose bounds. However, the bounds provided by Chebyshev's inequality cannot be improved in general, that is, while remaining valid for variables of arbitrary distribution. For any k > 1, the three-point distribution described in the section "Distribution for which equality holds" (where σ = 1/k) meets the bound exactly.

The theorem can be useful despite loose bounds because it applies to random variables of any distribution, and because these bounds can be calculated knowing no more about the distribution than the mean and variance.
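To give a feel for how loose the bound can be, the sketch below compares 1/k² with the exact two-sided tail of a standard normal distribution. The choice of the normal distribution and the function names are assumptions made for illustration; the tail is computed from the identity Pr(|Z| ≥ k) = erfc(k/√2).

```python
import math


def chebyshev_bound(k):
    """Distribution-free bound on Pr(|X - mu| >= k*sigma)."""
    return 1.0 / k**2


def normal_two_sided_tail(k):
    """Exact Pr(|Z| >= k) for a standard normal Z, via the complementary error function."""
    return math.erfc(k / math.sqrt(2))


for k in (2, 3, 5):
    print(f"k={k}: Chebyshev bound {chebyshev_bound(k):.4f}, "
          f"normal tail {normal_two_sided_tail(k):.2e}")
```

The gap widens quickly with k, which is the price paid for a bound that needs only the mean and variance.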


Chebyshev's inequality is used for proving the weak law of large numbers.
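A sketch of that argument, under the added assumption that X_1, ..., X_n are independent with common mean μ and finite variance σ² (the symbols n, ε and \bar{X}_n are introduced here for illustration): the sample mean \bar{X}_n has mean μ and variance σ²/n, so applying the inequality to it with deviation ε > 0 gives

```latex
\Pr\left(\left|\bar{X}_n-\mu\right|\geq\varepsilon\right)
  \leq \frac{\sigma^2}{n\varepsilon^2}
  \longrightarrow 0 \quad \text{as } n\to\infty .
```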


Example application

For illustration, assume we have a large body of text, for example articles from a publication. Assume we know that the articles are on average 1000 characters long with a standard deviation of 200 characters. From Chebyshev's inequality we can then deduce that at least 75% of the articles have a length between 600 and 1400 characters (k = 2).
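The arithmetic behind this example can be reproduced with a small helper. The function name `chebyshev_interval` is hypothetical; it simply evaluates μ ± kσ and 1 − 1/k².

```python
def chebyshev_interval(mean, std, k):
    """Return (low, high, min_fraction).

    At least min_fraction of the values lie strictly inside (low, high).
    """
    if k <= 1:
        raise ValueError("k must exceed 1 for a non-trivial bound")
    return mean - k * std, mean + k * std, 1 - 1 / k**2


low, high, min_fraction = chebyshev_interval(mean=1000, std=200, k=2)
print(low, high, min_fraction)  # 600 1400 0.75
```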


Variants

A one-tailed variant, valid for k > 0, is

\Pr(X-\mu \geq k\sigma)\leq\frac{1}{1+k^2}.
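A short comparison of the one-tailed and two-tailed bounds, as a sketch with illustrative function names. Unlike 1/k², the one-tailed bound stays below 1 for every k > 0, so it remains informative even when k ≤ 1.

```python
def two_sided_bound(k):
    """Bound on Pr(|X - mu| >= k*sigma)."""
    return 1 / k**2


def one_sided_bound(k):
    """Bound on Pr(X - mu >= k*sigma)."""
    return 1 / (1 + k**2)


for k in (0.5, 1, 2, 3):
    print(k, one_sided_bound(k), two_sided_bound(k))
```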

A stronger result applicable to unimodal probability distributions is the Vysochanskiï-Petunin inequality.


Distribution for which equality holds

For the discrete distribution with point masses at −1 and +1, each with weight 1/(2k²), and a point mass at 0 with weight 1 − 1/k², equality holds exactly. The mean of this distribution is μ = 0 and its standard deviation is σ = 1/k, so kσ = 1 and

Pr(|X − μ| ≥ kσ) = 1/k².
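The claim can be checked with exact rational arithmetic. The sketch below, with hypothetical helper names, builds the three-point distribution for a given k ≥ 1 and returns its mean, its variance, and the probability of lying at least k standard deviations from the mean.

```python
from fractions import Fraction


def extremal_distribution(k):
    """Point masses at -1, 0, +1 with weights 1/(2k^2), 1 - 1/k^2, 1/(2k^2)."""
    k = Fraction(k)
    if k < 1:
        raise ValueError("k must be at least 1 for the weights to be valid")
    w = 1 / (2 * k**2)
    return {-1: w, 0: 1 - 2 * w, 1: w}


def check_equality(k):
    """Return (mean, variance, Pr(|X - mean| >= k*sigma)) as exact fractions."""
    k = Fraction(k)
    dist = extremal_distribution(k)
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    # Compare squares to avoid square roots:
    # |x - mean| >= k*sigma  <=>  (x - mean)^2 >= k^2 * var.
    tail = sum(p for x, p in dist.items() if (x - mean) ** 2 >= k**2 * var)
    return mean, var, tail


print(check_equality(2))
```

For k = 2 the variance and the tail probability both come out as 1/4, that is, 1/k², so the bound is attained.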

Proof

Measure-theoretic proof

Let A_t be defined as A_t = {x ∈ X | f(x) ≥ t}, and let

1_{A_t}

be the indicator function of the set A_t. Then, it is easy to check that

0\leq g(t)1_{A_t}\leq g\circ f\,1_{A_t}\leq g\circ f,

and therefore,

g(t)\mu(A_t)=\int_X g(t)1_{A_t}\,d\mu\leq\int_{A_t} g\circ f\,d\mu\leq\int_X g\circ f\,d\mu.

The desired inequality follows from dividing the above inequality by g(t).


Probabilistic proof

Markov's inequality states that for any real-valued random variable Y and any positive number a, we have Pr(|Y| ≥ a) ≤ E(|Y|)/a. One way to prove Chebyshev's inequality is to apply Markov's inequality to the random variable Y = (X − μ)² with a = (σk)².
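Written out as a worked step (added for illustration, assuming σ > 0 and using Markov's inequality in the form with "≥", which also holds), the application reads:

```latex
\Pr(|X-\mu|\geq k\sigma)
  = \Pr\left((X-\mu)^2\geq k^2\sigma^2\right)
  \leq \frac{\operatorname{E}\left((X-\mu)^2\right)}{k^2\sigma^2}
  = \frac{\sigma^2}{k^2\sigma^2}
  = \frac{1}{k^2}.
```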


It can also be proved directly. For any event A, let I_A be the indicator random variable of A, i.e. I_A equals 1 if A occurs and 0 otherwise. Then

\Pr(|X-\mu| \geq k\sigma) = \operatorname{E}(I_{|X-\mu| \geq k\sigma}) = \operatorname{E}(I_{[(X-\mu)/(k\sigma)]^2 \geq 1})
\leq \operatorname{E}\left( \left( {X-\mu \over k\sigma} \right)^2 \right) = {1 \over k^2} {\operatorname{E}((X-\mu)^2) \over \sigma^2} = {1 \over k^2}.

The direct proof shows why the bounds are quite loose in typical cases: the indicator, which equals 1 on the event [(X − μ)/(kσ)]² ≥ 1, is replaced on the right of "≤" by [(X − μ)/(kσ)]² itself. In some cases the latter exceeds 1 by a very wide margin.


The Wikipedia article included on this page is licensed under the GFDL.