Empirical Bayes

In statistics, empirical Bayes methods involve:

  • An "underlying" probability distribution of some unobservable quantity assigned to each member of a statistical population. This quantity is a random variable if a member of the population is chosen at random. The probability distribution of this random variable is not known, and is thought of as a property of the population.
  • An observable quantity assigned to each member of the population. When a random sample is taken from the population, it is desired first to estimate the "underlying" probability distribution, and then to estimate the value of the unobservable quantity assigned to each member of the sample.

These definitions are easier to follow with concrete examples.


Examples

The original example, introduced by Herbert Robbins in 1956

Each customer of an insurance company has an "accident rate" Θ and is insured against "accidents"; the probability distribution of Θ is the "underlying" distribution, and is unknown. The number of "accidents" suffered by each customer in a specified baseline time period has a Poisson distribution whose expected value is the particular customer's "accident rate". That number of "accidents" is the observable quantity. A crude way to estimate the underlying probability distribution of the "accident rate" Θ is to estimate the proportion of members of the whole population suffering 0, 1, 2, 3, ... accidents during the specified time period to be equal to the corresponding proportion in the observed random sample. Having done so, it is then desired to predict the "accident rate" of each customer in the sample. One may use the conditional expected value of the "accident rate" Θ given the observed number X of "accidents" during the baseline period. Given the assumed Poisson distribution of accidents, one can show that

$$E(\Theta \mid X = x) = \frac{(x+1)\,P(X = x+1)}{P(X = x)}. \qquad (*)$$

The quantities P(X = x + 1) and P(X = x) must be estimated based on the sample. That is why the word empirical appears in the name of this concept. The conditional expected value of Θ given the observed value X = x is found by using Bayes' theorem. That is why the word Bayes appears.


Thus, if a customer suffers six "accidents" during the baseline period, that customer's estimated "accident rate" is 7 × [the proportion of the sample who suffered 7 "accidents"] / [the proportion of the sample who suffered 6 "accidents"].
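The calculation just described can be sketched in a few lines of Python. This is only an illustration: the function name robbins_estimate and the sample of accident counts are hypothetical, and the proportions are replaced by raw counts, which gives the same ratio because both proportions share the same sample size.

```python
from collections import Counter


def robbins_estimate(counts, x):
    """Estimate E[Theta | X = x] as (x + 1) * N(x + 1) / N(x).

    counts is the list of observed accident counts, one per customer.
    N(k) is the number of customers in the sample with exactly k accidents.
    """
    freq = Counter(counts)
    if freq[x] == 0:
        # The ratio is undefined when nobody in the sample had x accidents.
        raise ValueError("no customer in the sample had %d accidents" % x)
    return (x + 1) * freq[x + 1] / freq[x]


# Hypothetical sample of 100 customers (made-up data for illustration).
sample = [0] * 50 + [1] * 30 + [2] * 12 + [3] * 5 + [6] * 2 + [7] * 1

# Two customers had 6 accidents and one had 7, so the estimate is 7 * 1 / 2.
print(robbins_estimate(sample, 6))  # 3.5
```

Because the estimate depends on only two sample proportions, it can be erratic for values of x that are rare in the sample.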


Proof of the identity labeled with the asterisk above:


Denote the probability density function of the underlying "accident rate" Θ by $f_\Theta(\theta)$ (as is often done in probability theory, we use capital letters for random variables and corresponding lower-case letters for the dummy variables in the density or mass functions). Denote the conditional probability mass function of the number of accidents suffered by a randomly chosen customer during the baseline period, given that that customer's "accident rate" is θ, by $f_{X \mid \Theta = \theta}(x)$. This conditional distribution was assumed to be a Poisson distribution; therefore we have

$$f_{X \mid \Theta = \theta}(x) = \frac{\theta^x e^{-\theta}}{x!}, \qquad x = 0, 1, 2, \ldots$$

By Bayes' theorem, the conditional probability density function of Θ given the event X = x is

$$f_{\Theta \mid X = x}(\theta) = \frac{f_\Theta(\theta)\,\theta^x e^{-\theta}/x!}{\int_0^\infty f_\Theta(u)\,u^x e^{-u}/x!\,du}.$$

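The normalizing constant by which we divide is the integral with respect to θ from 0 to ∞ of the function in the numerator. Consequently, the conditional expected value of Θ given the observed number x of accidents is

$$E(\Theta \mid X = x) = \frac{\int_0^\infty \theta\,f_\Theta(\theta)\,\theta^x e^{-\theta}/x!\,d\theta}{\int_0^\infty f_\Theta(\theta)\,\theta^x e^{-\theta}/x!\,d\theta} = \frac{\int_0^\infty f_\Theta(\theta)\,\theta^{x+1} e^{-\theta}\,d\theta}{\int_0^\infty f_\Theta(\theta)\,\theta^{x} e^{-\theta}\,d\theta}.$$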

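By the law of total probability, $P(X = x) = \int_0^\infty f_\Theta(\theta)\,\theta^x e^{-\theta}/x!\,d\theta$, so (after the routine cancellation of factorials) this is equal to

$$\frac{(x+1)!\,P(X = x+1)}{x!\,P(X = x)} = \frac{(x+1)\,P(X = x+1)}{P(X = x)}.$$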
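The identity can be checked numerically in a case where the conditional expected value is known in closed form. The sketch below assumes, purely for illustration, that Θ has a gamma distribution with shape a and rate b; the article itself makes no such assumption. Under that assumption the marginal distribution of X is negative binomial and the conditional expected value of Θ given X = x is (a + x)/(b + 1). The function names and the parameter values are hypothetical.

```python
import math


def marginal_pmf(x, shape, rate):
    """P(X = x) when Theta ~ Gamma(shape, rate) and X | Theta ~ Poisson(Theta).

    This is the negative binomial mass function, computed on the log scale.
    """
    log_p = (
        math.lgamma(x + shape) - math.lgamma(shape) - math.lgamma(x + 1)
        + shape * math.log(rate / (rate + 1.0))
        - x * math.log(rate + 1.0)
    )
    return math.exp(log_p)


def robbins_ratio(x, shape, rate):
    """Right-hand side of the identity: (x + 1) P(X = x + 1) / P(X = x)."""
    return (x + 1) * marginal_pmf(x + 1, shape, rate) / marginal_pmf(x, shape, rate)


def posterior_mean(x, shape, rate):
    """Closed-form E[Theta | X = x] under the assumed gamma distribution."""
    return (shape + x) / (rate + 1.0)


# Assumed (hypothetical) parameters of the underlying distribution.
shape, rate = 2.0, 4.0
for x in range(5):
    # The two columns should agree up to floating-point rounding.
    print(x, robbins_ratio(x, shape, rate), posterior_mean(x, shape, rate))
```

In the empirical Bayes setting the underlying distribution is unknown, so the exact probabilities used here would be replaced by sample proportions.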

An example involving the normal distribution

Suppose the weights of a large population of 35-year-old men are normally distributed with expected value μ and standard deviation σ. A crude measuring instrument measures a man's weight with a measurement error that is normally distributed with expected value 0 and standard deviation τ. The man's true weight is not observable; his weight measured with error is observed. The conditional probability distribution of a randomly chosen man's true weight, given his weight-measured-with-error, can be found by using Bayes' theorem, and then the conditional expected value can be used as an estimate of his true weight, provided that the values of μ, σ, and τ are known. But they are not. One may use the data to estimate the standard deviation of the measurement errors by measuring each man multiple times. One may similarly estimate the population average weight and the population standard deviation of weights by weighing multiple men. These estimates of parameters based on the data are the occasion for the use of the word empirical. Finally, one may then estimate the aforementioned conditional expected true weight by using Bayes' theorem.
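A minimal sketch of this procedure follows. It relies on the standard result that, when the true weight is normal with mean μ and variance σ² and the independent measurement error is normal with mean 0 and variance τ², the conditional expected true weight given a single measurement y is μ + σ²/(σ² + τ²) · (y − μ). The function names, the moment-based estimators, the assumption that every man is weighed the same number of times, and the data are all illustrative choices rather than part of the original example.

```python
import statistics


def estimate_parameters(measurements):
    """Estimate mu, sigma^2 and tau^2 from repeated weighings.

    measurements is a list with one inner list per man; every inner list
    holds k >= 2 weighings of that man (equal k is assumed for simplicity).
    """
    k = len(measurements[0])
    man_means = [statistics.fmean(m) for m in measurements]
    mu_hat = statistics.fmean(man_means)
    # Within-man spread reflects measurement error only.
    tau2_hat = statistics.fmean([statistics.variance(m) for m in measurements])
    # The variance of a man's mean is sigma^2 + tau^2 / k, so subtract the
    # error part; floor at zero because the difference can come out negative.
    sigma2_hat = max(statistics.variance(man_means) - tau2_hat / k, 0.0)
    return mu_hat, sigma2_hat, tau2_hat


def shrunken_weight(y, mu, sigma2, tau2):
    """Conditional expected true weight given one measurement y."""
    if sigma2 + tau2 == 0.0:
        return mu
    return mu + sigma2 / (sigma2 + tau2) * (y - mu)


# Hypothetical data: four men, three weighings each, in kilograms.
data = [
    [70.0, 72.0, 71.0],
    [80.0, 82.0, 81.0],
    [90.0, 88.0, 89.0],
    [60.0, 62.0, 61.0],
]
mu_hat, sigma2_hat, tau2_hat = estimate_parameters(data)

# A new single measurement is pulled toward the estimated population mean.
print(shrunken_weight(95.0, mu_hat, sigma2_hat, tau2_hat))
```

The estimate moves further toward the population mean as the measurement error variance grows relative to the population variance.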


References

  • Herbert Robbins, An Empirical Bayes Approach to Statistics, Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 157-163, University of California Press, Berkeley, 1956.

External links

  • Use of Empirical Bayes Method in estimating road safety (North America)


 

The Wikipedia article included on this page is licensed under the GFDL.