Ancillary statistic

In statistics, an ancillary statistic is a statistic whose probability distribution is the same under every probability distribution being considered as a candidate for the statistical population from which the data were taken. In other words, its distribution does not depend on which member of the family actually generated the data. This concept was introduced by the statistician and geneticist Sir Ronald Fisher.
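
One way to write this definition formally is sketched below. The notation is an assumption introduced here for illustration and does not come from the article: a family {P_θ : θ ∈ Θ} of candidate distributions, data X, and a statistic T(X).

```latex
T(X) \text{ is ancillary}
\iff
P_{\theta}\bigl(T(X) \in A\bigr) = P_{\theta'}\bigl(T(X) \in A\bigr)
\quad \text{for all } \theta, \theta' \in \Theta \text{ and all measurable sets } A.
```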


Examples

  • Suppose X1, ..., Xn are independent and identically distributed, and are normally distributed with expected value μ and variance 1. (This parametrized family, in which every distribution has the same variance, is unrealistic as an example: it amounts to a situation in which the statistician somehow knows the exact value of the population variance but can only estimate the population mean from the observed data X1, ..., Xn.) Let
 \overline{X}_n = (X_1 + \cdots + X_n)/n
be the sample mean. The random variable
 \overline{X}_n - \mu
is not an ancillary statistic, even though its probability distribution does not depend on μ. That is because it is not a statistic: its value depends on the unobservable population mean μ.
The random variable
 \max\{X_1, \dots, X_n\} - \min\{X_1, \dots, X_n\}
is an ancillary statistic, because
      • its probability distribution does not change as μ changes (each Xi can be written as μ + Zi with Zi standard normal, so μ cancels in the difference), and
      • it depends only on the data X1, ..., Xn and not on the unobservable parameter μ, i.e., it is a statistic.
  • In baseball, suppose a scout observes a batter in N at-bats. Suppose (unrealistically) that the number N is chosen by some random process that is independent of the batter's ability: say, a coin is tossed after each at-bat and the result determines whether the scout will stay to watch the batter's next at-bat. The eventual data are the number N of at-bats and the number X of hits. The observed batting average X/N fails to convey all of the information available in the data because it does not report the number N of at-bats. For example, a batting average of 0.400, which is very high, inspires far less confidence in the player's ability when it is based on only five at-bats than when it is based on 100 at-bats. The number N of at-bats is an ancillary statistic because
      • it is a part of the observable data (it is a statistic), and
      • its probability distribution does not depend on the batter's ability, since it was chosen by a random process independent of the batter's ability.
This ancillary statistic is an ancillary complement to the observed batting average X/N, i.e., the batting average X/N is not a sufficient statistic, in that it conveys less than all of the relevant information in the data, but conjoined with N, it becomes sufficient.
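
The two examples above can be checked numerically with a small simulation. The following is a minimal sketch rather than anything taken from the article: the function names, the sample size n = 10, the number of trials, the fair coin for the scout's stopping rule, and the use of the usual approximate (Wald) standard error for a proportion are all assumptions made for illustration.

```python
import math
import random


def normal_sample_summaries(mu, n=10, trials=20000, seed=0):
    """Simulate `trials` samples of size n from N(mu, 1).

    Returns (average of the sample means, average of the sample ranges).
    """
    rng = random.Random(seed)
    mean_total = 0.0
    range_total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(mu, 1.0) for _ in range(n)]
        mean_total += sum(xs) / n
        range_total += max(xs) - min(xs)
    return mean_total / trials, range_total / trials


def average_at_bats(ability, trials=20000, seed=0):
    """Simulate the scout's coin-toss stopping rule and return the average N.

    `ability` is a hypothetical per-at-bat hit probability. Hits are drawn
    for completeness, but the stopping rule never looks at them.
    """
    rng = random.Random(seed)
    total_at_bats = 0
    for _ in range(trials):
        n_at_bats = 0
        while True:
            n_at_bats += 1
            _hit = rng.random() < ability  # outcome of this at-bat (unused)
            if rng.random() < 0.5:         # assumed fair coin: scout leaves
                break
        total_at_bats += n_at_bats
    return total_at_bats / trials


def batting_average_std_error(hits, at_bats):
    """Approximate (Wald) standard error of the batting average hits/at_bats."""
    p_hat = hits / at_bats
    return math.sqrt(p_hat * (1.0 - p_hat) / at_bats)


if __name__ == "__main__":
    # Normal example: the sample mean tracks mu, the range does not.
    for mu, seed in [(0.0, 1), (5.0, 2)]:
        avg_mean, avg_range = normal_sample_summaries(mu, seed=seed)
        print(f"mu={mu}: average mean={avg_mean:.3f}, average range={avg_range:.3f}")

    # Baseball example: the distribution of N ignores the batter's ability.
    for ability, seed in [(0.25, 3), (0.40, 4)]:
        print(f"ability={ability}: average N={average_at_bats(ability, seed=seed):.3f}")

    # The same 0.400 average is far less precise with 5 at-bats than with 100.
    print(batting_average_std_error(2, 5), batting_average_std_error(40, 100))
```

Under these assumptions, the average range should come out nearly the same for both values of μ while the average sample mean follows μ, and the average N should be nearly the same for both batters. The last line illustrates why X/N needs N alongside it: the same 0.400 average carries a much larger standard error at 5 at-bats than at 100.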



The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.