Student's t distribution

Student's t
[Figure: probability density function]
[Figure: cumulative distribution function]
Parameters: \nu > 0 degrees of freedom (real)
Support: x \in (-\infty, +\infty)
Probability density function (pdf): \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} (1+x^2/\nu)^{-(\nu+1)/2}
Cumulative distribution function (cdf): \frac{1}{2} + \frac{x\,\Gamma\left((\nu+1)/2\right)\,{}_2F_1\left(\frac{1}{2},(\nu+1)/2;\frac{3}{2};-\frac{x^2}{\nu}\right)}{\sqrt{\pi\nu}\,\Gamma(\nu/2)}, where {}_2F_1 is the hypergeometric function
Mean: 0 for ν > 1, otherwise undefined
Median: 0
Mode: 0
Variance: \frac{\nu}{\nu-2} for ν > 2, otherwise undefined
Skewness: 0 for ν > 3
Excess kurtosis: \frac{6}{\nu-4} for ν > 4
Entropy: \frac{\nu+1}{2}\left[\psi\left(\frac{1+\nu}{2}\right) - \psi\left(\frac{\nu}{2}\right)\right] + \log\left[\sqrt{\nu}\,B\left(\frac{\nu}{2},\frac{1}{2}\right)\right]
Moment-generating function (mgf): not defined
Characteristic function

In probability and statistics, the t-distribution or Student's t-distribution is a probability distribution that arises in the problem of estimating the mean of a normally distributed population when the sample size is small. It is the basis of the popular Student's t-tests for the statistical significance of the difference between two sample means, and for confidence intervals for the difference between two population means. The Student's t-distribution is a special case of the generalised hyperbolic distribution.


The derivation of the t-distribution was first published in 1908 by William Sealy Gosset, while he worked at a Guinness Brewery in Dublin. He was not allowed to publish under his own name, so the paper was written under the pseudonym Student. The t-test and the associated theory became well-known through the work of R.A. Fisher, who called the distribution "Student's distribution".


Student's distribution arises when (as in nearly all practical statistical work) the population standard deviation is unknown and has to be estimated from the data. Textbook problems treating the standard deviation as if it were known are of two kinds: (1) those in which the sample size is so large that one may treat a data-based estimate of the variance as if it were certain, and (2) those that illustrate mathematical reasoning, in which the problem of estimating the standard deviation is temporarily ignored because that is not the point that the author or instructor is then explaining.


Why Student's t-distribution

Confidence intervals and hypothesis tests rely on Student's t-distribution to cope with uncertainty resulting from estimating the standard deviation from a sample, whereas if the population standard deviation were known, a normal distribution would be used.


How Student's t-distribution comes about

Suppose X_1, ..., X_n are independent random variables that are normally distributed with expected value μ and variance σ^2. Let

\overline{X}_n = (X_1+\cdots+X_n)/n

be the sample mean, and

S_n^2=\frac{1}{n-1}\sum_{i=1}^n\left(X_i-\overline{X}_n\right)^2

be the sample variance. It is readily shown that the quantity

Z=\frac{\overline{X}_n-\mu}{\sigma/\sqrt{n}}

is normally distributed with mean 0 and variance 1, since the sample mean \overline{X}_n is normally distributed with mean μ and standard deviation \sigma/\sqrt{n}.


Gosset studied a related quantity,

T=\frac{\overline{X}_n-\mu}{S_n / \sqrt{n}}.

T resembles Z, except that the unknown standard deviation σ is replaced by its estimate S_n. The scaled sample variance (n-1)S_n^2/\sigma^2 has a \chi^2_{n-1} distribution, which introduces uncertainty into the denominator. Gosset's work showed that T has the probability density function

f(t) = \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} (1+t^2/\nu)^{-(\nu+1)/2}

with ν equal to n − 1.
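The density above can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function names `t_pdf` and `simpson`, the integration range and the step count are illustrative choices, not part of the original derivation. It checks two things: that ν = 1 gives the Cauchy peak value 1/π, and that the density integrates to approximately 1.

```python
import math

def t_pdf(t: float, nu: float) -> float:
    """Density of Student's t-distribution with nu degrees of freedom."""
    # Use log-gamma so that large nu does not overflow math.gamma.
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(t * t / nu))

def simpson(f, a: float, b: float, steps: int = 2000) -> float:
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# For nu = 1 the density at 0 should equal 1/pi.
peak_cauchy = t_pdf(0.0, 1)
# The density should integrate to (nearly) 1; for nu = 5 the mass
# outside [-200, 200] is negligible.
area = simpson(lambda t: t_pdf(t, 5), -200.0, 200.0, steps=40000)
print(peak_cauchy, area)
```

Working on the log scale is a design choice: Γ((ν+1)/2) overflows a floating-point number for ν in the low hundreds, while the log-gamma difference stays small.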


The distribution of T is now called the t-distribution. The parameter ν is called the number of degrees of freedom. The distribution depends on ν, but not μ or σ; the lack of dependence on μ and σ is what makes the t-distribution important in both theory and practice. Γ is the Gamma function.


The moments of the t-distribution are

E(T^k)=\begin{cases}
0 & \mbox{k odd},\quad 0<k<\nu \\
\frac{\Gamma(\frac{k+1}{2})\Gamma(\frac{\nu-k}{2})\nu^{k/2}}{\sqrt{\pi}\,\Gamma(\frac{\nu}{2})} & \mbox{k even},\quad 0<k<\nu \\
\mbox{NaN} & \mbox{k odd},\quad 0<\nu\leq k \\
\infty & \mbox{k even},\quad 0<\nu\leq k
\end{cases}

It should be noted that the term for 0 < k < ν, k even, may be simplified using the properties of the Gamma function to

E(T^k)= \nu^{k/2} \prod_{i=1}^{k/2} \frac{2i-1}{\nu - 2i} \qquad k\mbox{ even},\quad 0<k<\nu.
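As a numerical check that the Gamma-function form and the product form of the even moments agree, here is a short sketch in Python. The function names are illustrative, and ν = 10 is an arbitrary choice. It also recovers the variance ν/(ν − 2) and the excess kurtosis 6/(ν − 4) from the second and fourth moments.

```python
import math

def even_moment_gamma(k: int, nu: float) -> float:
    """E(T^k) for even k with 0 < k < nu, from the Gamma-function formula."""
    return (math.gamma((k + 1) / 2) * math.gamma((nu - k) / 2) * nu ** (k / 2)
            / (math.sqrt(math.pi) * math.gamma(nu / 2)))

def even_moment_product(k: int, nu: float) -> float:
    """The same moment from the simplified product form."""
    result = nu ** (k / 2)
    for i in range(1, k // 2 + 1):
        result *= (2 * i - 1) / (nu - 2 * i)
    return result

nu = 10
second = even_moment_product(2, nu)          # the variance, nu / (nu - 2)
fourth = even_moment_product(4, nu)
excess_kurtosis = fourth / second ** 2 - 3   # should equal 6 / (nu - 4)
print(second, fourth, excess_kurtosis)
```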

Confidence intervals derived from Student's t-distribution

Suppose the number A is so chosen that

\Pr(-A < T < A)=0.9,

when T has a t-distribution with n − 1 degrees of freedom. This is the same as

\Pr(T < A) = 0.95,

so A is the "95th percentile" of this probability distribution, or A = t_{0.05,n-1}. Then

\Pr\left(-A < \frac{\overline{X}_n - \mu}{S_n/\sqrt{n}} < A\right)=0.9,

and this is equivalent to

\Pr\left(\overline{X}_n - A\frac{S_n}{\sqrt{n}} < \mu < \overline{X}_n + A\frac{S_n}{\sqrt{n}}\right) = 0.9.

Therefore the interval whose endpoints are

\overline{X}_n\pm A\frac{S_n}{\sqrt{n}}

is a 90-percent confidence interval for μ. Therefore, if we find the mean of a set of observations that we can reasonably expect to have a normal distribution, we can use the t-distribution to examine whether the confidence limits on that mean include some theoretically predicted value - such as the value predicted on a null hypothesis.
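The interval can be computed from raw data once the percentile A is known. The sketch below is an illustration under stated assumptions: the ten measurements are hypothetical, the helper name `t_confidence_interval` is invented here, and A = 1.833 is the tabulated 95th percentile for ν = 9 degrees of freedom.

```python
import math
import statistics

def t_confidence_interval(data, critical_value):
    """Two-sided interval  mean +/- A * S_n / sqrt(n)  for the population mean."""
    n = len(data)
    mean = statistics.fmean(data)
    s_n = statistics.stdev(data)          # uses the n - 1 divisor, like S_n
    half_width = critical_value * s_n / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical measurements (n = 10, so nu = 9 degrees of freedom).
sample = [9.8, 10.4, 10.1, 9.6, 10.3, 9.9, 10.2, 10.0, 9.7, 10.5]
A = 1.833   # 95th percentile of the t-distribution with 9 degrees of freedom
low, high = t_confidence_interval(sample, A)
print(f"90% confidence interval for the mean: ({low:.3f}, {high:.3f})")
```

A hypothesised mean that falls outside the printed interval would be rejected by the corresponding two-sided test at the 10% level.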


It is this result that is used in the Student's t-tests: since the difference between the means of samples from two normal distributions is itself distributed normally, the t-distribution can be used to examine whether that difference can reasonably be supposed to be zero.


If the data are normally distributed, the one-sided (1 − a) upper confidence limit (UCL) of the mean can be calculated using the following equation:

\mathrm{UCL}_{1-a} = \overline{X}_n+\frac{t_{a,n-1} S_n}{\sqrt{n}}.

The resulting UCL is the largest value of the population mean that remains plausible at the given confidence level and sample size. In other words, with \overline{X}_n being the mean of the set of observations, the probability that the mean of the distribution is below \mathrm{UCL}_{1-a} is equal to the confidence level 1 − a.


A number of other statistics can be shown to have t-distributions for samples of moderate size under null hypotheses that are of interest, so that the t-distribution forms the basis for significance tests in other situations as well as when examining the differences between means. For example, the distribution of Spearman's rank correlation coefficient, rho, in the null case (zero correlation) is well approximated by the t distribution for sample sizes above about 20.


See prediction interval for another example of the use of this distribution.


Student's distribution probability function and p-value

The function A(t | ν) is used when testing whether a t-statistic that measures the difference between two means is statistically significant. This is used in a variety of situations, particularly in t-tests. For the statistic t, with ν degrees of freedom, A(t | ν) is the probability that the absolute value of the statistic would be less than the observed value t if the two means were the same. It is given by the following formula:

A(t|\nu) = \frac{1}{\sqrt{\nu}\, B\left(\frac{1}{2}, \frac{\nu}{2}\right)} \int_{-t}^{t} \left(1+\frac{x^2}{\nu}\right)^{-\frac{\nu+1}{2}}\, dx

where B is the Beta function. It is related to the regularized incomplete beta function I_x(a,b) as follows:

A(t|\nu) = 1 - I_{\frac{\nu}{\nu +t^2}}\left(\frac{\nu}{2},\frac{1}{2}\right).

Because A(t | ν) covers both tails, 1 − A(t | ν) is the two-sided p-value, and the exact one-sided p-value is given by

p=\frac{1-A(t|\nu)}{2}.
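To make the integral for A(t | ν) concrete, here is a sketch that evaluates it by Simpson's rule using only the standard library. The function name `central_probability` and the step count are illustrative assumptions. The observed value 2.228 is the tabulated 97.5% point for ν = 10, so A should come out close to 0.95.

```python
import math

def central_probability(t: float, nu: float, steps: int = 2000) -> float:
    """A(t | nu): probability that |T| < t (t > 0), by Simpson integration."""
    log_beta = math.lgamma(0.5) + math.lgamma(nu / 2) - math.lgamma((nu + 1) / 2)
    norm = 1.0 / (math.sqrt(nu) * math.exp(log_beta))

    def integrand(x: float) -> float:
        return (1 + x * x / nu) ** (-(nu + 1) / 2)

    # The integrand is even, so integrate over [0, t] and double.
    h = t / steps
    total = integrand(0.0) + integrand(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 2 * norm * total * h / 3

t_obs, nu = 2.228, 10
A = central_probability(t_obs, nu)
p_one_sided = (1 - A) / 2     # the formula for p given above
p_two_sided = 1 - A           # probability of |T| exceeding the observed value
print(A, p_one_sided, p_two_sided)
```

The quantity (1 − A)/2 counts one tail only; 1 − A counts both tails.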

Further theory

Gosset's result can be stated more generally. (See, for example, Hogg and Craig, Sections 4.4 and 4.8.) Let Z have a normal distribution with mean 0 and variance 1. Let V have a chi-square distribution with ν degrees of freedom. Further suppose that Z and V are independent (see Cochran's theorem). Then the ratio

\frac{Z}{\sqrt{V/\nu}}

has a t-distribution with ν degrees of freedom.
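The ratio construction is easy to simulate. The sketch below is illustrative only: the seed, the sample count and the name `sample_t` are arbitrary choices. It draws Z and V independently, forms Z/√(V/ν), and compares the sample variance of the draws with the theoretical value ν/(ν − 2).

```python
import math
import random

def sample_t(nu: int, rng: random.Random) -> float:
    """Draw one value of Z / sqrt(V / nu), Z ~ N(0, 1), V ~ chi-square(nu)."""
    z = rng.gauss(0.0, 1.0)
    # A chi-square variable with nu degrees of freedom is a sum of nu squared
    # independent standard normals (so nu must be a positive integer here).
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))
    return z / math.sqrt(v / nu)

rng = random.Random(12345)          # fixed seed so the run is repeatable
nu = 10
draws = [sample_t(nu, rng) for _ in range(100_000)]
mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(f"sample variance {variance:.3f}, theory {nu / (nu - 2):.3f}")
```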


For a t-distribution with ν degrees of freedom, the expected value is 0 if ν > 1, and its variance is ν/(ν − 2) if ν > 2. The skewness is 0 if ν > 3 and the excess kurtosis is 6/(ν − 4) if ν > 4.


The cumulative distribution function is given by an incomplete beta function,

\int_{-\infty}^t f(u)\,du = I_x(\nu/2,\nu/2)

with

x = \frac{t+\sqrt{t^2+\nu}}{2\sqrt{t^2+\nu}}.
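As a quick consistency check (a worked special case added for illustration), take ν = 2. The regularized incomplete beta function with both parameters equal to 1 is the identity, so the formula reduces to a closed form that matches the ν = 2 distribution function listed under the special cases:

```latex
I_x(1,1) = \frac{\int_0^x du}{\int_0^1 du} = x
\quad\Longrightarrow\quad
\int_{-\infty}^t f(u)\,du
  = \frac{t+\sqrt{t^2+2}}{2\sqrt{t^2+2}}
  = \frac{1}{2}\left[1+\frac{t}{\sqrt{2+t^2}}\right].
```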

The t-distribution is related to the F-distribution as follows: the square of a value of t with ν degrees of freedom is distributed as F with 1 and ν degrees of freedom.


The overall shape of the probability density function of the t-distribution resembles the bell shape of a normally distributed variable with mean 0 and variance 1, except that it is a bit lower and wider. As the number of degrees of freedom grows, the t-distribution approaches the normal distribution with mean 0 and variance 1.
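The approach to the normal distribution can be seen numerically. The following sketch compares the two densities on a grid; the grid from −5 to 5 and the function names are illustrative choices. It prints the peak height of the t density and the largest gap to the standard normal density for several values of ν.

```python
import math

def t_pdf(x: float, nu: float) -> float:
    """Density of the t-distribution with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))

def normal_pdf(x: float) -> float:
    """Density of the normal distribution with mean 0 and variance 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

grid = [i / 100 for i in range(-500, 501)]   # x from -5 to 5
max_gap = {}
for nu in (1, 2, 3, 5, 10, 30):
    max_gap[nu] = max(abs(t_pdf(x, nu) - normal_pdf(x)) for x in grid)
    print(f"nu = {nu:2d}: peak {t_pdf(0.0, nu):.4f}, max gap {max_gap[nu]:.4f}")
```

The peak of the t density stays below the normal peak 1/√(2π) for every ν, which is the "a bit lower and wider" shape described above.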


The following images show the density of the t-distribution for increasing values of ν. The normal distribution is shown as a blue line for comparison. Note that the t-distribution (red line) becomes closer to the normal distribution as ν increases. For ν = 30 the t-distribution is almost the same as the normal distribution.

Density of the t-distribution (red and green) for 1, 2, 3, 5, 10, and 30 df compared to normal distribution (blue)


Table of selected values

The following table lists a few selected values for t-distributions with ν degrees of freedom for one-sided confidence levels from 75% to 99.95%. For an example of how to read this table, take the fourth row, which begins with 4; that means ν, the number of degrees of freedom, is 4 (and if we are dealing, as above, with n values with a fixed sum, n = 5). Take the fourth entry, in the column headed 90%. The value of that entry is "1.533". Then the probability that T is less than 1.533 is 90%, or Pr(−∞ < T < 1.533) = 0.9; the entry does not mean (as it might with other distributions) Pr(−1.533 < T < 1.533) = 0.9.


In fact, by the symmetry of the distribution,

Pr(T < −1.533) = Pr(T > 1.533) = 1 − 0.9 = 0.1,

and so

Pr(−1.533 < T < 1.533) = 1 − 2(0.1) = 0.8.

Note that the last row also gives critical points: a t-distribution with infinitely many degrees of freedom is a normal distribution. (See below: Related distributions).

ν 75% 80% 85% 90% 95% 97.5% 99% 99.5% 99.75% 99.9% 99.95%
1 1.000 1.376 1.963 3.078 6.314 12.71 31.82 63.66 127.3 318.3 636.6
2 0.816 1.061 1.386 1.886 2.920 4.303 6.965 9.925 14.09 22.33 31.60
3 0.765 0.978 1.250 1.638 2.353 3.182 4.541 5.841 7.453 10.21 12.92
4 0.741 0.941 1.190 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 0.727 0.920 1.156 1.476 2.015 2.571 3.365 4.032 4.773 5.893 6.869
6 0.718 0.906 1.134 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 0.711 0.896 1.119 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 0.706 0.889 1.108 1.397 1.860 2.306 2.896 3.355 3.833 4.501 5.041
9 0.703 0.883 1.100 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 0.700 0.879 1.093 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 0.697 0.876 1.088 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 0.695 0.873 1.083 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 0.694 0.870 1.079 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 0.692 0.868 1.076 1.345 1.761 2.145 2.624 2.977 3.326 3.787 4.140
15 0.691 0.866 1.074 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 0.690 0.865 1.071 1.337 1.746 2.120 2.583 2.921 3.252 3.686 4.015
17 0.689 0.863 1.069 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 0.688 0.862 1.067 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 0.688 0.861 1.066 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 0.687 0.860 1.064 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850
21 0.686 0.859 1.063 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 0.686 0.858 1.061 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 0.685 0.858 1.060 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.767
24 0.685 0.857 1.059 1.318 1.711 2.064 2.492 2.797 3.091 3.467 3.745
25 0.684 0.856 1.058 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 0.684 0.856 1.058 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 0.684 0.855 1.057 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.690
28 0.683 0.855 1.056 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 0.683 0.854 1.055 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.659
30 0.683 0.854 1.055 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646
40 0.681 0.851 1.050 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
50 0.679 0.849 1.047 1.299 1.676 2.009 2.403 2.678 2.937 3.261 3.496
60 0.679 0.848 1.045 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
80 0.678 0.846 1.043 1.292 1.664 1.990 2.374 2.639 2.887 3.195 3.416
100 0.677 0.845 1.042 1.290 1.660 1.984 2.364 2.626 2.871 3.174 3.390
120 0.677 0.845 1.041 1.289 1.658 1.980 2.358 2.617 2.860 3.160 3.373
∞ 0.674 0.842 1.036 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.291

The number at the beginning of each row in the table above is ν, which has been defined above as n − 1. The percentage along the top is 100%(1 − α). The numbers in the main body of the table are t_{α,ν}. If a quantity T is distributed as a Student's t distribution with ν degrees of freedom, then there is a probability 1 − α that T will be less than t_{α,ν}. (Calculated as for a one-tailed or one-sided test as opposed to a two-tailed test.)
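Table entries of this kind can be reproduced numerically. The sketch below is one possible approach, not the method used to build the table: it integrates the density with Simpson's rule to get Pr(T < t) and then solves for the percentile by bisection. The function names, step count and bracket [−100, 100] are illustrative assumptions; the bracket is wide enough for the entries checked here but not for the most extreme ν = 1 entries.

```python
import math

def t_pdf(x: float, nu: float) -> float:
    """Density of the t-distribution with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))

def t_cdf(t: float, nu: float, steps: int = 2000) -> float:
    """Pr(T < t) via Simpson's rule on [0, |t|] plus symmetry about 0."""
    a = abs(t)
    if a == 0.0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    half = total * h / 3
    return 0.5 + half if t > 0 else 0.5 - half

def t_quantile(p: float, nu: float, lo: float = -100.0, hi: float = 100.0) -> float:
    """Solve t_cdf(t, nu) = p by bisection on the bracket [lo, hi]."""
    for _ in range(80):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for nu, level in ((4, 0.90), (10, 0.90), (10, 0.975)):
    print(nu, level, round(t_quantile(level, nu), 3))
```

The three printed values correspond to the table rows for ν = 4 and ν = 10.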


For example, given a sample of size 11 (10 degrees of freedom) with sample mean 10 and sample variance 2, we can use the formula

\overline{X}_n\pm A\frac{S_n}{\sqrt{n}}

with A = 1.37218, the 90% entry for ν = 10. At 90% confidence, the true mean lies below

10+1.37218 \frac{\sqrt{2}}{\sqrt{11}}=10.58510.

(In other words, on average, 90% of the times that an upper threshold is calculated by this method, the true mean lies below this upper threshold.) And, still at 90% confidence, the true mean lies above

10-1.37218 \frac{\sqrt{2}}{\sqrt{11}}=9.41490.

(In other words, on average, 90% of the times that a lower threshold is calculated by this method, the true mean lies above this lower threshold.) So at 80% confidence, the true mean lies between

10\pm1.37218 \frac{\sqrt{2}}{\sqrt{11}} = [9.41490, 10.58510].

(In other words, on average, 80% of the times that upper and lower thresholds are calculated by this method, the true mean is both below the upper threshold and above the lower threshold. This is not the same thing as saying that there is an 80% probability that the true mean lies between a particular pair of upper and lower thresholds that have been calculated by this method -- see confidence interval and prosecutor's fallacy.)


For information on the inverse cumulative distribution function see Quantile function.


Special cases

Certain values of ν give an especially simple form.


ν = 1

Distribution function:

F(x) = \frac{1}{2} + \frac{1}{\pi}\arctan(x).

Density function:

f(x) = \frac{1}{\pi(1+x^2)}.

See Cauchy distribution.


ν = 2

Distribution function:

F(x) = \frac{1}{2}\left[1+\frac{x}{\sqrt{2+x^2}}\right].

Density function:

f(x) = \frac{1}{\left(2+x^2\right)^{3/2}}.
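Because both distribution functions are available in closed form, they can be inverted by hand to give percentiles. The sketch below (function names are illustrative) solves F(x) = p for ν = 1 and ν = 2 and prints values that can be compared with the first two rows of a t table.

```python
import math

def quantile_nu1(p: float) -> float:
    """Invert F(x) = 1/2 + arctan(x)/pi  (nu = 1, the Cauchy case)."""
    return math.tan(math.pi * (p - 0.5))

def quantile_nu2(p: float) -> float:
    """Invert F(x) = (1/2) [1 + x / sqrt(2 + x^2)]  (nu = 2)."""
    # Setting a = 2p - 1 gives x^2 = 2 a^2 / (1 - a^2) and 1 - a^2 = 4 p (1 - p).
    return (2 * p - 1) / math.sqrt(2 * p * (1 - p))

for p in (0.75, 0.90, 0.95, 0.975):
    print(f"{p:.3f}  nu=1: {quantile_nu1(p):.3f}  nu=2: {quantile_nu2(p):.3f}")
```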

Robust parametric modelling

The t-distribution is often used as an alternative to the normal distribution as a model for data. It is frequently the case that real data have heavier tails than the normal distribution allows for. The classical approach was to identify outliers and exclude or downweight them in some way. However, it is not always easy to identify outliers (especially in high dimensions), and the t-distribution is a natural choice of model for such data and provides a parametric approach to robust statistics.


Lange et al. explored the use of the t-distribution for robust modelling of heavy-tailed data in a variety of contexts. A Bayesian account can be found in Gelman et al. The degrees of freedom parameter controls the kurtosis of the distribution and is correlated with the scale parameter. The likelihood can have multiple local maxima and, as such, it is often necessary to fix the degrees of freedom at a fairly low value and estimate the other parameters taking this as given. Some authors report that values between 3 and 9 are often good choices. Venables and Ripley suggest that a value of 5 is often a good choice.


Related distributions

Normal distribution: the limit of the t-distribution as ν tends to infinity.
Cauchy distribution: the t-distribution with ν = 1.
F-distribution: the square of a t-distributed value with ν degrees of freedom is distributed as F with 1 and ν degrees of freedom.
Scale-inverse-chi-square distribution.

See also

Student's t-test
Gamma function
Hotelling's T-square statistic
Multivariate Student distribution
Confidence interval

