Squared deviations


Introduction

Variance is defined as the expected value (for a theoretical distribution) or the average (for actual experimental data) of squared deviations from the mean. Computations for analysis of variance involve partitioning a sum of squared deviations. These computations are much easier to follow after a detailed study of the statistical value:

\operatorname{E}(x^2)


It is well known that for a random variable x with mean μ and variance σ²:

\sigma^2 = \operatorname{E}(x^2) - \mu^2 [1]

Therefore:

\operatorname{E}(x^2) = \sigma^2 + \mu^2

From the above, the following are readily derived for n observations:

\operatorname{E}\left(\sum x^2\right) = n\sigma^2 + n\mu^2

\operatorname{E}\left(\left(\sum x\right)^2\right) = n\sigma^2 + n^2\mu^2
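
The two expectations follow in a couple of lines if the n observations are assumed to be independent and identically distributed (an assumption the text leaves implicit). A sketch of the steps, where \operatorname{Var} denotes the variance:

```latex
\begin{align*}
\operatorname{E}\Bigl(\sum x^2\Bigr)
  &= \sum \operatorname{E}(x^2) = n(\sigma^2 + \mu^2) = n\sigma^2 + n\mu^2 \\
\operatorname{E}\Bigl(\bigl(\sum x\bigr)^2\Bigr)
  &= \operatorname{Var}\Bigl(\sum x\Bigr) + \Bigl(\operatorname{E}\sum x\Bigr)^2
   = n\sigma^2 + (n\mu)^2 = n\sigma^2 + n^2\mu^2
\end{align*}
```

The second line applies the identity E(y²) = Var(y) + (E(y))² to the sum y = Σx; independence is what makes the variance of the sum equal to nσ².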



Sample Variance


The sum of squared deviations needed to calculate variance (before deciding whether to divide by n or n − 1) is most easily calculated as:

S = \sum x^2 - \frac{\left(\sum x\right)^2}{n}

From the two derived expectations above, the expected value of this sum is:

\operatorname{E}(S) = n\sigma^2 + n\mu^2 - \frac{n\sigma^2 + n^2\mu^2}{n}

Giving:

\operatorname{E}(S) = (n - 1)\sigma^2

This justifies the divisor (n − 1) in the calculation of an unbiased sample estimate of σ², since the expected value of S/(n − 1) is then exactly σ².
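
As a quick check of the shortcut formula, the following Python sketch computes S both ways: from the sums of x and x², and directly as the sum of squared deviations from the mean. The function names and the sample data are illustrative assumptions, not part of the original article; exact fractions are used so that the two results can be compared without rounding error.

```python
from fractions import Fraction


def sum_squared_deviations(xs):
    """Shortcut form: S = sum(x^2) - (sum x)^2 / n, in exact arithmetic."""
    n = len(xs)
    total = sum(Fraction(x) for x in xs)
    total_of_squares = sum(Fraction(x) ** 2 for x in xs)
    return total_of_squares - total ** 2 / n


def direct_sum_squared_deviations(xs):
    """Definition form: sum of (x - mean)^2, in exact arithmetic."""
    n = len(xs)
    mean = sum(Fraction(x) for x in xs) / n
    return sum((Fraction(x) - mean) ** 2 for x in xs)


data = [1, 2, 3, 4, 6]  # illustrative sample
s = sum_squared_deviations(data)
assert s == direct_sum_squared_deviations(data)

print(float(s))                    # 14.8
print(float(s / (len(data) - 1)))  # 3.7, the unbiased variance estimate
```

Dividing S by n − 1 rather than n gives the unbiased estimate discussed above.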



Partition - Analysis of Variance


Suppose data are available for k different treatment groups of size n_i, where i runs from 1 to k, giving n = n_1 + ... + n_k observations in total. It is then assumed that the expected mean of each group is:

\operatorname{E}(\mu_i) = \mu + T_i

and that the variance of each treatment group is unchanged from the population variance σ².
Under the null hypothesis that the treatments have no effect, each of the T_i will be zero.
It is now possible to calculate three sums of squares:


Individual

I = \sum x^2
\operatorname{E}(I) = n\sigma^2 + n\mu^2

Treatments

T = \sum_{i=1}^k \frac{\left(\sum x\right)^2}{n_i}
\operatorname{E}(T) = k\sigma^2 + \sum_{i=1}^k n_i(\mu + T_i)^2
\operatorname{E}(T) = k\sigma^2 + n\mu^2 + 2\mu \sum_{i=1}^k n_i T_i + \sum_{i=1}^k n_i T_i^2
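
The expectation of T can be built up group by group. As a sketch, assuming the n_i observations in group i are independent with mean μ + T_i and variance σ², the earlier result for the expected square of a sum gives the contribution of one group:

```latex
\begin{align*}
\operatorname{E}\Bigl(\frac{(\sum x)^2}{n_i}\Bigr)
  &= \frac{n_i\sigma^2 + n_i^2(\mu + T_i)^2}{n_i}
   = \sigma^2 + n_i(\mu + T_i)^2
\end{align*}
```

Summing this over the k groups yields kσ² plus the sum of n_i(μ + T_i)², and expanding the square with the n_i adding up to n gives the second form.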

Under the null hypothesis that the treatments cause no differences and all the T_i are zero, the expectation simplifies to:

\operatorname{E}(T) = k\sigma^2 + n\mu^2

Combination

C = \frac{\left(\sum x\right)^2}{n}
\operatorname{E}(C) = \sigma^2 + n\mu^2

Sums of Squared Deviations

Under the null hypothesis, the difference of any pair of I, T, and C does not depend on μ, only on σ².

\operatorname{E}(I - C) = (n - 1)\sigma^2 \quad \text{(total squared deviations)}
\operatorname{E}(T - C) = (k - 1)\sigma^2 \quad \text{(treatment squared deviations)}
\operatorname{E}(I - T) = (n - k)\sigma^2 \quad \text{(residual squared deviations)}

The constants (n − 1), (k − 1), and (n − k) are normally referred to as the number of degrees of freedom.
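
Each of these results is a direct subtraction of the expectations of I, T, and C under the null hypothesis; written out, the nμ² terms cancel in every pair:

```latex
\begin{align*}
\operatorname{E}(I - C) &= (n\sigma^2 + n\mu^2) - (\sigma^2 + n\mu^2) = (n - 1)\sigma^2 \\
\operatorname{E}(T - C) &= (k\sigma^2 + n\mu^2) - (\sigma^2 + n\mu^2) = (k - 1)\sigma^2 \\
\operatorname{E}(I - T) &= (n\sigma^2 + n\mu^2) - (k\sigma^2 + n\mu^2) = (n - k)\sigma^2
\end{align*}
```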


Example

In a very simple example, 5 observations arise from two treatments. The first treatment gives three values 1, 2, and 3, and the second treatment gives two values 4 and 6.


I = \frac{1^2}{1} + \frac{2^2}{1} + \frac{3^2}{1} + \frac{4^2}{1} + \frac{6^2}{1} = 66

T = \frac{(1 + 2 + 3)^2}{3} + \frac{(4 + 6)^2}{2} = 12 + 50 = 62

C = \frac{(1 + 2 + 3 + 4 + 6)^2}{5} = \frac{256}{5} = 51.2

Giving:
Total squared deviations = 66 - 51.2 = 14.8 with 4 degrees of freedom.
Treatment squared deviations = 62 - 51.2 = 10.8 with 1 degree of freedom.
Residual squared deviations = 66 - 62 = 4 with 3 degrees of freedom.
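
The same arithmetic can be reproduced with a short Python sketch. The function name `one_way_sums` and its return order are assumptions made for illustration; exact fractions avoid floating-point rounding in the intermediate sums.

```python
from fractions import Fraction


def one_way_sums(groups):
    """Return the sums of squares (I, T, C) for a list of treatment groups."""
    values = [Fraction(x) for group in groups for x in group]
    n = len(values)
    # I: every observation squared (each is its own group of size 1).
    individual = sum(x ** 2 for x in values)
    # T: squared group totals, each divided by its group size.
    treatments = sum(Fraction(sum(group)) ** 2 / len(group) for group in groups)
    # C: squared grand total divided by the total number of observations.
    combination = sum(values) ** 2 / n
    return individual, treatments, combination


groups = [[1, 2, 3], [4, 6]]
i_sum, t_sum, c_sum = one_way_sums(groups)
n = sum(len(group) for group in groups)
k = len(groups)

print(float(i_sum - c_sum), n - 1)  # 14.8 4  (total)
print(float(t_sum - c_sum), k - 1)  # 10.8 1  (treatment)
print(float(i_sum - t_sum), n - k)  # 4.0 3   (residual)
```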


Two-Way Analysis of Variance

The following hypothetical example gives the yields of 15 plants subject to two environmental variations, and three fertilisers.

| Fertiliser    | Extra CO2 | Extra Humidity |
| No Fertiliser | 7, 2, 1   | 7, 6           |
| Nitrate       | 11, 6     | 10, 7, 3       |
| Phosphate     | 5, 3, 4   | 11, 4          |

Five sums of squares are calculated:

| Factor | Calculation | Sum | σ² coefficient |
| Individual | 7^2 + 2^2 + 1^2 + 7^2 + 6^2 + 11^2 + 6^2 + 10^2 + 7^2 + 3^2 + 5^2 + 3^2 + 4^2 + 11^2 + 4^2 | 641 | 15 |
| Fertiliser x Environment | \frac{(7+2+1)^2}{3} + \frac{(7+6)^2}{2} + \frac{(11+6)^2}{2} + \frac{(10+7+3)^2}{3} + \frac{(5+3+4)^2}{3} + \frac{(11+4)^2}{2} | 556.1667 | 6 |
| Fertiliser | \frac{(7+2+1+7+6)^2}{5} + \frac{(11+6+10+7+3)^2}{5} + \frac{(5+3+4+11+4)^2}{5} | 525.4 | 3 |
| Environment | \frac{(7+2+1+11+6+5+3+4)^2}{8} + \frac{(7+6+10+7+3+11+4)^2}{7} | 519.2679 | 2 |
| Composite | \frac{(7+2+1+11+6+5+3+4+7+6+10+7+3+11+4)^2}{15} | 504.6 | 1 |



Finally, the sums of squared deviations required for the Analysis of Variance can be calculated.

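| Factor | Sum | σ² coefficient | Total | Environment | Fertiliser | Fertiliser x Environment | Residual |
| Individual | 641 | 15 | 1 | | | | 1 |
| Fertiliser x Environment | 556.1667 | 6 | | | | 1 | -1 |
| Fertiliser | 525.4 | 3 | | | 1 | -1 | |
| Environment | 519.2679 | 2 | | 1 | | -1 | |
| Composite | 504.6 | 1 | -1 | -1 | -1 | 1 | |
| Squared Deviations | | | 136.4 | 14.668 | 20.8 | 16.099 | 84.833 |
| Degrees of Freedom | | | 14 | 1 | 2 | 2 | 9 |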
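
The figures in the two tables can be reproduced with the Python sketch below. The nested dictionary layout, the helper name `pooled_term`, and the variable names are illustrative assumptions; the yields are those given in the example, and each squared deviation is formed from the five sums using the +1 and -1 coefficients shown in the table.

```python
from fractions import Fraction

# Yields from the example: data[fertiliser][environment] -> plant yields.
data = {
    "No Fertiliser": {"Extra CO2": [7, 2, 1], "Extra Humidity": [7, 6]},
    "Nitrate": {"Extra CO2": [11, 6], "Extra Humidity": [10, 7, 3]},
    "Phosphate": {"Extra CO2": [5, 3, 4], "Extra Humidity": [11, 4]},
}
environments = ["Extra CO2", "Extra Humidity"]


def pooled_term(groups):
    """Sum over groups of (group total)^2 / (group size), in exact arithmetic."""
    return sum(Fraction(sum(group)) ** 2 / len(group) for group in groups)


cells = [ys for envs in data.values() for ys in envs.values()]
all_yields = [y for ys in cells for y in ys]

# The five sums of squares, from finest grouping to coarsest.
individual = pooled_term([[y] for y in all_yields])
cell_sum = pooled_term(cells)  # the "Fertiliser x Environment" row
fertiliser = pooled_term(
    [[y for ys in envs.values() for y in ys] for envs in data.values()]
)
environment = pooled_term(
    [[y for envs in data.values() for y in envs[e]] for e in environments]
)
composite = pooled_term([all_yields])

# Combine the sums with the coefficients from the second table.
deviations = {
    "Total": individual - composite,
    "Environment": environment - composite,
    "Fertiliser": fertiliser - composite,
    "Fertiliser x Environment": cell_sum - fertiliser - environment + composite,
    "Residual": individual - cell_sum,
}

for name, value in deviations.items():
    print(f"{name}: {float(value):.3f}")
```

The degrees of freedom follow from the same coefficients applied to the σ² column (15, 6, 3, 2, 1): for instance, the interaction term has 6 − 3 − 2 + 1 = 2.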

References

1. Mood & Graybill: An Introduction to the Theory of Statistics (McGraw Hill)
2. Variance decomposition


The Wikipedia article included on this page is licensed under the GFDL.