Goodness of fit

The goodness of fit of a statistical model describes how well it fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under the model in question. Such measures can be used in statistical hypothesis testing, e.g. to test for normality of residuals, to test whether two samples are drawn from identical distributions (see Kolmogorov-Smirnov test), or to test whether outcome frequencies follow a specified distribution (see Pearson's chi-square test).


Example

Pearson's chi-square statistic is the sum, over all outcome categories, of the squared differences between observed and expected frequencies, each divided by the expected frequency:

 \chi^2 = \sum \frac{(O - E)^2}{E}

where:

O = an observed frequency
E = an expected (theoretical) frequency, asserted by the null hypothesis

The resulting value can be compared to the chi-square distribution to determine the goodness of fit.
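As a minimal sketch of the formula above, the statistic can be computed directly from two lists of frequencies. The function name `chi_square_statistic` and the die-roll counts are illustrative assumptions, not taken from the article.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, tested against a fair die (E = 10 per face).
observed = [5, 8, 9, 8, 10, 20]
expected = [10] * 6
statistic = chi_square_statistic(observed, expected)
print(statistic)  # approximately 13.4
```

A large value of the statistic means the observed frequencies sit far from those asserted by the null hypothesis.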


To determine the degrees of freedom of the chi-square distribution, one takes the number of observed frequencies (categories) and subtracts one. For example, if there are eight different frequencies, one would compare the statistic to a chi-square distribution with seven degrees of freedom.
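To turn a statistic into a tail probability without external libraries, one possible sketch uses the standard recurrence for the upper tail of the chi-square distribution. The function name `chi2_sf` and the test value 14.07 are illustrative assumptions; in practice a statistics library would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for the two base cases.
    if df % 2 == 1:
        k, tail = 1, math.erfc(math.sqrt(half))  # df = 1
    else:
        k, tail = 2, math.exp(-half)  # df = 2
    # Recurrence: sf(k + 2) = sf(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        tail += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return tail


# Eight categories give 8 - 1 = 7 degrees of freedom.
categories = 8
df = categories - 1
p_value = chi2_sf(14.07, df)
print(df, round(p_value, 3))
```

A small tail probability suggests that a statistic this large would rarely arise if the null hypothesis were true.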


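There is also a reduced chi-square statistic, in which each squared deviation is weighted by the measurement error rather than by the expected frequency. The weighted sum is

 \chi^2 = \sum \frac{(O - E)^2}{\sigma^2}

where σ² is the variance of the observation. [1] Dividing this sum by the number of degrees of freedom gives the reduced statistic.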
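The weighted sum can be sketched as follows. The function names, the data values, and the choice of two degrees of freedom (three points minus one fitted parameter) are hypothetical, chosen only for illustration. The division by the degrees of freedom follows the usual definition of the reduced statistic.

```python
def weighted_chi_square(observed, expected, sigma):
    """Return the sum of (O - E)^2 / sigma^2; sigma holds each observation's standard deviation."""
    if not (len(observed) == len(expected) == len(sigma)):
        raise ValueError("all inputs must have the same length")
    return sum(((o - e) / s) ** 2 for o, e, s in zip(observed, expected, sigma))


def reduced_chi_square(observed, expected, sigma, dof):
    """Return the weighted sum divided by the degrees of freedom."""
    return weighted_chi_square(observed, expected, sigma) / dof


# Hypothetical measurements, model values, and measurement errors.
observed = [1.1, 1.9, 3.2]
expected = [1.0, 2.0, 3.0]
sigma = [0.1, 0.1, 0.2]

chi2 = weighted_chi_square(observed, expected, sigma)
chi2_reduced = reduced_chi_square(observed, expected, sigma, dof=2)  # dof is an assumption
print(chi2, chi2_reduced)  # approximately 3.0 and 1.5
```

Here every deviation equals exactly one standard deviation, so each term contributes about 1 to the sum.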


Binomial Case

A binomial experiment is a sequence of independent trials, each of which results in one of two outcomes, success or failure. There are n trials, each with probability of success p. More generally, suppose each trial falls into one of k cells, where cell i has probability p_i and observed count N_i.
Provided that np_i ≥ 5 for every i (where i = 1, 2, ..., k), then


 \chi^2 = \sum_{i=1}^{k} \frac{(N_i - np_i)^2}{np_i} = \sum_{\text{all cells}} \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}.

This has approximately a chi-square distribution with k - 1 degrees of freedom (df). The fact that df = k - 1 is a consequence of the restriction \sum N_i = n: there are k observed cell counts, but once any k - 1 of them are known, the remaining one is uniquely determined. In other words, only k - 1 cell counts are freely determined, so df = k - 1.
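The cell-count version can be sketched as below, including the rule-of-thumb check that every expected count is at least 5. The function name `cell_chi_square`, the `min_expected` parameter, and the example counts and probabilities are assumptions made for illustration.

```python
def cell_chi_square(counts, probs, min_expected=5):
    """Return (statistic, df) for observed cell counts N_i and null probabilities p_i."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    n = sum(counts)  # the restriction sum(N_i) = n
    expected = [n * p for p in probs]
    if any(e < min_expected for e in expected):
        raise ValueError("chi-square approximation needs every n * p_i >= min_expected")
    statistic = sum((c - e) ** 2 / e for c, e in zip(counts, expected))
    return statistic, len(counts) - 1  # only k - 1 counts are free


# Hypothetical example: n = 100 trials over k = 3 cells.
stat, df = cell_chi_square([55, 25, 20], [0.5, 0.3, 0.2])
print(stat, df)  # approximately 1.333 with df = 2
```

Raising an error when an expected count is too small is one design choice; returning a warning flag instead would work equally well.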


References

  1. http://www.sns.gov/workshops/sns_hfir_users/posters/Laub_Chi-Square_Data_Fitting.pdf

The Wikipedia article included on this page is licensed under the GFDL.