|
Akaike's information criterion, developed by Hirotsugu Akaike under the name of "an information criterion" (AIC) in 1971 and proposed in Akaike (1974), is a measure of the goodness of fit of an estimated statistical model. It is grounded in the concept of entropy, in effect offering a relative measure of the information lost when a given model is used to describe reality. The AIC is an operational way of trading off the complexity of an estimated model against how well the model fits the data.
Definition
In the general case, the AIC is
$\mathrm{AIC} = 2k - 2\ln(L)$
where k is the number of parameters in the statistical model, and L is the maximized value of the likelihood function for the estimated model.
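As a minimal sketch of the general definition, AIC = 2k - 2 ln(L), the following Python function takes the maximized log-likelihood directly, since working with ln(L) rather than L avoids numerical underflow. The function name `aic`, the two candidate models, and their log-likelihood values are hypothetical, chosen only for illustration.

```python
def aic(k, max_log_likelihood):
    """Return AIC = 2k - 2 ln(L), given k parameters and the maximized ln(L)."""
    return 2 * k - 2 * max_log_likelihood


# Hypothetical candidates: (number of parameters, maximized log-likelihood).
candidates = {"model_a": (2, -104.3), "model_b": (5, -102.9)}

scores = {name: aic(k, ll) for name, (k, ll) in candidates.items()}
best = min(scores, key=scores.get)  # the lowest AIC is preferred
print(scores, best)
```

In this made-up comparison the three extra parameters of `model_b` lower -2 ln(L) by only 2.8, which is less than the added penalty of 6, so `model_a` has the lower AIC.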
Over the remainder of this entry, it will be assumed that the model errors are normally and independently distributed. Let n be the number of observations and RSS be the residual sum of squares. Then AIC becomes
$\mathrm{AIC} = 2k + n\left[\ln\left(2\pi\,\mathrm{RSS}/n\right) + 1\right]$
Increasing the number of free parameters to be estimated improves the goodness of fit, regardless of the number of free parameters in the data-generating process. Hence AIC not only rewards goodness of fit, but also includes a penalty that is an increasing function of the number of estimated parameters. This penalty discourages overfitting. The preferred model is the one with the lowest AIC value. The AIC methodology attempts to find the model that best explains the data with a minimum of free parameters. By contrast, more traditional approaches to modeling start from a null hypothesis. The AIC penalizes free parameters less strongly than does the Schwarz criterion.
AIC judges a model by how close its fitted values tend to be to the true values, in terms of a certain expected value.
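The normal-errors form, AIC = 2k + n[ln(2π RSS/n) + 1], can be sketched in plain Python as follows. The helper names, the small data set, and the choice to count only the mean parameters in k are assumptions made for illustration. Whether the error variance is also counted varies by convention, but it shifts every model's AIC by the same amount. The sketch also checks that the RSS form agrees with 2k - 2 ln(L) when the error variance is estimated as RSS/n.

```python
import math


def gaussian_aic(k, n, rss):
    """AIC = 2k + n[ln(2*pi*RSS/n) + 1] for normal, independent errors."""
    return 2 * k + n * (math.log(2 * math.pi * rss / n) + 1)


def gaussian_max_log_likelihood(n, rss):
    """Maximized Gaussian log-likelihood with variance estimated as RSS/n."""
    return -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)


def rss_constant(ys):
    """RSS of the least-squares constant (mean-only) model."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def rss_line(xs, ys):
    """RSS of the least-squares straight line y = a + b*x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical, roughly linear data.
xs = [0, 1, 2, 3, 4, 5]
ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]
n = len(xs)

aic_constant = gaussian_aic(1, n, rss_constant(ys))  # k = 1: mean only
aic_line = gaussian_aic(2, n, rss_line(xs, ys))      # k = 2: intercept, slope
print(aic_constant, aic_line)
```

For these data the straight line has the lower AIC: its much smaller RSS more than pays for the one additional parameter.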
AICc and AICu
AICc is AIC with a second-order correction for small sample sizes:
$\mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1}$
Since AICc converges to AIC as n gets large, AICc should be employed regardless of sample size (Burnham and Anderson, 2004). McQuarrie and Tsai (1998: 22) define AICc as:
$\mathrm{AICc} = \ln\left(\frac{\mathrm{RSS}}{n}\right) + \frac{n+k}{n-k-2}$
and propose (p. 32) the closely related measure:
$\mathrm{AICu} = \ln\left(\frac{\mathrm{RSS}}{n-k}\right) + \frac{n+k}{n-k-2}$
McQuarrie and Tsai ground their high opinion of AICc and AICu on extensive simulation work.
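The following sketch illustrates the small-sample correction in its usual form, AICc = AIC + 2k(k+1)/(n-k-1), and shows the correction term shrinking as n grows. The function name `aicc` and the AIC value of 100.0 are hypothetical.

```python
def aicc(aic_value, k, n):
    """Return AICc = AIC + 2k(k+1)/(n-k-1); requires n > k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return aic_value + 2 * k * (k + 1) / (n - k - 1)


aic_value = 100.0  # hypothetical AIC of some fitted model
k = 3
for n in (10, 100, 10_000):
    # Print the sample size and the size of the correction term.
    print(n, aicc(aic_value, k, n) - aic_value)
```

With k = 3 the correction is 4.0 at n = 10 and 0.25 at n = 100, which is why the two criteria agree closely for large samples.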
QAIC
QAIC (the quasi-AIC) is defined as:
$\mathrm{QAIC} = 2k - \frac{2}{c}\ln(L)$
where c is a variance inflation factor. QAIC adjusts for over-dispersion or lack of fit. The small sample version of QAIC is:
$\mathrm{QAICc} = \mathrm{QAIC} + \frac{2k(k+1)}{n-k-1}$
References
- Akaike, Hirotugu (1974). "A new look at the statistical model identification". IEEE Transactions on Automatic Control 19 (6): 716–723.
- Burnham, K. P., and D. R. Anderson, 2002. Model Selection and Multimodel Inference: A Practical-Theoretic Approach, 2nd ed. Springer-Verlag. ISBN 0-387-95364-7.
- --------, 2004. Multimodel Inference: understanding AIC and BIC in Model Selection, Amsterdam Workshop on Model Selection.
- Hurvich, C. M., and Tsai, C.-L., 1989. Regression and time series model selection in small samples. Biometrika, Vol. 76, pp. 297-307.
- McQuarrie, A. D. R., and Tsai, C.-L., 1998. Regression and Time Series Model Selection. World Scientific.
See also
- Schwarz criterion (also Schwarz information criterion (SIC) or Bayesian information criterion (BIC))
- Deviance
- Deviance information criterion (DIC)
- Jensen-Shannon divergence
- Kullback-Leibler divergence
- Occam's razor
External links
- Hirotugu Akaike comments on how he arrived at the AIC in This Week's Citation Classic
|