List of statistical topics


Please add any Wikipedia articles related to statistics that are not already on this list.




See also the list of probability topics and the list of statisticians.





A

The absolute deviation of an element of a data set is the absolute difference between that element and a given point. ... In classical (frequentist) decision theory, an admissible decision rule is a rule for making a decision that is better in some sense than any other rule that may compete with it. ... The Akaike information criterion (AIC) (pronounced ah-kah-ee-keh), developed by Hirotsugu Akaike in 1971 and proposed in Akaike (1974), is a measure of the goodness of fit of an estimated statistical model. ... Algorithms for calculating variance play a minor role in statistical computing. ... The Allan variance, named after David W. Allan, also known as two-sample variance, is a measurement of accuracy in clocks. ... Statistics shows that if you put a large number of random points on a bounded flat surface you can find many alignments of random points. ... These are statistical procedures which can be used to analyse categorical data: regression, analysis of variance, linear modeling, log-linear modeling, logistic regression, repeated measures analysis, simple correspondence analysis, multiple correspondence analysis, contingency table, Burt table, binary table, frequency table, chi-square statistics, odds ratios, correlation statistics, Fisher's exact... In statistics, analysis of rhythmic variance (ANORVA) is a new simple method for detecting rhythms in biological time series, published by Peter Celec (Biol Res. ... In statistics, analysis of variance (ANOVA) is a collection of statistical models and their associated procedures which compare means by splitting the overall observed variance into different parts. ... In statistics, an ancillary statistic is a statistic whose probability distribution does not depend on which of the probability distributions among those being considered is the distribution of the statistical population from which the data were taken. ... 
ANCOVA, or analysis of covariance is an old-fashioned name for a linear regression model with one continuous explanatory variable and one or more factors. ... ASCA, ANOVA-SCA, or analysis of variance – simultaneous component analysis is a method that partitions variation and enables interpretation of these partitions by SCA, a method that is similar to PCA. This method is a multi or even megavariate extension of ANOVA. The variation partitioning is similar to Analysis of... An anomaly time series is the time series of deviations of a quantity from some mean. ... Approximate Bayesian computation (ABC) is a family of computational techniques in Bayesian statistics. ... In Survival analysis, the Area compatibility factor, F, is used in Indirect Standardisation of population mortality rates. ... In mathematics and statistics, the arithmetic mean (or simply the mean) of a list of numbers is the sum of all the members of the list divided by the number of items in the list. ... A plot showing 100 random numbers with a hidden sine function, and an autocorrelation of the series on the bottom. ... Autocorrelation is a mathematical tool used frequently in signal processing for analysing functions or series of values, such as time domain signals. ... In econometrics, an autoregressive conditional heteroskedasticity (ARCH, Engle (1982)) model considers the variance of the current error term to be a function of the variances of the previous time periods error terms. ... In statistics, an autoregressive integrated moving average (ARIMA) model is a generalisation of an autoregressive moving average or (ARMA) model. ... In statistics, autoregressive moving average (ARMA) models, sometimes called Box-Jenkins models after George Box and G. M. Jenkins, are typically applied to time series data. ...
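
The definitions of the arithmetic mean and the absolute deviation given above can be illustrated with a minimal Python sketch. The function name `absolute_deviations` and the sample values are hypothetical, chosen only for illustration; using the mean as the default reference point is an assumption, since the definition allows any given point.

```python
from statistics import mean

def absolute_deviations(data, center=None):
    """Absolute difference between each element and a given point.

    Assumption for this sketch: when no point is supplied, the
    arithmetic mean of the data is used as the reference point.
    """
    if center is None:
        center = mean(data)
    return [abs(x - center) for x in data]

values = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data only
avg = mean(values)                  # sum of the members divided by their count
devs = absolute_deviations(values)  # deviations of each element from the mean
```

Passing a different `center` (for example the median) gives absolute deviations about that point instead.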

B

The Balding-Nichols model is a statistical description of the allele frequencies in the components of a sub-divided population. ... Statistics are very important to baseball, perhaps as much as they are for cricket, and more than almost any other sport. ... In statistics, Basus theorem states that any complete sufficient statistic is independent of any ancillary statistic. ... Bayes theorem (also known as Bayes rule or Bayes law) is a result in probability theory, which relates the conditional and marginal probability distributions of random variables. ... Thomas Bayes (c. ... In statistics, the use of Bayes factors is a Bayesian alternative to classical hypothesis testing. ... Bayesian inference is statistical inference in which evidence or observations are used to update or to newly infer the probability that a hypothesis may be true. ... In statistics, Bayesian linear regression is a Bayesian alternative to the more well-known ordinary least-squares linear regression. ... The posterior probability of a model given data, P(H|D), is given by Bayes theorem: P(H|D) = P(D|H)P(H)/P(D) The key data_dependent term P(D|H) is a likelihood, and is sometimes called the evidence for model H; evaluating it correctly is the... A Bayesian network (or a belief network) is a probabilistic graphical model that represents a set of variables and their probabilistic independencies. ... Bayesian search theory is the application of Bayesian statistics to the search for lost objects. ... In statistics, the Behrens-Fisher problem is the problem of interval estimation and hypothesis testing concerning the difference between the means of two normally distributed populations when the variances of the two populations are not assumed to be equal, based on two independent samples. ... Belief propagation is an iterative algorithm for computing marginals of functions on a graphical model most commonly used in artificial intelligence and information theory. ... 
In statistics, Bessel's correction, named after Friedrich Bessel, is the use of n − 1 instead of n when estimating variance, where n is the number of observations in a sample. ... In empirical Bayes methods, the Beta-binomial model is an analytic model where the likelihood function is specified by a binomial distribution and the conjugate prior is a Beta distribution. It is convenient to reparameterize the distributions so that the expected mean of the prior is a single parameter: Let... In probability theory and statistics, the beta distribution is a continuous probability distribution with the probability density function (pdf) defined on the interval [0, 1]: where α and β are parameters that must be greater than zero and B is the beta function. ... The Bhattacharya coefficient is an approximate measurement of the amount of overlap between two statistical samples. ... In statistics, the term bias is used for two different concepts. ... A biased sample is one that is falsely taken to be typical of a population from which it is drawn. ... Allan Birnbaum (May 27, 1923 - July 1, 1976) was an American statistician who contributed to statistical inference, foundations of statistics, statistical genetics, statistical psychology, and history of statistics. ... In probability theory, Chebyshev's inequality (also known as Tchebysheff's inequality, Chebyshev's theorem, or the Bienaymé-Chebyshev inequality), named after Pafnuty Chebyshev, who first proved it, states that in any data sample or probability distribution, nearly all the values are close to the mean value, and provides a... Binary classification is the task of classifying the members of a given set of objects into two groups on the basis of whether they have some property or not. ... In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. 
... In statistics, the binomial test is an exact test of the statistical significance of deviations from a theoretically expected distribution of observations into two categories. ... In combinatorial mathematics, a block design (more fully, a balanced incomplete block design) is a particular kind of set system, which has long-standing applications to experimental design (an area of statistics) as well as purely combinatorial aspects. ... In the statistical theory of the design of experiments, blocking is the arranging of experimental units in groups (blocks) which are similar to one another. ... In statistics, the Bonferroni correction states that if an experimenter is testing n independent hypotheses on a set of data, then the statistical significance level that should be used for each hypothesis separately is 1/n times what it would be if only one hypothesis were tested. ... Bootstrap aggregating (bagging) is a meta-algorithm to improve classification and regression models in terms of stability and classification accuracy. ... In statistics, bootstrapping is a modern, computer intensive, general purpose approach to statistical inference, falling within a broader class of resampling methods. ... Data is taken to be either a scalar number, a vector or a matrix. ... In statistics, the Box-Cox transformation of the variable Y given the Box-Cox parameter λ ≥ 0 is defined as This transformation has proved popular in regression analysis, including econometrics. ... Figure 1. ... Leo Breiman (January 27, 1928–July 7, 2005) was a distinguished statistician at the University of California, Berkeley. ... In statistics, the Breusch-Pagan test is used to test for heteroskedasticity in a linear regression model. ... Ladislaus Josephovich Bortkiewicz (August 7, 1868 - July 15, 1931) was a Russian economist and statistician of Polish descent. ... 
Business statistics is the science of good decision making in the face of uncertainty and is used in many disciplines such as financial analysis, econometrics, auditing, production and operations including services improvement, and marketing research. ...
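
As an illustration of Bessel's correction described above, the following Python sketch computes a variance estimate with either n − 1 or n in the denominator. The function name `sample_variance`, its `bessel` flag, and the data are hypothetical and serve only to show the difference between the two divisors.

```python
def sample_variance(sample, bessel=True):
    """Variance estimate with (n - 1) or n in the denominator.

    bessel=True applies Bessel's correction (divide by n - 1);
    bessel=False divides by n.
    """
    n = len(sample)
    if n < 2 and bessel:
        raise ValueError("Bessel's correction needs at least two observations")
    m = sum(sample) / n
    squared_deviations = sum((x - m) ** 2 for x in sample)
    return squared_deviations / (n - 1 if bessel else n)

data = [2, 4, 4, 4, 5, 5, 7, 9]                   # illustrative data only
uncorrected = sample_variance(data, bessel=False)  # divides by n
corrected = sample_variance(data)                  # divides by n - 1
```

The corrected estimate is always larger than the uncorrected one by the factor n / (n − 1), which matters most for small samples.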

C

Calibrated probability assessments are subjective probabilities assigned by individuals who have been trained to assess probabilities in a way that historically represents their uncertainty[1][2]. In other words, when a calibrated person says they are 80% confident in each of 100 predictions they made, they will get about 80... Calibration in statistics is a reverse process to regression. ... In statistics, canonical correlation analysis, introduced by Harold Hotelling, is a way of making sense of cross-covariance matrices. ... The level of measurement of a variable in mathematics and statistics describes how much information the numbers associated with the variable contain. ... In mathematics, the Cauchy-Schwarz inequality, also known as the Schwarz inequality, the Cauchy inequality, or the Cauchy-Bunyakovski-Schwarz inequality, named after Augustin Louis Cauchy, Viktor Yakovlevich Bunyakovsky and Hermann Amandus Schwarz, is a useful inequality encountered in many different settings, such as linear algebra applied to vectors, in... In statistics, censoring occurs when the value of an observation is only partially known. ... A central limit theorem is any of a set of weak-convergence results in probability theory. ... In statistics, the Chapman–Robbins bound or Hammersley–Chapman–Robbins bound is a lower bound on the variance of estimators of a deterministic parameter. ... In probability theory, the characteristic function of any random variable completely defines its probability distribution. ... Chauvenet's criterion is a means of assessing whether one piece of experimental data, an outlier from a set of observations, is spurious. ... 
In probability theory, Chebyshevs inequality (also known as Tchebysheffs inequality, Chebyshevs theorem, or the Bienaymé-Chebyshev inequality), named after Pafnuty Chebyshev, who first proved it, states that in any data sample or probability distribution, nearly all the values are close to the mean value, and provides a... It has been suggested that this article or section be merged with Checking if a coin is fair. ... In probability theory, the Chernoff bound, named after Herman Chernoff, gives a lower bound for the success of majority agreement for n independent, equally likely events. ... In probability theory, Chernoffs inequality, named after Herman Chernoff, states the following. ... In probability theory and statistics, the chi distribution is a continuous probability distribution. ... This article is about the mathematics of the chi-square distribution. ... A chi-square test is any statistical hypothesis test in which the test statistic has a chi-square distribution when the null hypothesis is true, or any in which the probability distribution of the test statistic (assuming the null hypothesis is true) can be made to approximate a chi-square... The Chow test is an econometric test of whether the coefficients in two linear regressions on different data are equal. ... Circular or directional statistics is the subdiscipline of statistics that deals with circular data. ... Several classic data sets that have been used extensively in the statistical literature. ... In health care, including medicine, a clinical trial (synonyms: clinical studies, research protocols, medical research) is a process in which a medicine or other medical treatment is tested for its safety and effectiveness, often in comparison to existing treatments. ... In statistics, the closed testing procedure [1] is a general method for performing more than one hypothesis test simultaneously. ... 
Cochrane-Orcutt estimation is a procedure in econometrics, which adjusts a linear model for serial correlation in the error term. ... In statistics, Cochrans theorem is used in the analysis of variance. ... In statistics, the coefficient of determination R2 is the proportion of variability in a data set that is accounted for by a statistical model. ... In probability theory and statistics, the coefficient of variation (CV) is a measure of dispersion of a probability distribution. ... Cohens kappa coefficient is a statistical measure of interrater reliability. ... Special cause Common- and special-causes are the two distinct origins of variation, in a process that features in the statistical thinking and methods of Walter A. Shewhart and W. Edwards Deming. ... The following tables compare general and technical information for a number of statistical analysis packages. ... In probability theory, two events are called complementary if and only if precisely one of the possibilities must occur. ... Suppose a random variable (which may be a sequence () of scalar-valued random variables), has a probability distribution belonging to a known family of probability distributions, parametrized by θ, which may be either vector- or scalar-valued, and let be any statistic based on . ... In statistics, compositional data is data in which each data point is an n-tuple of nonnegative numbers whose sum is 1. ... In statistics, computational learning theory is a mathematical field related to the analysis of machine learning algorithms. ... In statistics, the concordance correlation coefficient measures the agreement between two variables, e. ... In statistics, a concordant pair is a pair of a two-variable (bivariate) observation data-set {X1,Y1} and {X2,Y2}, where: Correspondingly, a discordant pair is a pair, as defined above, where and the sign function, often represented as sgn, is defined as: Kendall tau distance Spearmans rank... 
This article illustrates the central limit theorem via an example for which the computation can be done quickly by hand on paper, unlike the more computing-intensive example in the article titled illustration of the central limit theorem. ... In statistics, the conditional change model is the analytic procedure in which change scores are regressed on baseline values, together with the explanatory variables of interest (often including an indicator of a treatment group). ... Given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X (written Y | X) is the probability distribution of Y when X is known to be a particular value. ... In probability theory, two events A and B are conditionally independent given a third event C precisely if the occurrence or non-occurrence of A and B are independent events in their conditional probability distribution given C. In other words, Two random variables X and Y are conditionally independent given... This article defines some terms which characterize probability distributions of two or more variables. ... In this diagram, the bars represent observation means and the red lines represent the confidence intervals surrounding them. ... In statistics, a confounding factor is a factor which is the common cause of two things that may falsely appear to be in a causal relationship. ... Conjoint analysis, also called multi-attribute compositional models, is a statistical technique that originated in mathematical psychology. ... In statistics, a consistent estimator is an estimator that converges in probability to the quantity being estimated as the sample size grows. ... In statistics, contingency tables are used to record and analyse the relationship between two or more variables, most usually categorical variables. ... In mathematics, a probability distribution assigns to every interval of the real numbers a probability, so that the probability axioms are satisfied. ... 
In statistical process control, the control chart, also known as the Shewhart chart or process-behaviour chart is a tool to determine whether a manufacturing or business process is in a state of statistical control or not. ... Control limits are horizontal lines drawn on an SPC control chart, usually at a distance of ±3 standard deviations from the mean of the plotted statistic. ... In Monte Carlo methods, one or more control variates may be employed to achieve variance reduction by exploiting the correlation between statistics. ... Controlling for a variable means to deliberately vary the experimental conditions in order to take that variable into account in the prediction of the response variable. ... In statistics, a copula is a multivariate cumulative distribution function defined on the n-dimensional unit cube [0, 1]n such that every marginal distribution is uniform on the interval [0, 1]. Sklars theorem is as follows. ... Positive linear correlations between 1000 pairs of numbers. ... To meet Wikipedias quality standards, this article or section may require cleanup. ... In statistics, the correlation ratio is a measure of the relationship between the statistical dispersion within individual categories and the dispersion across the whole population or sample. ... In statistics, and especially in the statistical analysis of psychological data, the counternull is a statistic used to aid the understanding and presentation of research results. ... In probability theory and statistics, the covariance between two real-valued random variables X and Y, with expected values and is defined as: where E is the expected value. ... In statistics and probability theory, the covariance matrix is a matrix of covariances between elements of a vector. ... Cricket is a sport that generates a large number of statistics. ... Cronbachs (alpha) has an important use as a measure of the reliability of a psychometric instrument. ... 
Cross tabs (or cross tabulations) display the joint distribution of two or more variables. ... In statistics cross-validation is the practice of partitioning a sample of data into subsamples such that analysis is initially performed on a single subsample, while further subsamples are retained blind in order for subsequent use in confirming and validating the initial analysis. ... // Cumulants of probability distributions In probability theory and statistics, the cumulants κn of the probability distribution of a random variable X are given by In other words, κn/n! is the nth coefficient in the power series representation of the logarithm of the moment-generating function. ... In probability theory, the cumulative distribution function (abbreviated cdf) completely describes the probability distribution of a real-valued random variable, X. For every real number x, the cdf is given by where the right-hand side represents the probability that the random variable X takes on a value less than... Curve fitting is finding a curve which matches a series of data points and possibly other constraints. ... Harald Cramér (September 25, 1893 - October 5, 1985) was a Swedish mathematician and statistician, specialised in mathematical statistics. ... In statistics, the Cramér-Rao bound (CRB) or Cramér-Rao lower bound (CRLB), named in honor of Harald Cramér and Calyampudi Radhakrishna Rao, expresses a lower bound on the variance of estimators of a deterministic parameter. ... In statistics the Cramér-von-Mises criterion for judging the goodness of fit of a probability distribution compared to a given distribution is given by In applications is the theoretical distribution and is the empirically observed distribution. ...
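
The description of control limits above (lines drawn at a distance of ±3 standard deviations from the mean of the plotted statistic) can be sketched in Python as follows. The function name `control_limits`, the multiplier argument `k`, and the data are hypothetical. Estimating the spread with the ordinary sample standard deviation is an assumption of this sketch; practical control charts often estimate it differently.

```python
from statistics import mean, stdev

def control_limits(values, k=3.0):
    """Return (lower limit, centre line, upper limit) for a plotted statistic.

    Assumption for this sketch: the spread is the sample standard
    deviation of the plotted values themselves.
    """
    center = mean(values)
    spread = stdev(values)
    return center - k * spread, center, center + k * spread

# Illustrative measurements only.
lower, center, upper = control_limits([10, 12, 11, 13, 9])
```

Points falling outside the interval from `lower` to `upper` would be the ones flagged for investigation on such a chart.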

D

Clustering is the classification of objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait - often proximity according to some defined distance measure. ... Kurt Thearling, An Introduction to Data Mining (also available is a corresponding online tutorial) Dean Abbott, I. Philip Matkovsky, and John Elder IV, Ph. ... In statistics, a data point is a single typed measurement. ... A data set (or dataset) is a collection of data, usually presented in tabular form. ... In statistics, data transformation is carried out in order to transform the data and ensure that it has a normal distribution (a remedy for outliers, failures of normality, linearity, and homoscedasticity). ... In probability theory, de Finetti's theorem explains why exchangeable observations are conditionally independent given some (usually) unobservable quantity to which an epistemic probability distribution would then be assigned. ... Decision theory is an area of study of discrete mathematics that models human decision-making in science, engineering and indeed all human social activities. ... In statistics, the delta method is a method for deriving an approximate probability distribution for a function of an asymptotically normal statistical estimator from knowledge of the limiting variance of that estimator. ... In statistics, Deming regression, named after W. Edwards Deming, is a method of linear regression that finds a line of best fit for a set of related data. ... Demographics refers to selected population characteristics as used in government, marketing or opinion research, or the demographic profiles used in such research. ... 
Map of countries by population Population growth showing projections for later this century Demography is the statistical study of human populations. ... Among the kinds of data that national leaders need are the demographic statistics of their population. ... In probability and statistics, density estimation is the construction of an estimate, based on observed data, of an unobservable underlying probability density function. ... A design matrix is a matrix that is used in the certain statistical models, e. ... Descriptive statistics are used to describe the basic features of the data in a study. ... The first statistician to consider a methodology for the design of experiments was Sir Ronald A. Fisher. ... Detection theory, or signal detection theory, is a means to quantify the ability to discern between signal and noise. ... In statistics, deviance is a quantity whose expected values can be used for statistical hypothesis testing. ... The DIC (Deviance Information Criteria) is a hierarchical modeling generalization of the AIC (Akaike Information Criteria). ... In statistics, the Dickey-Fuller test tests whether a unit root is present in an autoregressive model. ... In statistics, dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction. ... Circular or directional statistics is the subdiscipline of statistics that deals with circular or directional data. ... Discrete choice analysis is a statistical technique. ... In mathematics, a probability distribution assigns to every interval of the real numbers a probability, so that the probability axioms are satisfied. ... In regression analysis, a dummy variable is one that takes the values 0 or 1 to indicate the absence or presence of some categorical effect that may be expected to shift the outcome. ... 
In statistics, Duncans new Multiple Range Test (MRT) is a multiple comparison procedure developed by David B Duncan in 1955. ... In gambling a Dutch book or lock is a set of odds and bets which guarantees a profit, no matter what the outcome of the gamble. ...
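
The dummy variable described above, which takes the value 0 or 1 to indicate the absence or presence of a categorical effect, can be illustrated with a short Python sketch. The function name `dummy_encode` and the group labels are hypothetical.

```python
def dummy_encode(values, category):
    """Return 1 where the categorical effect is present and 0 where it is absent."""
    return [1 if v == category else 0 for v in values]

# Illustrative group labels only.
groups = ["treated", "control", "treated", "control", "control"]
treated = dummy_encode(groups, "treated")
```

The resulting 0/1 column can then be used as an explanatory variable in a regression model.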

E

An ecological correlation is a correlation between two variables that are group means, in contrast to a correlation between two variables that describe individuals. ... The ecological fallacy is a widely recognised error in the interpretation of statistical data, whereby inferences about the nature of individuals are based solely upon aggregate statistics collected for the group to which those individuals belong. ... Econometrics literally means economic measurement. It is the branch of economics that applies statistical methods to the empirical study of economic theories and relationships. ... The Edgeworth series or Gram-Charlier A series, named in honor of Francis Ysidro Edgeworth, are series that approximate a probability distribution in terms of its cumulants. ... In statistics, effect size is a measure of the strength of the relationship between two variables. ... In statistics, efficiency is one measure of desirability of an estimator. ... In statistics, empirical Bayes methods involve: An underlying probability distribution of some unobservable quantity assigned to each member of a statistical population. ... In statistics, an empirical distribution function is a cumulative probability distribution function that concentrates probability 1/n at each of the n numbers in a sample. ... Suppose is a sample space of observations. ... Energy statistics refers to collecting, compiling, analyzing and disseminating data on commodities such as coal, crude oil, natural gas, electricity, or renewable energy sources (biomass, geothermal, wind or solar energy), when they are used for the energy they contain. ... This article is one of a group being considered for deletion in accordance with Wikipedias deletion policy. ... Full name Tore Olaus Engset born 1865­, died 1943. ... Agner Krarup Erlang (January 1, 1878–February 3, 1929) was a Danish mathematician, statistician, and engineer who invented the fields of queueing theory and traffic engineering. ... 
In statistics and optimization, the concepts of error and residual are easily confused with each other. ... Errors-in-Variables is a robust modeling technique in statistics, which assumes that every variable can have error or noise. ... Estimation is the calculated approximation of a result which is usable even if input data may be incomplete, uncertain, or noisy. ... Estimation theory is a branch of statistics and signal processing that deals with estimating the values of parameters based on measured/empirical data. ... In multivariate statistics, the importance of the Wishart distribution stems in part from the fact that it is the probability distribution of the maximum likelihood estimator of the covariance matrix of a multivariate normal distribution. ... In statistics, an estimator is a function of the observable sample data that is used to estimate an unknown population parameter; an estimate is the result from the actual application of the function to a particular set of data. ... In population genetics, Ewenss sampling formula, introduced by Warren Ewens, states that under certain conditions (specified below), if a random sample of n gametes is taken from a population and classified according to the gene at a particular locus then the probability that there are a1 alleles represented once... An exact (significance) test is a test where all assumptions that the derivation of the distribution of the test statistic is based on are met. ... In probability theory the expected value (or mathematical expectation) of a random variable is the sum of the probability of each possible outcome of the experiment multiplied by its payoff (value). Thus, it represents the average amount one expects as the outcome of the random trial when identical odds are... 
In statistical computing, an expectation-maximization (EM) algorithm is an algorithm for finding maximum likelihood estimates of parameters in probabilistic models, where the model depends on unobserved latent variables. ... Experimental research designs are used for the controlled testing of causal processes. ... In statistics, an explained sum of squares (ESS) is the sum of squared predicted values in a standard regression model (for example ), where is the response variable, is the explanatory variable, and are coefficients, indexes the observations from to , and is the error term. ... In statistics, an explanatory variable (also regressor or independent variable) is a variable in a regression model which appears on the right hand side of the equation. ... Exploratory data analysis (EDA) is that part of statistical practice concerned with reviewing, communicating and using data where there is a low level of knowledge about its cause system. ... In probability theory and statistics, the exponential distributions are a class of continuous probability distribution. ... In probability and statistics, an exponential family is any class of probability distributions having a certain form. ... In statistics, exponential smoothing refers to a particular type of moving average technique applied to time series data, either to produce smoothed data for presentation, or to make forecasts. ... Extreme value theory is a branch of statistics dealing with the extreme deviations from the median of probability distributions. ...
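
The empirical distribution function described above, which concentrates probability 1/n at each of the n numbers in a sample, can be sketched in Python as follows. The function name `ecdf` and the sample are hypothetical. The sketch uses the common convention that the function at x counts the sample values less than or equal to x.

```python
from bisect import bisect_right

def ecdf(sample):
    """Build the empirical distribution function of a sample.

    The returned function gives, for any x, the fraction of sample
    values less than or equal to x, so each observation carries
    probability 1/n.
    """
    ordered = sorted(sample)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many ordered values are <= x
        return bisect_right(ordered, x) / n

    return F

F = ecdf([3, 1, 4, 1, 5])  # illustrative sample only
```

Repeated values, such as the two 1s here, produce a jump of 2/n at that point.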

F

Failure rate is the frequency with which an engineered system or component fails, expressed for example in failures per hour. ... In statistics and probability, the F-distribution is a continuous probability distribution. ... An F-test is any statistical test in which the test statistic has an F-distribution if the null hypothesis is true. ... Factor analysis is a statistical data reduction technique used to explain variability among observed random variables in terms of fewer unobserved random variables called factors. ... In statistics, a factorial experiment is an experiment whose design consists of two or more factors, each with discrete possible values or levels, and whose experimental units take on all possible combinations of these levels across all such factors. ... Coin flipping or coin tossing is the practice of throwing a coin in the air to resolve a dispute between two parties. ... False discovery rate (FDR) control is a statistical method used in multiple hypothesis testing to correct for multiple comparisons. ... Type I errors (or α error, or false positive) and type II errors (β error, or a false negative) are two terms used to describe statistical errors. ... Type I errors (or α error, or false positive) and type II errors (β error, or a false negative) are two terms used to describe statistical errors. ... In statistics, familywise error rate (FWER) is the probability of making one or more false discoveries, or type I errors among all the hypotheses when performing multiple pairwise tests[1][2]. // The m specific hypotheses of interest are assumed to be known in advance, but the numbers of true null... In applied statistics, the file drawer problem results from the fact that academics tend not to publish results that indicate the null hypothesis could not be rejected. ... In statistics and information theory, the Fisher information (denoted ) is the variance of the score. ... 
Sir Ronald Aylmer Fisher, FRS (17 February 1890 – 29 July 1962) was an English statistician, evolutionary biologist, and geneticist. ... Fisher's exact test is a statistical significance test used in the analysis of categorical data where sample sizes are small. ... Linear discriminant analysis (LDA) is sometimes known as Fisher's linear discriminant, after its inventor, Ronald A. Fisher, who published it in The Use of Multiple Measurements in Taxonomic Problems (1936). ... In statistics, Fisher's method is a data fusion or meta-analysis (analysis after analysis) technique for combining the results from a variety of independent tests bearing upon the same overall hypothesis (H0) as if in a single large test. ... In statistics, hypotheses about the value of r, the correlation coefficient between variables x and y of the underlying population, can be tested using the Fisher transformation applied to r. ... Fleiss' kappa is a generalisation of Scott's pi statistic, a statistical measure of inter-rater reliability. ... In statistics, a forecast error is the difference between the actual/real and the predicted/forecast value of a time series. ... A forest plot is a graph displaying the results of multiple studies in a meta-analysis. ... In statistics, fractional factorial designs are experimental designs consisting of a carefully chosen subset (fraction) of the experimental runs of a full factorial design. ... The Freedman-Diaconis rule is used to select the bin width, and hence the number of bins, to be used in a histogram. ... In statistics, a frequency distribution is a list of the values that a variable takes in a sample. ... Statistical regularity has motivated the development of the relative frequency concept of probability. ... Functional data analysis is a series of techniques in statistics for characterizing a series of data points as a single piece of data. ...
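
As an illustration of the Freedman-Diaconis rule listed above: the rule is usually stated as a bin width h = 2 · IQR / n^(1/3), from which a bin count follows by dividing the data range by h. The sketch below assumes that form. The function name `freedman_diaconis_bins` is hypothetical, and the quartiles come from the standard library's "inclusive" method, which is only one of several quantile conventions, so other software may give slightly different widths.

```python
import math
import statistics


def freedman_diaconis_bins(data):
    """Return (bin_width, bin_count) under the Freedman-Diaconis rule.

    Assumption: width = 2 * IQR / n**(1/3); the bin count is the data
    range divided by that width, rounded up.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero, so the rule gives no usable width")
    width = 2 * iqr / n ** (1 / 3)
    count = math.ceil((max(data) - min(data)) / width)
    return width, count


if __name__ == "__main__":
    width, count = freedman_diaconis_bins([1, 2, 3, 4, 5, 6, 7, 10])
    print(width, count)
```

Because the width depends on the interquartile range rather than the standard deviation, a single extreme value (the 10 in the sample data) changes the data range but not the bin width.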

G

In statistics, G-tests are likelihood-ratio or maximum likelihood statistical significance tests that are increasingly being used in situations where chi-square tests were previously recommended. ... The Galton-Watson process is a stochastic process arising from Francis Galton's statistical investigation of the extinction of surnames. ... Galton’s problem, named after Sir Francis Galton, is the problem of drawing inferences from cross-cultural data, due to the statistical phenomenon now called autocorrelation. ... In probability theory and statistics, the gamma distribution is a two-parameter family of continuous probability distributions. ... In statistics, the generalized canonical correlation analysis (gCCA) is a way of making sense of cross-correlation matrices between the sets of random variables when there are more than two sets. ... In statistics, the generalized linear model (GLM) is a useful generalization of ordinary least squares regression. ... The generalized method of moments is a very general statistical method for obtaining estimates of parameters of statistical models. ... In mathematics and physics, Gibbs sampling is an algorithm to generate a sequence of samples from the joint probability distribution of two or more random variables. ... The Gini coefficient is a measure of inequality of income distribution or inequality of wealth distribution. ... Good–Turing Frequency Estimation is a statistical technique for predicting the probability of occurrence of objects belonging to an unknown number of species, given past observations of such objects and their species. ... Goodness of fit means how well a statistical model fits a set of observations. ... 
William Sealy Gosset (June 13, 1876 – October 16, 1937) was a chemist and statistician, better known by his pen name Student. ... An n×n Graeco-Latin square is a table, each cell of which contains a pair of symbols, composed of a symbol from each of two sets of n elements. ... In probability theory and statistics, a graphical model (GM) represents dependencies among random variables by a graph in which each random variable is a node. ...

H

In economics, the Herfindahl index is a measure of the size of firms in relationship to the industry and an indicator of the amount of competition among them. ... In statistics, Halton sequences are well-known quasi-random sequences, first introduced in 1960 as an alternative to pseudo-random number sequences. ... In statistics, the Hannan-Quinn information criterion (HQC) is an alternative to Akaike Information Criterion (AIC) and Bayesian information criterion (BIC). ... The Hausman specification test is the first easy method allowing scientists to evaluate if their statistical models correspond to the data. ... The hazard ratio in survival analysis is a summary of the difference between two survival curves, representing the reduction in the risk of death on treatment compared to control, over the period of follow-up. ... In statistics, a sequence or a vector of random variables is heteroskedastic if the random variables in the sequence or vector may have different variances. ... In statistics, a frequent assumption in linear regression is that the disturbances u_i have the same variance. ... A hidden Markov model (HMM) is a statistical model in which the system being modeled is assumed to be a Markov process with unknown parameters, and the challenge is to... Hierarchical linear modeling (HLM), also known as multi-level analysis, is a more advanced form of simple linear regression and multiple linear regression. ... In statistics, the Holm-Bonferroni method [1] performs more than one hypothesis test simultaneously. ... 
In statistics, a sequence or a vector of random variables is homoscedastic if all random variables in the sequence or vector have the same finite variance. ... In statistics, Hotelling's T-square statistic, named for Harold Hotelling, is a generalization of Student's t statistic that is used in multivariate hypothesis testing. ... The Howland will forgery trial was a US court case in 1868 to decide Henrietta Howland Robinson's contest of the will of Sylvia Ann Howland. ... In econometrics, Huber-White standard errors are standard errors that are adjusted for correlations of error terms across observations, especially in panel and survey data as well as data with cluster structure. ... The Hubbert curve, named after the geophysicist M. King Hubbert, is the derivative of the logistic curve. ...

I

Here is an illustration of the central limit theorem. ... Independent component analysis (ICA) is a computational method for separating a multivariate signal into additive subcomponents supposing the mutual statistical independence of the non-Gaussian source signals. ... In probability theory, a sequence or other collection of random variables is independent and identically distributed (i. ... For probability distributions having an expected value and a median, the mean (i. ... The information bottleneck method is a technique for finding the best trade-off between accuracy and compression when summarizing (e. ... In statistics, an instrumental variable (IV, or instrument) can be used in regression analysis to produce a consistent estimator when the explanatory variables (covariates) are correlated with the error terms. ... In statistics, the interclass correlation (or interclass correlation coefficient) measures a bivariate relation among variables. ... In statistics, interclass dependence (or class interdependence) means that the occurrence of one class is probabilistically dependent on other classes that may occur in the same space. ... In descriptive statistics, the interquartile range (IQR), also called the midspread and middle fifty, is the range between the third and first quartiles and is a measure of statistical dispersion. ... Inter-rater reliability or inter-rater agreement is the measurement of agreement between raters. ... In statistics, interval estimation is the use of sample data to calculate an interval of possible (or probable) values of an unknown population parameter. ... 
An intervening variable is a hypothetical construct that attempts to explain relationships between variables, and especially the relationships between independent variables and dependent variables. ... In statistics, the intraclass correlation (or the intraclass correlation coefficient[1]) is a measure of correlation, consistency or conformity for a data set when it has multiple groups. ... In statistics, the inverse Wishart distribution, also called the inverted Wishart distribution, is a probability density function defined on matrices. ... Inverse transform sampling, also known as the probability integral transform, is a method of sampling a number at random from any probability distribution given its cumulative distribution function (cdf). ... Item response theory (IRT) is a body of related psychometric theory that provides a foundation for scaling persons and items based on responses to assessment items. ... The method of iteratively re-weighted least squares (IRLS) is a numerical algorithm for minimizing any specified objective function using a standard weighted least squares method such as Gaussian elimination. ...
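
As an illustration of inverse transform sampling as described above: draw a uniform number u and return the value x whose cumulative distribution function equals u. The sketch below assumes an exponential target distribution, chosen only because its cdf F(x) = 1 - exp(-rate · x) inverts in closed form. The function name `sample_exponential` and the parameter `rate` are hypothetical.

```python
import math
import random


def sample_exponential(rate, rng):
    """Draw one sample from an exponential distribution by inverting its cdf.

    Assumption: F(x) = 1 - exp(-rate * x), so F^{-1}(u) = -ln(1 - u) / rate.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    u = rng.random()  # uniform on [0, 1), so 1 - u lies in (0, 1]
    return -math.log(1.0 - u) / rate


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    draws = [sample_exponential(2.0, rng) for _ in range(20000)]
    # The sample mean should be close to 1 / rate.
    print(sum(draws) / len(draws))
```

The same pattern applies to any distribution whose cdf can be inverted, analytically or numerically; only the expression for the inverse changes.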

J

The James-Stein estimator is a nonlinear estimator which can be shown to dominate, or outperform, the ordinary (least squares) estimator. ... In statistics, the Jarque-Bera test is a goodness-of-fit measure of departure from normality, based on the sample kurtosis and skewness. ... In Bayesian probability, the Jeffreys prior is a noninformative prior distribution proportional to the square root of the Fisher information: $p(\theta) \propto \sqrt{\mathcal{I}(\theta)}$, and is invariant under reparameterization of $\theta$. ...

K

The Kaplan-Meier estimator (also known as the Product Limit Estimator) estimates the survival function from life-time data. ... Cohen's kappa coefficient is a statistical measure of inter-rater reliability. ... A kappa statistic is a measure of degree of nonrandom agreement between observers and/or measurements of a specific categorical variable. ... The Kendall tau distance is a metric that counts the number of pairwise disagreements between two lists. ... The Kendall tau rank correlation coefficient (or simply the Kendall tau coefficient, Kendall's τ or Tau test(s)) is used to measure the degree of correspondence between two rankings and assessing the significance of this correspondence. ... The 5-parameter Fisher-Bingham distribution or Kent distribution is a probability distribution on the three-dimensional sphere. ... A kernel is a weighting function used in non-parametric estimation techniques. ... In statistics, the Kolmogorov-Smirnov test (often called the K-S test) is used to determine whether two underlying probability distributions differ, or whether an underlying probability distribution differs from a hypothesized distribution, in either case based on finite samples. ... Kriging is a group of geostatistical techniques to interpolate the value of a random field (e. ... In statistics, the Kruskal-Wallis one-way analysis of variance by ranks (named after William Kruskal and Allen Wallis) is a non-parametric method. ... In statistics, Kuiper's test is closely related to the more well-known Kolmogorov-Smirnov test (or K-S test as it is often called). ... In probability theory and information theory, the Kullback-Leibler divergence (or information divergence, or information gain, or relative entropy) is a natural distance measure from a true probability distribution P to an arbitrary probability distribution Q. Typically P represents data, observations, or a precise calculated probability distribution. ... 
The far red light has no effect on the average speed of the gravitropic reaction in wheat coleoptiles, but it changes kurtosis from platykurtic to leptokurtic (-0. ...
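
As an illustration of the Kendall tau distance listed above: the sketch below counts the pairs of items that two rankings place in opposite order. It assumes both rankings are lists containing the same items with no ties, and the function name `kendall_tau_distance` is hypothetical. The quadratic pair loop favors readability over speed.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Count the pairs of items that the two rankings order differently.

    Assumption: both rankings contain exactly the same items, no repeats.
    """
    if len(set(ranking_a)) != len(ranking_a) or len(set(ranking_b)) != len(ranking_b):
        raise ValueError("rankings must not contain repeated items")
    if set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must contain the same items")
    # Position of each item in the second ranking.
    position_b = {item: index for index, item in enumerate(ranking_b)}
    # combinations() yields (x, y) with x ahead of y in ranking_a, so the
    # pair is a disagreement exactly when x comes after y in ranking_b.
    return sum(
        1
        for x, y in combinations(ranking_a, 2)
        if position_b[x] > position_b[y]
    )


if __name__ == "__main__":
    print(kendall_tau_distance([1, 2, 3, 4, 5], [3, 4, 1, 2, 5]))
```

Identical rankings give a distance of 0, and a fully reversed ranking of n items gives the maximum, n(n-1)/2.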

L

Latent variables, as opposed to observable variables, are those variables that cannot be directly observed but are rather inferred from other variables that can be observed and directly measured. ... In statistics the latent class model (LCM) relates a set of discrete multivariate variables to a set of latent variables. ... A Latin square is an n × n table filled with n different symbols in such a way that each symbol occurs exactly once in each row and exactly once in each column. ... The statistical method Latin hypercube sampling (LHS) was developed along by Ronald L. Iman, J. C. Helton, and J. E. Campbell, et al to gener