Linear model

In statistics the linear model is given by

Y = X\beta + \varepsilon

where Y is an n×1 column vector of random variables, X is an n×p matrix of "known" (i.e. observable and non-random) quantities, whose rows correspond to statistical units, β is a p×1 vector of (unobservable) parameters, and ε is an n×1 vector of "errors", which are uncorrelated random variables each with expected value 0 and variance σ².
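As a rough illustration of the definition above, the following sketch draws one realisation of Y for a fixed design matrix. The function name `simulate_linear_model`, the data, and the choice of normal errors are assumptions made for the example; the model itself only requires uncorrelated errors with mean 0 and variance σ².

```python
import random


def simulate_linear_model(X, beta, sigma, seed=0):
    """Draw Y = X beta + eps with independent N(0, sigma^2) errors.

    Normal errors are an assumption of this sketch; the linear model only
    asks for uncorrelated errors with mean 0 and common variance sigma^2.
    """
    rng = random.Random(seed)
    return [
        sum(x * b for x, b in zip(row, beta)) + rng.gauss(0.0, sigma)
        for row in X
    ]


# Hypothetical design: an intercept column and one predictor (n = 3, p = 2).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
beta = [1.0, 2.0]

Y = simulate_linear_model(X, beta, sigma=0.5)
# With sigma = 0 the errors vanish and Y equals X beta exactly.
Y_noiseless = simulate_linear_model(X, beta, sigma=0.0)
```

Each row of X plays the role of one statistical unit, and only Y is random.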


Much of the theory of linear models is associated with inferring the values of the parameters β and σ². Typically this is done using the method of maximum likelihood, which in the case of normal errors is equivalent to the method of least squares. By the Gauss-Markov theorem, the least-squares estimator is also the best linear unbiased estimator of β when the errors are uncorrelated with expected value 0 and equal variances, whether or not they are normal.


Assumptions

Multivariate normal errors

Often one takes the components of the vector of errors to be independent and normally distributed, giving Y a multivariate normal distribution with mean Xβ and covariance matrix σ²I, where I is the identity matrix. Having observed the values of X and Y, the statistician must estimate β and σ².


Rank of X

We usually assume that X is of full rank p, which allows us to invert the p × p matrix X^{\top} X. The essence of this assumption is that the columns of X (the explanatory variables) are not linearly dependent upon one another; if they were, different values of β would give the same mean Xβ. Full rank therefore also ensures the model is identifiable.




Methods of inference

Maximum likelihood

β

The log-likelihood function (for ε_i independent and normally distributed) is

l(\beta, \sigma^2; Y) = -\frac{n}{2} \log (2 \pi \sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^n \left(Y_i - x_i^{\top} \beta \right)^2

where x_i^{\top} is the ith row of X. Differentiating with respect to β_j, we get

\frac{\partial l}{\partial \beta_j} = \frac{1}{\sigma^2} \sum_{i=1}^n x_{ij} \left( Y_i - x_i^{\top} \beta \right)

so setting this set of p equations to zero and solving for β gives

X^{\top} X \hat{\beta} = X^{\top} Y.

Now, using the assumption that X has rank p, we can invert the matrix on the left hand side to give the maximum likelihood estimate for β:

\hat{\beta} = (X^{\top} X)^{-1} X^{\top} Y.
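The estimate can be computed by solving the normal equations directly. The sketch below uses only the Python standard library; the helper names `solve` and `ols` and the data are assumptions for illustration, and in practice one would normally call a numerical linear algebra library rather than hand-written elimination.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix, so row operations act on A and b together.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Use the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: X does not have full rank")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(X, Y):
    """Return beta_hat solving the normal equations X^T X beta = X^T Y."""
    n, p = len(X), len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    XtY = [sum(X[i][a] * Y[i] for i in range(n)) for a in range(p)]
    return solve(XtX, XtY)


# Hypothetical data lying exactly on the line Y = 1 + 2x.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Y = [1.0, 3.0, 5.0, 7.0]
beta_hat = ols(X, Y)
```

Solving the system avoids forming the inverse of X^T X explicitly, and the `ValueError` corresponds to the case where X is not of full rank.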

We can check that this is a maximum by looking at the Hessian matrix of the log-likelihood function: with respect to β it equals −X^{\top}X/σ², which is negative definite when X has full rank.


σ²

By setting the right hand side of

\frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2 \sigma^4} \sum_{i=1}^n \left(Y_i - x_i^{\top} \beta \right)^2

to zero and solving for σ² we find that

\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n \left(Y_i - x_i^{\top} \hat{\beta} \right)^2 = \frac{1}{n} \| Y - X \hat{\beta} \|^2.
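The variance estimate is the mean squared residual. A minimal sketch follows, in which the function name `sigma2_mle` and the data are assumptions; the coefficients used are the least-squares fit for this particular data set.

```python
def sigma2_mle(X, Y, beta_hat):
    """MLE of the error variance: residual sum of squares divided by n."""
    n = len(Y)
    residuals = [
        Y[i] - sum(x * b for x, b in zip(X[i], beta_hat)) for i in range(n)
    ]
    return sum(r * r for r in residuals) / n


# Hypothetical data with an intercept column and one predictor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Y = [1.0, 2.0, 2.0, 3.0]
# Least-squares coefficients for this data (intercept 1.1, slope 0.6).
beta_hat = [1.1, 0.6]

sigma2_hat = sigma2_mle(X, Y, beta_hat)  # residual sum of squares 0.2 over n = 4
```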

Accuracy of maximum likelihood estimation

Since we have that Y follows a multivariate normal distribution with mean Xβ and covariance matrix σ²I, we can deduce the distribution of the MLE of β:

\hat{\beta} = (X^{\top} X)^{-1} X^{\top} Y \sim N_p (\beta, (X^{\top}X)^{-1} \sigma^2 ).

So this estimate is unbiased for β, and we can show that this variance achieves the Cramér-Rao bound.


A more complicated argument[1] shows that

\hat{\sigma}^2 \sim \frac{\sigma^2}{n} \chi^2_{n-p};

since a chi-squared distribution with n − p degrees of freedom has mean n − p, the expected value of \hat{\sigma}^2 is σ²(n − p)/n, so this estimate is only asymptotically unbiased.
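To make the size of the bias concrete, the sketch below compares the maximum likelihood estimate with the usual bias-corrected estimate that divides by n − p instead of n. It relies on the standard result that n times the MLE divided by σ² has a chi-squared distribution with n − p degrees of freedom; the function name `variance_estimates` and the numbers are assumptions for illustration.

```python
def variance_estimates(rss, n, p):
    """Return (MLE, bias-corrected estimate) of sigma^2 from the residual
    sum of squares `rss`, with n observations and p parameters."""
    if n <= p:
        raise ValueError("need more observations than parameters")
    mle = rss / n             # expected value sigma^2 * (n - p) / n
    unbiased = rss / (n - p)  # expected value sigma^2
    return mle, unbiased


# Hypothetical fit: residual sum of squares 0.2 from n = 4 points, p = 2.
mle, unbiased = variance_estimates(0.2, n=4, p=2)
shrink = mle / unbiased  # equals (n - p) / n
```

The ratio (n − p)/n tends to 1 as n grows with p fixed, which is the sense in which the MLE is asymptotically unbiased.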


Generalizations

Generalized least squares

If, rather than taking the variance of ε to be σ²I, where I is the n×n identity matrix, one assumes the variance is σ²M, where M is a known matrix other than the identity matrix, then one estimates β by the method of "generalized least squares". Instead of minimizing the sum of squares of the residuals, one minimizes a different quadratic form in the residuals, namely the one given by the matrix M^{-1}:

\min_{\beta} \left(Y - X\beta\right)^{\top} M^{-1} \left(Y - X\beta\right)

This has the effect of "de-correlating" normal errors, and leads to the estimator

\hat{\beta} = \left(X^{\top} M^{-1} X\right)^{-1} X^{\top} M^{-1} Y

which is the best linear unbiased estimator for β. If all of the off-diagonal entries in the matrix M are 0, then one normally estimates β by the method of weighted least squares, with weights proportional to the reciprocals of the diagonal entries.
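For the diagonal case, the generalized least squares estimator reduces to a weighted version of the normal equations. The following sketch handles only a diagonal M, passed as the list `m_diag`; the names `solve` and `wls` and the data are assumptions for illustration, and only the standard library is used.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: X does not have full rank")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def wls(X, Y, m_diag):
    """Weighted least squares for Var(eps) = sigma^2 * diag(m_diag).

    Solves (X^T W X) beta = X^T W Y with weights W = diag(1 / m_i).
    """
    n, p = len(X), len(X[0])
    w = [1.0 / m for m in m_diag]
    A = [[sum(w[i] * X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
         for a in range(p)]
    c = [sum(w[i] * X[i][a] * Y[i] for i in range(n)) for a in range(p)]
    return solve(A, c)


# Intercept-only model: the estimate is a weighted mean of the observations.
# The second observation has twice the variance, so it gets half the weight.
beta_weighted = wls([[1.0], [1.0]], [0.0, 3.0], m_diag=[1.0, 2.0])

# With equal variances the result coincides with ordinary least squares.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Y = [1.0, 2.0, 2.0, 3.0]
beta_equal = wls(X, Y, m_diag=[1.0, 1.0, 1.0, 1.0])
```

Observations with larger variance receive smaller weight, so they pull the fit less.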


Generalized linear models

Generalized linear models are those for which, rather than

E(Y) = Xβ,

one has

g(E(Y)) = Xβ,

where g is the "link function". The distribution of Y is also not restricted to being normal.


An example is the Poisson regression model, which states that

Y_i has a Poisson distribution with expected value e^{γ + δ x_i}.

The link function is the natural logarithm function. Having observed x_i and Y_i for i = 1, ..., n, one can estimate γ and δ by the method of maximum likelihood.
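Unlike the normal linear model, the Poisson likelihood equations have no closed-form solution, so the estimates are found numerically. The sketch below uses Newton's method on the log-likelihood; the function name `fit_poisson`, the starting values, the tolerances and the data are all assumptions for illustration, and statistical software typically uses the closely related iteratively reweighted least squares algorithm.

```python
import math


def fit_poisson(x, y, max_iter=50, tol=1e-10):
    """Estimate (gamma, delta) in E(Y_i) = exp(gamma + delta * x_i)
    by Newton's method on the Poisson log-likelihood."""
    if sum(y) <= 0:
        raise ValueError("need at least one positive count")
    # Start from the intercept-only fit: exp(gamma) equal to the mean of y.
    gamma, delta = math.log(sum(y) / len(y)), 0.0
    for _ in range(max_iter):
        mu = [math.exp(gamma + delta * xi) for xi in x]
        # Score vector: derivatives of the log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Negative Hessian, a 2 x 2 matrix [[s0, s1], [s1, s2]].
        s0 = sum(mu)
        s1 = sum(mi * xi for xi, mi in zip(x, mu))
        s2 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = s0 * s2 - s1 * s1
        if abs(det) < 1e-12:
            raise ValueError("x values do not identify both parameters")
        # Newton step: solve the 2 x 2 system by Cramer's rule.
        step_gamma = (s2 * g0 - s1 * g1) / det
        step_delta = (s0 * g1 - s1 * g0) / det
        gamma += step_gamma
        delta += step_delta
        if max(abs(step_gamma), abs(step_delta)) < tol:
            return gamma, delta
    raise RuntimeError("Newton iteration did not converge")


# Hypothetical counts that double with each unit increase in x, so the
# fitted values can match the data exactly with gamma = 0, delta = log 2.
gamma_hat, delta_hat = fit_poisson([0.0, 1.0, 2.0], [1, 2, 4])
```

Because the Poisson log-likelihood with the log link is concave in (γ, δ), a stationary point found this way is the maximum.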


General linear model

The general linear model (or multivariate regression model) is a linear model with multiple measurements per object. Each object may be represented in a vector.


See also

  • ANOVA, or analysis of variance, is historically a precursor to the development of linear models. Here the model parameters themselves are not computed, but X column contributions and their significance are identified using the ratios of within-group variances to the error variance and applying the F test.
  • Linear regression
  • Robust regression

The Wikipedia article included on this page is licensed under the GFDL.