FACTOID # 136: If you're looking to invade someone by sea, try Canada, which has only 9000 Navy personnel guarding the longest national coastline in the world.
 
 Home   Encyclopedia   Statistics   Countries A-Z   Flags   Maps   Education   Forum   FAQ   About 
 
WHAT'S NEW
RECENT ARTICLES
More Recent Articles »
 

Encyclopedia > Ordinary least squares

Least squares is a mathematical optimization technique that attempts to find a "best fit" to a set of data by attempting to minimize the sum of the squares of the differences (called residuals) between the fitted function and the data. Mathematics is the study of quantity, structure, space and change. ... In mathematics, the term optimization refers to the study of problems that have the form Given: a function f : A R from some set A to the real numbers Sought: an element x0 in A such that f(x0) ≤ f(x) for all x in A (minimization) or such that... A graph illustrating local min/max and global min/max points In mathematics, a point x* is a local maximum of a function f if there exists some ε > 0 such that f(x*) ≥ f(x) for all x with |x-x*| < ε. ... In statistics, the concepts of error and residual are easily confused with each other. ... In mathematics, a function is a relation, such that each element of a set (the domain) is associated with a unique element of another (possibly the same) set (the codomain, not to be confused with the range). ...


An implicit requirement for the least squares method to work is that errors in each measurement be randomly distributed (ideally they should come from Gaussian distribution). See also Gauss-Markov theorem. It is also important that the collected data be well chosen, so as to allow visibility into the variables to be solved for. The normal distribution, also called Gaussian distribution, is an extremely important probability distribution in many fields, especially in physics and engineering. ... This article is not about Gauss-Markov processes. ...


The least squares technique is commonly used in curve fitting. Many other optimization problems can also be expressed in a least squares form, by either minimizing energy or maximizing entropy. Regression analysis is any statistical method where the mean of one or more random variables is predicted conditioned on other (measured) random variables. ... The thermodynamic entropy S, often simply called the entropy in the context of thermodynamics, is a measure of the amount of energy in a physical system that cannot be used to do work. ...

Contents


Formulation of the problem

Suppose that the data set consists of the points (xi, yi) with i = 1, 2, ..., n. We want to find a function f such that

f(x_i)approx y_i.

To attain this goal, we suppose that the function f is of a particular form containing some parameters which need to be determined. For instance, suppose that it is quadratic, meaning that f(x) = ax2 + bx + c, where a, b and c are not yet known. We now seek the values of a, b and c that minimize the sum of the squares of the residuals: f(x) = x2 - x - 2 In mathematics, a quadratic function is a polynomial function of the form , where a is nonzero. ...

S = sum_{i=1}^n (y_i - f(x_i))^2.

This explains the name least squares.


Solving the least squares problem

In the above example, f is linear in the parameters a, b and c. The problem simplifies considerably in this case and essentially reduces to a system of linear equations. This is explained in the article on linear least squares. A linear function is a mathematical function term of the form: f(x) = m x + c where c is a constant. ... In mathematics and linear algebra, a system of linear equations is a set of linear equations such as 3x1 + 2x2 − x3 = 1 2x1 − 2x2 + 4x3 = −2 −x1 + ½x2 − x3 = 0. ... Linear least squares is a mathematical optimization technique to find an approximate solution for a system of linear equations that has no exact solution. ...


The problem is more difficult if f is not linear in the parameters to be determined. We then need to solve a general (unconstrained) optimization problem. Any algorithm for such problems, like Newton's method and gradient descent, can be used. Another possibility is to apply an algorithm that is developed especially to tackle least squares problems, for instance the Gauss-Newton algorithm or the Levenberg-Marquardt algorithm. In mathematics, the term optimization refers to the study of problems that have the form Given: a function f : A R from some set A to the real numbers Sought: an element x0 in A such that f(x0) ≤ f(x) for all x in A (minimization) or such that... Gradient descent is an optimization algorithm that approaches a local maximum of a function by taking steps proportional to the gradient (or the approximate gradient) of the function at the current point. ... In mathematics, the Gauss-Newton algorithm is used to solve nonlinear least squares problems. ... The Levenberg-Marquardt algorithm provides a numerical solution to the mathematical problem of minimizing a sum of squares of several, generally nonlinear functions that depend on a common set of parameters. ...


Least squares and regression analysis

In regression analysis, one replaces the relation Regression analysis is any statistical method where the mean of one or more random variables is predicted conditioned on other (measured) random variables. ...

f(x_i)approx y_i

by

f(x_i) = y_i + varepsilon_i,

where the noise term ε is a random variable with mean zero. Again, we distinguish between linear regression, in which case the function f is linear in the parameters to be determined (e.g., f(x) = ax2 + bx + c), and nonlinear regression. As before, linear regression is much simpler than nonlinear regression. (It is tempting to think that the reason for the name linear regression is that the graph of the function f(x) = ax + b is a line. Fitting a curve f(x) = ax2 + bx + c, estimating a, b, and c by least squares, is an instance of linear regression because the vector of least-square estimates of a, b, and c is a linear transformation of the vector whose components are f(xi) + εi.) A random variable can be thought of as the numeric result of operating a non-deterministic mechanism or performing a non-deterministic experiment to generate a random result. ... Linear regression - Wikipedia /**/ @import /skins-1. ... Nonlinear regression in statistics is the problem of fitting a model to multidimensional x,y data, where f is a nonlinear function of x with parameters θ. ... In mathematics, a linear transformation (also called linear operator or linear map) is a function between two vector spaces that preserves the operations of vector addition and scalar multiplication. ...


One frequently estimates the parameters (a, b and c in the above example) by least squares: those values are taken that minimize S. The Gauss-Markov theorem states that the least squares estimates are optimal in a certain sense, if we take f(x) = ax + b with a and b to be determined and the noise terms ε are independent and identically distributed (see the article for a more precise statement and less restrictive conditions on the noise terms). This article is not about Gauss-Markov processes. ...


External links


  Results from FactBites:
 
Least squares Summary (1107 words)
Least squares is a mathematical optimization technique which, when given a series of measured data, attempts to find a function which closely approximates the data (a "best fit").
+ bx + c, estimating a, b, and c by least squares, is an instance of linear regression because the vector of least-square estimates of a, b, and c is a linear transformation of the vector whose components are f(x
Least squares estimation for linear models is notoriously non-robust to outliers.
Partial Least Squares (PLS) (3719 words)
In short, partial least squares regression is probably the least restrictive of the various multivariate extensions of the multiple linear regression model.
For establishing the model, partial least squares regression produces a p by c weight matrix W for X such that T=XW, i.e., the columns of W are weight vectors for the X columns producing the corresponding n by c factor score matrix T.
Ordinary least squares procedures for the regression of Y on T are then performed to produce Q, the loadings for Y (or weights for Y) such that Y=TQ+E.
  More results at FactBites »

 

COMMENTARY     


Share your thoughts, questions and commentary here
Your name
Your location
Your comments
Please enter the 5-letter protection code


Lesson Plans | Student Area | Student FAQ | Reviews | Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.