Principal component analysis

In statistics, principal components analysis (PCA) is a technique that can be used to simplify a dataset. More formally, it is a linear transformation that chooses a new coordinate system for the dataset such that the greatest variance by any projection of the dataset comes to lie on the first axis (then called the first principal component), the second greatest variance on the second axis, and so on. PCA can be used to reduce the dimensionality of a dataset while retaining those characteristics that contribute most to its variance: the first principal components are kept and the later ones are discarded (how many to keep is a more or less heuristic decision). These characteristics may be the "most important", but this is not necessarily the case, depending on the application.


PCA is also called the Karhunen-Loève transform (named after Kari Karhunen and Michel Loève) or the Hotelling transform (in honor of Harold Hotelling). PCA has the distinction of being the optimal linear transformation for keeping the subspace that has the largest variance. However, this comes at the price of greater computational requirements, for example when compared to the discrete cosine transform. Unlike other linear transforms, PCA does not have a fixed set of basis vectors; its basis vectors depend on the dataset.


Assuming zero empirical mean (the empirical mean of the distribution has been subtracted from the dataset), the first principal component $w_1$ of a dataset $x$ can be defined as:

$$w_1 = \arg\max_{\|w\| = 1} E\left\{ (w^T x)^2 \right\}$$

(See arg max for the notation.) Given the first $k - 1$ components, the $k$-th component can be found by subtracting the first $k - 1$ principal components from $x$:

$$\hat{x}_{k-1} = x - \sum_{i=1}^{k-1} w_i w_i^T x$$

and by substituting this as the new dataset in which to find a principal component:

$$w_k = \arg\max_{\|w\| = 1} E\left\{ (w^T \hat{x}_{k-1})^2 \right\}.$$
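
The one-component-at-a-time definition above can be sketched in plain Python. This is only an illustration under stated assumptions: power iteration is swapped in as one possible way to approximate the arg max, and the function names (`first_component`, `deflate`, `principal_components`) and the toy data are hypothetical.

```python
import math
import random


def first_component(data, iters=500, seed=0):
    """Approximate the unit vector w maximizing the mean of (w . x)^2."""
    rng = random.Random(seed)
    d = len(data[0])
    w = [rng.gauss(0.0, 1.0) for _ in range(d)]
    scale = math.sqrt(sum(v * v for v in w))
    w = [v / scale for v in w]
    for _ in range(iters):
        # Power-iteration step: sum over x of (w . x) x is proportional to C w,
        # where C is the empirical covariance of the zero-mean data.
        new_w = [0.0] * d
        for x in data:
            proj = sum(wi * xi for wi, xi in zip(w, x))
            for j in range(d):
                new_w[j] += proj * x[j]
        norm = math.sqrt(sum(v * v for v in new_w))
        if norm == 0.0:
            break  # no variance left in the data
        w = [v / norm for v in new_w]
    return w


def deflate(data, w):
    """Subtract from every x its projection onto the unit vector w."""
    out = []
    for x in data:
        proj = sum(wi * xi for wi, xi in zip(w, x))
        out.append([xi - proj * wi for xi, wi in zip(x, w)])
    return out


def principal_components(data, k):
    """Return the first k components of zero-mean data, found one at a time."""
    components = []
    residual = [list(x) for x in data]
    for _ in range(k):
        w = first_component(residual)
        components.append(w)
        residual = deflate(residual, w)
    return components


# Hypothetical zero-mean toy data with most of its spread along (1, 1).
data = [[2.0, 2.0], [-2.0, -2.0], [1.0, -1.0], [-1.0, 1.0]]
w1, w2 = principal_components(data, 2)
# w1 is parallel to (1, 1) and w2 to (1, -1), each up to sign.
```

The sign of each component is arbitrary, since $w$ and $-w$ give the same projected variance.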

A simpler way to calculate the components $w_i$ uses the empirical covariance matrix of x, the measurement vector. The eigenvectors of the covariance matrix with the largest eigenvalues correspond to the directions of greatest variance in the dataset. The original measurements are finally projected onto the reduced vector space spanned by those eigenvectors. Note that if X is the zero-mean data matrix with one measurement vector per row, these eigenvectors are the columns of the matrix V, where X = ULV′ is the singular value decomposition of X.
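
As a rough check of the relation between the covariance eigenvectors and the singular value decomposition, the sketch below computes both with NumPy on made-up data. It assumes one measurement vector per row of `X` and a covariance normalized by the number of rows; the data and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 measurement vectors (rows) with 3 correlated entries.
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.2, 0.1]])
X = rng.normal(size=(200, 3)) @ mixing
X = X - X.mean(axis=0)                      # zero empirical mean
n = X.shape[0]

# Route 1: eigendecomposition of the empirical covariance matrix.
C = X.T @ X / n
eigvals, eigvecs = np.linalg.eigh(C)        # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]           # reorder to decreasing eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: singular value decomposition X = U L V'.
U, L, Vt = np.linalg.svd(X, full_matrices=False)

# The columns of V match the eigenvectors up to sign,
# and the squared singular values divided by n match the eigenvalues.
same_directions = np.allclose(np.abs(Vt @ eigvecs), np.eye(3), atol=1e-6)
same_variances = np.allclose(L ** 2 / n, eigvals)

# Project the measurements onto the two leading eigenvectors.
reduced = X @ eigvecs[:, :2]
```

In practice the SVD route avoids forming the covariance matrix explicitly, which can be preferable numerically.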


PCA is equivalent to empirical orthogonal functions (EOF).


PCA is a popular technique in pattern recognition. However, PCA is not optimized for class separability. An alternative is linear discriminant analysis, which does take this into account. Among linear projections onto a subspace of a given dimension, PCA minimizes the reconstruction error under the L2 norm.
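
The reconstruction-error property can be illustrated numerically. The sketch below, on made-up data, compares the mean squared L2 error of a rank-2 PCA reconstruction with that of an arbitrary random orthonormal basis of the same dimension; the helper `reconstruction_error` and all data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical data: 300 rows, 4 columns with very different spreads.
X = rng.normal(size=(300, 4)) * np.array([5.0, 2.0, 1.0, 0.5])
X = X - X.mean(axis=0)


def reconstruction_error(X, B):
    """Mean squared L2 error after projecting rows of X onto the columns of B."""
    X_hat = X @ B @ B.T
    return np.mean(np.sum((X - X_hat) ** 2, axis=1))


# PCA basis: the two eigenvectors of the covariance with the largest eigenvalues.
eigvals, V = np.linalg.eigh(X.T @ X / len(X))   # ascending eigenvalues
pca_basis = V[:, ::-1][:, :2]

# An arbitrary 2-dimensional orthonormal basis for comparison.
Q, _ = np.linalg.qr(rng.normal(size=(4, 2)))

pca_err = reconstruction_error(X, pca_basis)
rand_err = reconstruction_error(X, Q)
# pca_err equals the sum of the discarded eigenvalues and never exceeds rand_err.
```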

Algorithm details

The following is a step-by-step description of PCA using the covariance method. Suppose you have n data vectors of d dimensions each, and you want to project your data into a k-dimensional subspace.


Find the basis vectors

  1. Organize your data into column vectors, so you end up with a d × n matrix, D.
  2. Find the empirical mean along each dimension, so you end up with a d × 1 empirical mean vector, M.
  3. Subtract the empirical mean vector M from each column of the data matrix D. Store the mean-subtracted data matrix in S.
  4. Find the empirical covariance matrix C of S: $C = \frac{1}{n} S S^T$.
  5. Compute the eigenvectors V of C and sort them by decreasing eigenvalue.
  6. Save the mean vector M. Save the first k columns of V as P. P will have dimension d × k.
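
The six steps above might be sketched with NumPy as follows. The function name `pca_fit` and the toy data are hypothetical, and dividing by n in the covariance is an assumption (dividing by n − 1 would change the eigenvalues but not the eigenvectors).

```python
import numpy as np


def pca_fit(D, k):
    """Covariance-method PCA; D is d x n with one data vector per column."""
    d, n = D.shape
    M = D.mean(axis=1, keepdims=True)    # step 2: empirical mean, d x 1
    S = D - M                            # step 3: mean-subtracted data
    C = S @ S.T / n                      # step 4: empirical covariance, d x d
    eigvals, V = np.linalg.eigh(C)       # step 5: eigh suits the symmetric C
    order = np.argsort(eigvals)[::-1]    #         sort by decreasing eigenvalue
    V = V[:, order]
    P = V[:, :k]                         # step 6: first k columns, d x k
    return M, P


# Hypothetical toy data: n = 5 vectors of d = 3 dimensions, stored as columns.
D = np.array([[2.0, 4.0, 6.0, 8.0, 10.0],
              [1.0, 2.1, 2.9, 4.2, 5.0],
              [0.5, 0.4, 0.6, 0.5, 0.5]])
M, P = pca_fit(D, k=2)
```

`np.linalg.eigh` is used rather than a general eigenvalue routine because the covariance matrix is symmetric, so the returned eigenvectors are real and orthonormal.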

Projecting new data

Suppose you have a d×1 data vector D. Then the k×1 projected vector is $v = P^T (D - M)$.
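
A small worked example of the projection, written with plain Python lists so the arithmetic stays visible. The function name `project` and the values of P, M and D are made up for illustration; P is simply chosen to have orthonormal columns rather than being computed from data.

```python
def project(P, M, D):
    """Return v = P^T (D - M), with P given as d rows of k entries each."""
    centered = [x - m for x, m in zip(D, M)]
    k = len(P[0])
    return [sum(P[i][j] * centered[i] for i in range(len(centered)))
            for j in range(k)]


# Hypothetical basis with d = 3 and k = 2; its two columns are orthonormal.
P = [[0.6, -0.8],
     [0.8, 0.6],
     [0.0, 0.0]]
M = [1.0, 2.0, 3.0]        # saved mean vector
D = [4.0, 6.0, 9.0]        # new data vector

v = project(P, M, D)
# D - M = (3, 4, 6), so v is approximately (5, 0) up to floating-point rounding.
```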


Derivation of PCA using the covariance method

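Let X be a d-dimensional random vector expressed as a column vector. Without loss of generality, assume X has zero empirical mean. We want to find a d × d orthonormal projection matrix P such that

$$Y = P^T X$$

with the constraint that

$\operatorname{cov}(Y)$ is a diagonal matrix and $P^{-1} = P^T$.

By substitution and matrix algebra, we get

$$\operatorname{cov}(Y) = E[Y Y^T] = E[(P^T X)(P^T X)^T] = E[(P^T X)(X^T P)] = P^T E[X X^T] P = P^T \operatorname{cov}(X) P.$$

We now have

$$P \operatorname{cov}(Y) = P P^T \operatorname{cov}(X) P = \operatorname{cov}(X) P.$$

Rewrite P as d column vectors of size d × 1, so

$$P = [P_1, P_2, \ldots, P_d]$$

and $\operatorname{cov}(Y)$ as

$$\operatorname{cov}(Y) = \begin{bmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_d \end{bmatrix}.$$

Substituting into the equation above, we get

$$[\lambda_1 P_1, \lambda_2 P_2, \ldots, \lambda_d P_d] = [\operatorname{cov}(X) P_1, \operatorname{cov}(X) P_2, \ldots, \operatorname{cov}(X) P_d].$$

Notice that in $\lambda_i P_i = \operatorname{cov}(X) P_i$, $P_i$ is an eigenvector of X's covariance matrix. Therefore, by finding the eigenvectors of X's covariance matrix, we find a projection matrix P that satisfies the original constraints.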
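
The conclusion of the derivation can be checked numerically. The sketch below, using NumPy on a made-up sample, builds P from the eigenvectors of the empirical covariance of X and confirms that Y = PᵀX has a diagonal covariance; the sample and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical sample: 500 draws of a 3-dimensional vector X, stored as columns.
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.5, 0.3]])
X = mixing @ rng.normal(size=(3, 500))
X = X - X.mean(axis=1, keepdims=True)       # zero empirical mean

cov_X = X @ X.T / X.shape[1]
eigvals, P = np.linalg.eigh(cov_X)          # columns of P are the eigenvectors P_i

Y = P.T @ X
cov_Y = Y @ Y.T / Y.shape[1]

orthonormal = np.allclose(P.T @ P, np.eye(3))                 # P^{-1} = P^T
diagonal = np.allclose(cov_Y, np.diag(eigvals), atol=1e-10)   # cov(Y) = diag(lambda_i)
eigen_eq = np.allclose(cov_X @ P, P * eigvals)                # cov(X) P_i = lambda_i P_i
```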


Correspondence analysis

Correspondence analysis is conceptually similar to PCA, but scales the data (which must be non-negative) so that rows and columns are treated equivalently. It is traditionally applied to contingency tables where Pearson's chi-square test has shown a relationship between rows and columns.


The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.