Statistical classification

Statistical classification is a procedure in which individual items are placed into groups based on quantitative information on one or more characteristics inherent in the items (referred to as traits, variables, characters, etc.) and based on a training set of previously labeled items.


Formally, the problem can be stated as follows: given training data $\{(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_n, y_n)\}$, produce a classifier $h: \mathcal{X} \rightarrow \mathcal{Y}$ which maps an object $\mathbf{x} \in \mathcal{X}$ to its classification label $y \in \mathcal{Y}$. For example, if the problem is filtering spam, then $\mathbf{x}_i$ is some representation of an email and $y_i$ is either "Spam" or "Non-Spam".
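The formal statement above can be sketched in code as a function that takes labelled pairs and returns a classifier h. This is only an illustration: the word-count scoring rule, the names `train` and `h`, and the four toy messages are assumptions made for the example, not a method described in the article.

```python
from collections import Counter
from typing import Callable, List, Tuple

Email = str   # x: here simply the raw text of a message (an assumed representation)
Label = str   # y: "Spam" or "Non-Spam"


def train(data: List[Tuple[Email, Label]]) -> Callable[[Email], Label]:
    """Build a classifier h from labelled pairs (x_i, y_i)."""
    counts = {"Spam": Counter(), "Non-Spam": Counter()}
    for text, label in data:
        counts[label].update(text.lower().split())

    def h(text: Email) -> Label:
        # Score a message by how often its words appeared under each label.
        words = text.lower().split()
        spam_score = sum(counts["Spam"][w] for w in words)
        ham_score = sum(counts["Non-Spam"][w] for w in words)
        return "Spam" if spam_score > ham_score else "Non-Spam"

    return h


# Hypothetical training set, invented for illustration.
training_data = [
    ("win a free prize now", "Spam"),
    ("free money click now", "Spam"),
    ("meeting agenda for monday", "Non-Spam"),
    ("lunch on monday", "Non-Spam"),
]

h = train(training_data)
print(h("claim your free prize"))
print(h("agenda for lunch"))
```

The point of the sketch is the shape of the problem: training consumes the pairs once, and the returned h maps any new object to a label in the same label set.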


Statistical classification algorithms are typically used in pattern recognition systems.


Note: in community ecology, the term "classification" is synonymous with what is commonly known (in machine learning) as clustering. See the article on clustering for more information about purely unsupervised techniques.

Statistical classification techniques

While there are many methods for classification, they all attempt to solve one of the following two mathematical problems:

  • The first is to find a map from a feature space (typically a multi-dimensional vector space) to a set of labels. This is equivalent to partitioning the feature space into regions, then assigning a label to each region. Such algorithms (e.g., the nearest neighbour algorithm) typically do not yield confidence or class probabilities unless post-processing is applied. Another set of algorithms for this problem first applies unsupervised clustering to the feature space, then attempts to label each of the clusters or regions.
  • The second is to treat classification as an estimation problem, where the goal is to estimate a function of the form

$$P(\mathrm{class} \mid \vec{x}) = f\left(\vec{x}; \vec{\theta}\right)$$

where the feature vector input is $\vec{x}$, and the function $f$ is typically parameterized by some parameters $\vec{\theta}$. In the Bayesian approach to this problem, instead of choosing a single parameter vector $\vec{\theta}$, the result is integrated over all possible values of $\vec{\theta}$, each weighted by how likely it is given the training data $D$:

$$P(\mathrm{class} \mid \vec{x}) = \int f\left(\vec{x}; \vec{\theta}\right) P(\vec{\theta} \mid D) \, d\vec{\theta}$$
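The integral over the parameters can be approximated numerically when the parameter is low-dimensional. The sketch below assumes a one-parameter logistic model for f, a uniform prior over a finite grid of theta values, and a small invented data set D; none of these choices come from the article. It compares the averaged prediction with the prediction from the single most likely theta on the grid.

```python
import math


def f(x: float, theta: float) -> float:
    """Assumed model: P(class = 1 | x) is a logistic function of theta * x."""
    return 1.0 / (1.0 + math.exp(-theta * x))


def likelihood(theta: float, data) -> float:
    """Probability of the observed labels under a given theta."""
    p = 1.0
    for x, y in data:
        p *= f(x, theta) if y == 1 else 1.0 - f(x, theta)
    return p


def bayes_predict(x: float, data, grid) -> float:
    # With a uniform prior on an evenly spaced grid, P(theta | D) is
    # proportional to the likelihood, so normalising the weights suffices.
    weights = [likelihood(t, data) for t in grid]
    total = sum(weights)
    # Riemann-sum approximation of the integral of f(x; theta) P(theta | D).
    return sum(f(x, t) * w for t, w in zip(grid, weights)) / total


# Hypothetical training data D: (feature, label) pairs with labels in {0, 1}.
D = [(-2.0, 0), (-1.0, 0), (0.5, 1), (1.5, 1)]
grid = [-5.0 + 0.05 * k for k in range(201)]   # theta values from -5 to 5

p_bayes = bayes_predict(1.0, D, grid)
theta_best = max(grid, key=lambda t: likelihood(t, D))
p_point = f(1.0, theta_best)
print(p_bayes, p_point)
```

On this toy data the averaged prediction is less extreme than the single-parameter one, because plausible but smaller values of theta also contribute to the sum.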

Examples of classification algorithms include:

  • Linear discriminant analysis (LDA) and the related Fisher's linear discriminant
  • Logistic regression
  • Naive Bayes classifier
  • Perceptron
  • Quadratic classifiers
  • Nearest neighbour algorithm
  • Boosting
  • Decision trees
  • Neural networks
  • Bayesian networks
  • Support vector machines (SVMs)
  • Hidden Markov models (HMMs)

An intriguing problem in pattern recognition yet to be solved is the relationship between the problem to be solved (data to be classified) and the performance of various pattern recognition algorithms (classifiers). Van der Walt and Barnard (see References) investigated very specific artificial data sets to determine conditions under which certain classifiers perform better and worse than others.
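To make one of the algorithms named in this section concrete, here is a minimal sketch of the classic perceptron learning rule for two classes. The labels in {-1, +1}, the fixed number of epochs, the function names and the four training points are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def train_perceptron(data: List[Tuple[List[float], int]], epochs: int = 20):
    """Learn weights w and bias b for labels in {-1, +1}."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:
                # Misclassified (or on the boundary): move the boundary
                # towards classifying this point correctly.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b


def predict(w: List[float], b: float, x: List[float]) -> int:
    """Assign +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Hypothetical, linearly separable toy data.
data = [
    ([2.0, 1.0], 1),
    ([3.0, 2.0], 1),
    ([-1.0, -1.5], -1),
    ([-2.0, -0.5], -1),
]

w, b = train_perceptron(data)
print(w, b)
print([predict(w, b, x) for x, _ in data])
```

The rule only converges when the classes are linearly separable, which is why the toy data was chosen that way.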


Classifier performance depends greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given problems (a phenomenon that may be explained by the no-free-lunch theorem). Various empirical tests have been performed to compare classifier performance and to find the characteristics of data that determine classifier performance. Determining a suitable classifier for a given problem is, however, still more an art than a science.


The most widely used classifiers are the neural network (multi-layer perceptron), support vector machine, k-nearest neighbours, Gaussian mixture model, Gaussian, naive Bayes, decision tree and radial basis function (RBF) classifiers.
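As an illustration of one classifier from this list, k-nearest neighbours can be sketched in a few lines: a new point receives the majority label among its k closest training points. The Euclidean distance, the default k = 3, the function name and the sample points below are assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(points, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    # Sort the labelled points by Euclidean distance to x and keep the k closest.
    neighbours = sorted(points, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labelled points forming two well-separated groups.
points = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.3), "A"),
    ((4.0, 4.2), "B"), ((4.1, 3.9), "B"), ((3.8, 4.0), "B"),
]

print(knn_predict(points, (1.1, 1.0)))
print(knn_predict(points, (4.0, 4.0)))
```

This is a direct instance of the first problem type: the training points implicitly partition the feature space into regions, and the vote returns a label without any class probability.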


Evaluation

Precision and recall are popular metrics used to evaluate the quality of a classification system.
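For a single class of interest, precision is the fraction of items predicted as that class that truly belong to it, and recall is the fraction of items truly in that class that were predicted as such. The sketch below computes both from true and predicted labels; the function name, the choice to return 0.0 when a ratio is undefined, and the example labels are assumptions.

```python
def precision_recall(y_true, y_pred, positive="Spam"):
    """Return (precision, recall) for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    # Assumed convention: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical ground truth and classifier output for six messages.
y_true = ["Spam", "Spam", "Spam", "Non-Spam", "Non-Spam", "Non-Spam"]
y_pred = ["Spam", "Spam", "Non-Spam", "Spam", "Non-Spam", "Non-Spam"]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Here two of the three messages flagged as spam really are spam, and two of the three real spam messages were caught, so both measures come out at two thirds.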


Application domains

  • Computer vision
  • Medical imaging
  • Optical character recognition
  • Speech recognition
  • Handwriting recognition
  • Biometric identification
  • Natural language processing
  • Document classification
  • Internet search engines
  • Credit scoring

References

  • C.M. van der Walt and E. Barnard, "Data characteristics that determine classifier performance", in Proceedings of the Sixteenth Annual Symposium of the Pattern Recognition Association of South Africa, pp. 160-165, 2006.

The Wikipedia article included on this page is licensed under the GFDL.