Predictive modelling is the process by which a model is created or chosen to best predict the probability of an outcome. In many cases the model is chosen on the basis of detection theory to estimate the probability of a signal given a set amount of input data; for example, given an email, determining how likely it is to be spam. An abstract model (or conceptual model) is a theoretical construct that represents physical, biological or social processes, with a set of variables and a set of logical and quantitative relationships between them.
Detection theory (or signal detection theory) is a means to quantify the ability to discern between signal and noise.
Models can use one or more classifiers to determine the probability that a set of data belongs to a particular set, say spam or 'ham'. In mathematics, a classifier is a mapping from a (discrete or continuous) feature space X to a discrete set of labels Y. Classifiers may be either fixed classifiers or learning classifiers, and learning classifiers may in turn be divided into supervised and unsupervised learning classifiers.
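The idea of a classifier as a mapping from a feature space X to a label set Y can be sketched in a few lines of Python. This is only an illustration: the feature (number of links in a message), the threshold and the function names are assumptions made for the example, not part of any particular library.

```python
from collections import Counter
from typing import Callable, Sequence

Features = Sequence[float]                 # a point in the feature space X
Label = str                                # an element of the label set Y
Classifier = Callable[[Features], Label]   # a classifier maps X to Y


def fixed_classifier(x: Features) -> Label:
    """A fixed classifier: a hand-written rule that never changes."""
    # Hypothetical rule: x[0] is the number of links in the message.
    return "spam" if x[0] > 3 else "ham"


def train_majority_classifier(labels: Sequence[Label]) -> Classifier:
    """A very simple supervised learning classifier.

    It ignores the features and always predicts the label that was
    most common in the training data.
    """
    most_common_label = Counter(labels).most_common(1)[0][0]
    return lambda x: most_common_label


majority = train_majority_classifier(["ham", "ham", "spam"])
print(fixed_classifier([5.0]), majority([5.0]))
```

The fixed classifier is defined once and never updated, whereas the majority classifier is learned from labelled examples, which is what makes it a supervised learning classifier.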
Models and classifiers

Many models exist to try to predict on the basis of input data.
Classification trees

Naive Bayes
See main article: Naive Bayes classifier
A naïve Bayes classifier (also known as Idiot's Bayes) is a simple probabilistic classifier.
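As a rough illustration of how such a classifier can estimate the probability that a message is spam, the following sketch implements a word-count (multinomial) naive Bayes model with add-one smoothing, using only the Python standard library. The training messages, the function names and the choice of smoothing are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """Count labels and words. documents: list of (words, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in documents:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def predict_proba(model, words):
    """Return a dict mapping each label to its estimated probability."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)          # log prior
        for word in words:
            if word not in vocabulary:
                continue                               # skip unseen words
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        log_scores[label] = score
    # Convert log scores back to probabilities that sum to one.
    top = max(log_scores.values())
    unnormalised = {l: math.exp(s - top) for l, s in log_scores.items()}
    total = sum(unnormalised.values())
    return {l: v / total for l, v in unnormalised.items()}


# Hypothetical toy training data.
training = [
    (["cheap", "pills", "buy"], "spam"),
    (["buy", "now", "cheap"], "spam"),
    (["meeting", "tomorrow", "lunch"], "ham"),
    (["lunch", "now"], "ham"),
]
model = train_naive_bayes(training)
print(predict_proba(model, ["cheap", "buy"]))
```

The "naive" part is the assumption that words occur independently of one another given the label, which is why the per-word log probabilities can simply be added.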
k-nearest neighbor algorithm
See main article: k-nearest neighbor algorithm
In pattern recognition, the k-nearest neighbor algorithm is a method for classifying phenomena based upon observable features, similar to the nearest neighbour classification method.
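A minimal sketch of the idea, assuming Euclidean distance and a simple majority vote among the k closest training points, might look like this. The two features (number of links and number of exclamation marks in a message) and the data are invented for illustration.

```python
import math
from collections import Counter


def knn_classify(training, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours.

    training: list of (feature_vector, label) pairs
    query:    feature vector of the same length
    """
    # Sort the training points by Euclidean distance to the query.
    nearest = sorted(training, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features: (number of links, number of exclamation marks).
training = [
    ((8, 5), "spam"), ((7, 6), "spam"), ((9, 4), "spam"),
    ((0, 1), "ham"), ((1, 0), "ham"), ((1, 1), "ham"),
]
print(knn_classify(training, (6, 5)))
print(knn_classify(training, (0, 0)))
```

An odd value of k is a common choice for two-class problems because it avoids tied votes.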
Majority classifier

Support vector machines

Logistic regression
Logistic regression is a technique in which unknown values of a discrete variable are predicted based on known values of one or more continuous and/or discrete variables. Logistic regression differs from ordinary least squares (OLS) regression in that the dependent variable is binary in nature. This procedure has many applications. In biostatistics, the researcher may be interested in modelling the probability of a patient being diagnosed with a certain type of cancer based on knowing, say, the incidence of that cancer in his or her family. In business, the marketer may be interested in modelling the probability of an individual purchasing a product based on the price of that product. Both of these are examples of a simple, binary logistic model. The model is "simple" in that each has only one independent, or predictor, variable, and it is "binary" in that the dependent variable can take on only one of two values: cancer or no cancer, and purchase or no purchase. Models are not restricted to a single independent variable or to a binary dependent variable.
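The simple, binary model from the marketing example can be sketched as follows: the probability of a purchase is the logistic function of a linear expression in the price, and the two coefficients are estimated from observed data. The data, the learning rate and the use of plain gradient ascent on the log-likelihood are assumptions made for this illustration; statistical packages typically use more refined fitting procedures.

```python
import math


def logistic(z):
    """The logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def purchase_probability(price, intercept, slope):
    """P(purchase | price) under a simple, binary logistic model."""
    return logistic(intercept + slope * price)


def fit_logistic(prices, purchased, learning_rate=0.05, iterations=5000):
    """Estimate intercept and slope by gradient ascent on the log-likelihood."""
    intercept, slope = 0.0, 0.0
    n = len(prices)
    for _ in range(iterations):
        grad_intercept = 0.0
        grad_slope = 0.0
        for x, y in zip(prices, purchased):
            error = y - purchase_probability(x, intercept, slope)
            grad_intercept += error
            grad_slope += error * x
        intercept += learning_rate * grad_intercept / n
        slope += learning_rate * grad_slope / n
    return intercept, slope


# Hypothetical observations: price of the product and whether it was bought.
prices = [1, 2, 3, 4, 5, 6, 7, 8]
purchased = [1, 1, 1, 0, 1, 0, 0, 0]
intercept, slope = fit_logistic(prices, purchased)
print(purchase_probability(2, intercept, slope))
print(purchase_probability(7, intercept, slope))
```

Because purchases in this toy data occur mostly at low prices, the fitted slope is negative, so the predicted probability of a purchase falls as the price rises.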
See also