Statistics is the science and practice of developing knowledge through the use of empirical data expressed in quantitative form. It is based on statistical theory, which is a branch of applied mathematics. Within statistical theory, randomness and uncertainty are modelled by probability theory. Because one aim of statistics is to produce the "best" information from available data, some authors consider statistics a branch of decision theory. Statistical practice includes the planning, summarizing, and interpreting of observations, allowing for variability and uncertainty.
Origin
The word statistics comes from the modern Latin phrase ragione de stato (reasons of state affairs), from which came the Italian word statista, meaning "statesman" or "politician" (cf. status), and the German Statistik, first introduced by Gottfried Achenwall (1749) and originally designating the analysis of data about the state. It acquired the meaning of the collection and classification of data generally in the early nineteenth century. It was introduced into English by Sir John Sinclair. Thus, the original principal purpose of statistics was to supply data to governmental and (often centralized) administrative bodies. The collection of data about states and localities continues, largely through national and international statistical services; in particular, censuses provide regular information about the population. Today, however, the use of statistics has broadened far beyond the service of a state or government, to include such areas as business, natural and social sciences, and medicine, among others.
Statistical methods
We describe our knowledge (and ignorance) mathematically and attempt to learn more from whatever we can observe. This requires us to
- plan our observations to control their variability (experiment design),
- summarize a collection of observations to feature their commonality by suppressing details (descriptive statistics), and
- reach consensus about what the observations tell us about the world we observe (statistical inference).
In some forms of descriptive statistics, notably data mining, the second and third of these steps become so prominent that the first step (planning) appears to become less important. In these disciplines, data are often collected outside the control of the person doing the analysis, and the result of the analysis may be more an operational model than a consensus report about the world.
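The summarizing step can be illustrated with a short calculation. The following is a minimal sketch using Python's standard `statistics` module; the function name `summarize` and the sample values are invented purely for illustration and do not come from any real study.

```python
import statistics


def summarize(observations):
    """Reduce a list of numeric observations to a few summary statistics."""
    return {
        "n": len(observations),
        "mean": statistics.mean(observations),
        "median": statistics.median(observations),
        # Sample standard deviation (divides by n - 1).
        "stdev": statistics.stdev(observations),
    }


# Hypothetical measurements, made up for this example.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = summarize(sample)
print(summary)
```

The eight individual values are suppressed in favour of four numbers that describe what the observations have in common (their centre) and how much they vary (their spread).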
Probability
The probability of an event is often defined as a number between zero and one. In reality, however, virtually nothing has a probability of exactly 1 or 0. You could say that the sun will certainly rise in the morning, but what if an extremely unlikely event destroys the sun? What if there is a nuclear war and the sky is covered in ash and smoke?
We often round the probability of such things up or down because they are so likely or unlikely to occur that it is easier to treat them as having a probability of one or zero. However, this can lead to misunderstandings and dangerous behaviour, because people are often unable to distinguish between, e.g., a probability of 10^-4 and a probability of 10^-9, despite the very practical difference between them. If you expect to cross the road about 10^5 or 10^6 times in your life, then reducing your risk of being run over per road crossing to 10^-9 will make you safe for your whole life, while a risk per road crossing of 10^-4 will make it very likely that you will have an accident, despite the intuitive feeling that 0.01% is a very small risk.
Use of prior probabilities of 0 (or 1) causes problems in Bayesian statistics, since the posterior probability is then forced to be 0 (or 1) as well. In other words, the data are not taken into account at all. As Lindley puts it, if a coherent Bayesian attaches a prior probability of zero to the hypothesis that the Moon is made of green cheese, then even whole armies of astronauts coming back bearing green cheese cannot convince him. Lindley advocates never using prior probabilities of 0 or 1. He calls this Cromwell's rule, from a letter Oliver Cromwell wrote to the General Assembly of the Church of Scotland on August 3, 1650, in which he said "I beseech you, in the bowels of Christ, think it possible you may be mistaken."
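Both numerical points above, the road-crossing arithmetic and the effect of a zero prior, can be checked with a short calculation. The sketch below is illustrative only: it assumes that crossings are independent events with a constant per-crossing risk, and the likelihood values passed to the Bayes update are made-up numbers, not figures from Lindley. The function names are hypothetical.

```python
def lifetime_risk(per_event_risk, n_events):
    """Probability of at least one accident in n independent events."""
    return 1.0 - (1.0 - per_event_risk) ** n_events


def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a single hypothesis H given some observed data."""
    evidence = prior * likelihood_if_true + (1.0 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence


crossings = 10**5  # lower end of the lifetime estimate in the text

low = lifetime_risk(1e-9, crossings)   # roughly 1 in 10,000
high = lifetime_risk(1e-4, crossings)  # very close to 1

# Hypothetical likelihoods: the data are about 1000 times more probable
# if the hypothesis is true than if it is false.
zero_prior = posterior(0.0, 0.999, 0.001)   # stays exactly 0
tiny_prior = posterior(1e-6, 0.999, 0.001)  # rises well above 1e-6

print(low, high, zero_prior, tiny_prior)
```

Under these assumptions, a per-crossing risk of 10^-9 leaves the lifetime risk small, while 10^-4 makes an accident almost certain. Likewise, a prior of exactly zero cannot be moved by any evidence, whereas even a very small non-zero prior responds to strongly favourable data.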
Specialized disciplines
Some sciences use applied statistics so extensively that they have specialized terminology. These disciplines include:
- Biostatistics: the application of statistics to biology and, most commonly, to medicine
- Business statistics: decision making in the face of uncertainty, used in financial analysis, econometrics, auditing, production and operations (including services improvement), and marketing research
- Econometrics: the branch of economics that applies statistical methods to the empirical study of economic theories and relationships
- Engineering statistics: including quality control and process control, which use statistics to manage the conformance of manufacturing processes and their products to specifications
- Statistical physics: the use of statistical methods in solving physical problems
- Demography: the study of human population dynamics
- Statistics in psychology
- Social statistics: the use of statistical measurement systems to study human behavior in a social environment
- Chemometrics: the application of mathematical or statistical methods to chemical data, as in analytical chemistry and chemical engineering
- Statistics in sports, notably baseball
Statistics is a key tool in business and manufacturing as well. It is used to understand the variability of measurement systems, to control processes (as in statistical process control, or SPC, a method for achieving quality control in manufacturing processes), to summarize data, and to make data-driven decisions. In these roles it is a key tool, and perhaps the only reliable tool.
Software
Modern statistics is supported by computers, which perform some of the very large and complex calculations required. Whole branches of statistics have been made possible by computing, for example neural networks.
The computer revolution has implications for the future of statistics, with a new emphasis on 'experimental' statistics. A list of statistical packages in common use:
- R: a programming language and environment for statistical analysis and display, sometimes described as GNU S
- S: a statistical programming language developed by John Chambers of Bell Laboratories
- MATLAB: short for matrix laboratory, both a numerical computing environment and its core programming language
- Octave: a free program for numerical computation, mostly compatible with MATLAB and part of the GNU project
- Microsoft Excel: a spreadsheet program
- OpenOffice
- SAS: an integrated system of software products from the SAS Institute, covering data management, reporting and graphics, and statistical and mathematical analysis
- SPSS: originally the Statistical Package for the Social Sciences, first released in the 1960s and widely used in social science
- ROOT
See also
- Analysis of variance (ANOVA)
- Extreme value theory
- List of statistical associations and societies
- List of national statistical services
- List of important publications in statistics
- List of statistical topics
- List of statisticians
- Machine learning
- Multivariate statistics
- Regression analysis
References
Lindley, D. Making Decisions. Second Edition. John Wiley, 1985. ISBN 0471908088
External links
General sites and organizations
- Statlib: Data, Software and News from the Statistics Community (Carnegie Mellon) (http://lib.stat.cmu.edu/)
- International Statistical Institute (http://www.cbs.nl/isi/)
- The Probability Web (http://www.mathcs.carleton.edu/probweb/probweb.html)