Statistical analysis

Statistics is the science and practice of developing knowledge through the use of empirical data expressed in quantitative form. It is based on statistical theory, which is a branch of applied mathematics. Within statistical theory, randomness and uncertainty are modelled by probability theory. Because one aim of statistics is to produce the "best" information from available data, some authors consider statistics a branch of decision theory. Statistical practice includes the planning, summarizing, and interpreting of observations, allowing for variability and uncertainty.


Origin

The word statistics comes from the modern Latin phrase ragione de stato (reasons of state affairs), from which came the Italian word statista, meaning "statesman" or "politician" (cf. status), and the German Statistik, first introduced by Gottfried Achenwall (1749) and originally designating the analysis of data about the state. It acquired the meaning of the collection and classification of data generally in the early nineteenth century. It was introduced into English by Sir John Sinclair. Thus, the original principal purpose of statistics was to provide data for use by governmental and (often centralized) administrative bodies. The collection of data about states and localities continues, largely through national and international statistical services; in particular, censuses provide regular information about the population. Today, however, the use of statistics has broadened far beyond the service of a state or government to include such areas as business, the natural and social sciences, and medicine, among others.


Statistical methods

We describe our knowledge (and ignorance) mathematically and attempt to learn more from whatever we can observe. This requires us to

  1. plan our observations to control their variability (experiment design),
  2. summarize a collection of observations to feature their commonality by suppressing details (descriptive statistics), and
  3. reach consensus about what the observations tell us about the world we observe (statistical inference).

In some forms of descriptive statistics, notably data mining, the second and third of these steps become so prominent that the first step (planning) appears to become less important. In these disciplines, data often are collected outside the control of the person doing the analysis, and the result of the analysis may be more an operational model than a consensus report about the world.
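As a rough illustration of the second and third steps listed above, the following Python sketch summarizes a sample and then draws a simple inference about the population mean. The function names and the sample values are hypothetical, and the interval uses a normal approximation; for a sample this small a t-based interval would usually be preferred, so treat this as a sketch rather than a recommended procedure.

```python
import math
import statistics


def summarize(sample):
    """Descriptive step: reduce the observations to size, mean and spread."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return n, mean, sd


def mean_confidence_interval(sample, level=0.95):
    """Inference step: approximate interval for the population mean.

    Assumes the sample mean is roughly normally distributed.
    """
    n, mean, sd = summarize(sample)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical measurements, invented for illustration only.
measurements = [4.9, 5.1, 5.0, 5.2, 4.8]
print(summarize(measurements))
print(mean_confidence_interval(measurements))
```

The first step, planning the observations, happens before any data exist and is therefore not shown here.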


Probability

The probability of an event is usually defined as a number between zero and one. In reality, however, there is virtually nothing that has a probability of exactly 1 or 0. You could say that the sun will certainly rise in the morning, but what if an extremely unlikely event destroys the sun? What if there is a nuclear war and the sky is covered in ash and smoke?


We often round the probability of such events up or down because they are so likely or unlikely to occur that it is easier to treat them as having a probability of one or zero.


However, this can often lead to misunderstandings and dangerous behaviour, because people are unable to distinguish between, e.g., a probability of $10^{-4}$ and a probability of $10^{-9}$, despite the very practical difference between them. If you expect to cross the road about $10^{5}$ or $10^{6}$ times in your life, then reducing your risk of being run over per road crossing to $10^{-9}$ will make you safe for your whole life, while a risk per road crossing of $10^{-4}$ will make it very likely that you will have an accident, despite the intuitive feeling that 0.01% is a very small risk.
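The road-crossing argument can be checked numerically. The sketch below assumes that crossings are independent and carry the same risk each time, which the text does not state explicitly; under that assumption the chance of at least one accident in n crossings with per-crossing risk p is 1 - (1 - p)^n. The function name is hypothetical.

```python
import math


def lifetime_risk(risk_per_crossing, crossings):
    """Probability of at least one accident over all crossings.

    Computes 1 - (1 - p)^n, assuming independent crossings with equal risk.
    log1p and expm1 keep the result accurate when p is tiny.
    """
    return -math.expm1(crossings * math.log1p(-risk_per_crossing))


for p in (1e-9, 1e-4):
    for n in (10**5, 10**6):
        print(f"p = {p:g}, n = {n}: lifetime risk = {lifetime_risk(p, n):.6f}")
```

With a per-crossing risk of $10^{-9}$ the lifetime risk stays at or below about one in a thousand even for $10^{6}$ crossings, whereas with $10^{-4}$ it exceeds 99.99% after $10^{5}$ crossings.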


Use of prior probabilities of 0 (or 1) causes problems in Bayesian statistics, since the posterior probability is then forced to be 0 (or 1) as well. In other words, the data is not taken into account at all! As Lindley puts it, if a coherent Bayesian attaches a prior probability of zero to the hypothesis that the Moon is made of green cheese, then even whole armies of astronauts coming back bearing green cheese cannot convince him. Lindley advocates never using prior probabilities of 0 or 1. He calls it Cromwell's rule, from a letter Oliver Cromwell wrote to the synod of the Church of Scotland on August 5th, 1650 in which he said "I beseech you, in the bowels of Christ, consider it possible that you are mistaken."
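A short Python sketch makes the effect of a zero prior concrete. It applies Bayes' theorem to a single hypothesis and its negation; the function names and the likelihood values (0.99 and 0.01) are hypothetical, chosen only so that each observation strongly favours the hypothesis.

```python
def bayes_update(prior, p_data_if_true, p_data_if_false):
    """Return P(H | data) given P(H) and the likelihood of the data under H and not-H."""
    evidence = prior * p_data_if_true + (1.0 - prior) * p_data_if_false
    return prior * p_data_if_true / evidence


def repeated_updates(prior, p_data_if_true, p_data_if_false, n_observations):
    """Apply the same update once per observation, feeding each posterior back in as the prior."""
    belief = prior
    for _ in range(n_observations):
        belief = bayes_update(belief, p_data_if_true, p_data_if_false)
    return belief


# Illustrative likelihoods: each observation is 99 times more likely if H is true.
print(repeated_updates(0.0, 0.99, 0.01, 5))    # a prior of exactly zero never moves
print(repeated_updates(1e-6, 0.99, 0.01, 5))   # a tiny but nonzero prior is overturned
```

The numerator of the update is the prior times the likelihood, so a prior of exactly 0 yields a posterior of 0 whatever the data say, while even a prior of one in a million is overwhelmed by a handful of strongly favourable observations.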


Specialized disciplines

Some sciences use applied statistics so extensively that they have specialized terminology. These disciplines include:

  • Biostatistics
  • Business statistics
  • Econometrics
  • Engineering statistics
  • Statistical physics
  • Demography
  • Psychological statistics
  • Social statistics
  • Chemometrics

Statistics is a key tool in business and manufacturing as well. It is used to understand the variability of measurement systems, to control processes (as in statistical process control, or SPC), to summarize data, and to make data-driven decisions. In these roles it is a key tool, and perhaps the only reliable tool.
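To give a flavour of statistical process control, the sketch below derives simple control limits from baseline measurements and flags later measurements that fall outside them. It is a simplification under stated assumptions: the limits are the baseline mean plus or minus three sample standard deviations, whereas practical control charts often estimate the spread differently. The function names and data are hypothetical.

```python
import statistics


def control_limits(baseline, n_sigma=3.0):
    """Lower and upper limits from in-control baseline data (mean +/- n_sigma * sd)."""
    centre = statistics.mean(baseline)
    spread = statistics.stdev(baseline)
    return centre - n_sigma * spread, centre + n_sigma * spread


def out_of_control(measurements, limits):
    """Return the measurements that fall outside the control limits."""
    lower, upper = limits
    return [x for x in measurements if x < lower or x > upper]


# Hypothetical process data, invented for illustration only.
baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
limits = control_limits(baseline)
print(limits)
print(out_of_control([10.1, 10.6, 9.9], limits))
```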


Software

Modern statistics is supported by computers to perform some of the very large and complex calculations required.


Whole branches of statistics have been made possible by computing, for example neural networks.


The computer revolution has implications for the future of statistics, with a new emphasis on 'experimental' statistics.


A list of statistical packages in common use:

  • R
  • S
  • MATLAB
  • Octave
  • Microsoft Excel
  • OpenOffice
  • SAS
  • SPSS


References

Lindley, D. Making Decisions. John Wiley. Second Edition 1985. ISBN 0471908088


External links

General sites and organizations

  • Statlib: Data, Software and News from the Statistics Community (Carnegie Mellon) (http://lib.stat.cmu.edu/)
  • International Statistical Institute (http://www.cbs.nl/isi/)
  • The Probability Web (http://www.mathcs.carleton.edu/probweb/probweb.html)



The Wikipedia article included on this page is licensed under the GFDL.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.