[Figure: A logarithmic scale bar. Picking a random x position on this number line, roughly 1/3 of the time the first digit of the number will be 1 (the widest band of each power of ten).]
Benford's law, also called the first-digit law, states that in lists of numbers from many real-life sources of data, the leading digit is 1 almost one third of the time, and larger digits occur as the leading digit less and less frequently, to the point that 9 is the first digit less than one time in twenty. This is based on the observation that many real-world measurements are distributed logarithmically, so that the logarithms of a set of such measurements are distributed roughly uniformly.
This counter-intuitive result applies to a wide variety of figures, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature). Even more counter-intuitively, the result holds regardless of the base in which the numbers are expressed, although the exact proportions change with the base.
It is named after physicist Frank Benford, who stated it in 1938, although it had been stated earlier by Simon Newcomb in his 1881 paper "Note on the Frequency of Use of the Different Digits in Natural Numbers". The first rigorous formulation and proof appears to be due to Theodore P. Hill in 1988.[1]
Mathematical statement
More precisely, Benford's law states that the leading digit d (d ∈ {1, …, b − 1}) in base b (b ≥ 2) occurs with probability log_b(d + 1) − log_b(d) = log_b((d + 1) / d) = log_b(1 + 1 / d). These probabilities sum to log_b(b) = 1 over all d. This quantity is exactly the width of the interval between d and d + 1 on a logarithmic scale.
In base 10, the leading digits have the following distribution by Benford's law, where d is the leading digit and p the probability:
| d | p |
|---|-------|
| 1 | 30.1% |
| 2 | 17.6% |
| 3 | 12.5% |
| 4 | 9.7% |
| 5 | 7.9% |
| 6 | 6.7% |
| 7 | 5.8% |
| 8 | 5.1% |
| 9 | 4.6% |

One can also formulate a law for the first two digits: the probability that the first two-digit block is equal to n (n = 10, …, 99) is log_10(n + 1) − log_10(n) = log_10(1 + 1 / n), and similarly for three-digit blocks without leading zeros and for longer blocks. (Benford's law for the first k digits in base b is closely related to Benford's law for a single leading digit in base b^k.)
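The figures in the table can be reproduced directly from the logarithmic expression. The following is a minimal sketch; the function name `benford_probability` is chosen here for illustration and does not come from the article. It evaluates log_b(1 + 1/n) for a leading block n, so it covers single digits (1 to 9) and, in base 10, two-digit blocks (10 to 99).

```python
import math


def benford_probability(block, base=10):
    """Probability that the leading digits form the integer `block`.

    Illustrative helper: evaluates log_base(1 + 1/block).
    """
    return math.log(1 + 1 / block, base)


# Single leading digit in base 10 (the table above).
single = {d: benford_probability(d) for d in range(1, 10)}

# First two digits in base 10: blocks 10..99.
two_digit = {n: benford_probability(n) for n in range(10, 100)}

for d, p in single.items():
    print(f"{d}: {p:.1%}")
```

Summing the two-digit probabilities for blocks 10 to 19 gives back the single-digit probability for 1, which is a quick consistency check on both formulas.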
Explanation
The law can be explained by the fact that, if the first digits do have a particular distribution, that distribution must be independent of the measuring units used. For example, converting from feet to yards (multiplication by a constant) must leave the distribution unchanged. In other words, the distribution is scale invariant, and the only distribution that satisfies this is logarithmic.
[Figure: Frequency of first significant digit of physical constants plotted against Benford's law.]
For example, the first (non-zero) digit of the lengths or distances of objects should have the same distribution whether the unit of measurement is feet, yards, or anything else. But there are three feet in a yard, so the probability that the first digit of a length in yards is 1 must be the same as the probability that the first digit of the same length in feet is 3, 4, or 5. Applying this to all possible measurement scales gives a logarithmic distribution, and combining this with the facts that log_10(1) = 0 and log_10(10) = 1 gives Benford's law. That is, if there is a distribution of first digits, it must apply to a set of data regardless of the measuring units used, and the only distribution of first digits that satisfies this is Benford's law.
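The feet-and-yards argument can be checked numerically. The sketch below is illustrative only: the lengths are synthetic (drawn with uniformly distributed logarithms, which is an assumption of this example, not data from the article), and `first_digit` is a helper written for it. It compares the share of lengths whose first digit in yards is 1 with the share whose first digit in feet is 3, 4, or 5.

```python
import math
import random


def first_digit(x):
    """First significant decimal digit of a positive number."""
    return int(f"{x:.15e}"[0])


def benford(d):
    """Benford probability of leading digit d in base 10."""
    return math.log10(1 + 1 / d)


# Analytic check: P(1) equals P(3) + P(4) + P(5).
p_one = benford(1)
p_three_to_five = benford(3) + benford(4) + benford(5)

# Empirical check on synthetic lengths (assumed log-uniform).
random.seed(0)
yards = [10 ** random.uniform(0, 5) for _ in range(100_000)]
feet = [3 * y for y in yards]

share_yards_1 = sum(first_digit(y) == 1 for y in yards) / len(yards)
share_feet_345 = sum(first_digit(f) in (3, 4, 5) for f in feet) / len(feet)

print(f"analytic:  {p_one:.4f} vs {p_three_to_five:.4f}")
print(f"empirical: {share_yards_1:.4f} vs {share_feet_345:.4f}")
```

Both pairs agree because a length between 1 and 2 yards (or 10 and 20, and so on) is exactly a length between 3 and 6 feet (or 30 and 60).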
More precisely, let X be a random variable whose probability of being equal to any positive integer x is proportional to x^(−s), where s > 1. That is, P(X = x) ∝ x^(−s).
The constant of proportionality must then be 1/ζ(s), where ζ is the Riemann zeta function (see zeta distribution). The probability that the first digit of X is n approaches log_10(n + 1) − log_10(n) as s approaches 1.
The precise form of Benford's law can be explained if one assumes that the logarithms of the numbers are uniformly distributed; this means that a number is, for instance, just as likely to be between 100 and 1000 (logarithm between 2 and 3) as it is between 10,000 and 100,000 (logarithm between 4 and 5). For many sets of numbers, especially ones that grow exponentially such as incomes and stock prices, this is a reasonable assumption.
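The link between exponential growth and the first-digit law can be illustrated with a short simulation. This is a sketch under assumed parameters: a hypothetical quantity starts at 100 and grows by 5% per step for 2000 steps (neither figure comes from the article). Because the logarithm of such a quantity increases by a constant amount each step, its values spread almost evenly across each power of ten on a log scale.

```python
import math
from collections import Counter

# Hypothetical exponentially growing quantity (assumed parameters).
START = 100.0
GROWTH = 1.05
STEPS = 2000

values = [START * GROWTH ** t for t in range(STEPS)]

# First significant digit taken from scientific notation, e.g. "1.05...e+02".
counts = Counter(int(f"{v:.15e}"[0]) for v in values)

observed = {d: counts[d] / STEPS for d in range(1, 10)}
expected = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d in range(1, 10):
    print(f"{d}: observed {observed[d]:.3f}, Benford {expected[d]:.3f}")
```

The observed shares track the Benford values closely but not exactly, since the series covers a finite, non-integer number of decades.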
Note that for numbers drawn from many distributions, for example IQ scores, human heights or other variables following normal distributions, the law is not valid. However, if one "mixes" numbers from those distributions, for example by taking numbers from newspaper articles, Benford's law reappears. This can be proven mathematically: if one repeatedly "randomly" chooses a probability distribution and then randomly chooses a number according to that distribution, the resulting list of numbers will obey Benford's law (Hill, 1998).
Applications and limitations
In 1972, Hal Varian suggested that the law could be used to detect possible fraud in lists of socio-economic data submitted in support of public planning decisions. Based on the plausible assumption that people who make up figures tend to distribute their digits fairly uniformly, a simple comparison of the first-digit frequency distribution of the data with the distribution expected under Benford's law ought to show up any anomalous results. Following this idea, John Nye and Charles Moul studied international macroeconomic statistics.[2] They found that while most of them (e.g. World Bank international GDP data) conform to the law, GDP figures of developing countries do not, suggesting possible manipulation of the original data.
In the same vein, Benford's law can be (and is) used to analyse insurance, accounting or expenses data and identify possible fraud as well as pricing strategies (el Sehity, Hoelzl & Kirchler, 2005) . Other uses, for example to analyse the results of clinical trials and election results, have also been proposed.
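The comparison of observed and expected first-digit frequencies described above can be sketched with a Pearson chi-square statistic. This is one common way to make such a comparison, not necessarily the method used in the studies cited; the function name and both data sets are invented for illustration. Powers of two stand in for "natural-looking" figures, and the integers 100 to 999 stand in for made-up figures with uniformly distributed first digits.

```python
import math
from collections import Counter


def benford_chi_square(values):
    """Pearson chi-square statistic of first digits against Benford's law.

    Expects positive integers; illustrative helper, not a library function.
    """
    counts = Counter(int(str(v)[0]) for v in values)
    total = len(values)
    stat = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        stat += (counts[d] - expected) ** 2 / expected
    return stat


# 5% critical value of the chi-square distribution with 8 degrees of freedom.
CRITICAL_5_PERCENT = 15.507

natural_looking = [2 ** k for k in range(1, 1001)]  # stand-in data
made_up = list(range(100, 1000))                    # uniform first digits

for name, data in [("powers of two", natural_looking), ("uniform digits", made_up)]:
    stat = benford_chi_square(data)
    verdict = "consistent with" if stat < CRITICAL_5_PERCENT else "deviates from"
    print(f"{name}: chi-square = {stat:.2f}, {verdict} Benford's law")
```

A large statistic only flags a data set for closer inspection; it does not by itself establish fraud, and the significance threshold assumes independent observations.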
Limitations
Care must be taken with these applications, however. A set of real-life data may not obey the law, depending on the extent to which the distribution of the numbers it contains is skewed by the category of data. For instance, one might expect a list of numbers representing 'populations of UK villages beginning with A' or 'small insurance claims' to obey Benford's law. But if it turns out that the definition of a 'village' is 'settlement with population between 300 and 999', or that the definition of a 'small insurance claim' is 'claim between $50 and $100', then Benford's law would not apply because certain numbers have been excluded by the definition.
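The effect of such a definition can be seen in a small simulation. The sketch below uses entirely synthetic settlement populations (assumed to have uniformly distributed logarithms between 10 and 100,000) and then applies the 'population between 300 and 999' cut-off from the example above.

```python
import random
from collections import Counter


def digit_shares(values):
    """Share of each leading digit 1..9 among positive integers."""
    counts = Counter(int(str(v)[0]) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}


# Synthetic populations, assumed log-uniform between 10 and 100,000.
random.seed(2)
settlements = [int(10 ** random.uniform(1, 5)) for _ in range(50_000)]

# Apply the definitional cut-off: only populations from 300 to 999.
villages = [p for p in settlements if 300 <= p <= 999]

all_shares = digit_shares(settlements)
village_shares = digit_shares(villages)

for d in range(1, 10):
    print(f"{d}: all {all_shares[d]:.3f}, villages only {village_shares[d]:.3f}")
```

Before filtering, the leading digit 1 appears in roughly 30% of the values; after filtering, the digits 1 and 2 cannot occur at all, so the filtered list cannot follow Benford's law.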
History
The discovery of this fact goes back to 1881, when the American astronomer Simon Newcomb noticed that in logarithm books (used at that time to perform calculations), the earlier pages (which contained numbers that started with 1) were much more worn than the other pages. It has been argued that any book that is used from the beginning would show more wear and tear on the earlier pages, but also that Newcomb would have been referring to dirt on the pages themselves (rather than the edges) where people ran their fingers down the lists of digits to find the closest number to the one they required.
However, logarithm books did contain more than one list, with both logarithms and antilogarithms present, and sometimes many other tables as well, including exponentials, roots, sines, cosines, tangents, secants, cosecants etc. Thus, this story may be apocryphal. Nevertheless, Newcomb's published result is the first known instance of this observation, and it also includes a distribution for the second digit. Newcomb proposed a law that the probability of a digit N being the first digit of a number is equal to log_10(N + 1) − log_10(N).
The phenomenon was rediscovered in 1938 by the physicist Frank Benford, who checked it on a wide variety of data sets and was credited for it. In 1996, Ted Hill proved the result about mixed distributions mentioned above.
Popular culture
Benford's law was used as a plot device on CBS's TV series NUMB3RS in the episode "The Running Man".