|
The mean difference is a statistical measure of dispersion and is equal to the average absolute difference of two independent values drawn from a probability distribution. A related statistic is the relative mean difference, which is the mean difference divided by the arithmetic mean. An important relationship is that the relative mean difference is equal to twice the Gini coefficient, which is defined in terms of the Lorenz curve.
The mean difference is also known as the absolute mean difference and the Gini mean difference. The mean difference is sometimes denoted by Δ or as MD. The mean deviation is a different measure of dispersion.
Calculation
For a population of size n, with a sequence of values yi, i = 1 to n, where the two values are drawn independently so that each yi carries probability 1/n:
- $\mathrm{MD} = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} |y_i - y_j|$
For a discrete probability function f(y), where yi, i = 1 to n, are the values with nonzero probabilities:
- $\mathrm{MD} = \sum_{i=1}^{n} \sum_{j=1}^{n} f(y_i)\, f(y_j)\, |y_i - y_j|$
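The population and discrete cases above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and the population version assumes the two values are drawn independently with replacement, so each of the n values has probability 1/n. Some authors divide by n(n-1) instead, which gives a slightly larger value.

```python
def mean_difference_discrete(values, probs):
    """Sum of f(y_i) * f(y_j) * |y_i - y_j| over all ordered pairs (i, j)."""
    return sum(
        p_i * p_j * abs(y_i - y_j)
        for y_i, p_i in zip(values, probs)
        for y_j, p_j in zip(values, probs)
    )


def mean_difference_population(ys):
    """Assumption: each of the n population values has probability 1/n."""
    n = len(ys)
    return mean_difference_discrete(ys, [1.0 / n] * n)


def mean_difference_sorted(ys):
    """Same quantity in O(n log n): after sorting, the k-th value (0-based)
    appears with a plus sign k times and a minus sign (n - 1 - k) times."""
    ys = sorted(ys)
    n = len(ys)
    half_sum = sum((2 * k - n + 1) * y for k, y in enumerate(ys))
    return 2.0 * half_sum / (n * n)


print(mean_difference_population([1, 2, 3, 4]))              # 1.25
print(mean_difference_sorted([1, 2, 3, 4]))                  # 1.25
print(mean_difference_discrete([1, 2], [0.75, 0.25]))        # 0.375
```

The sorted variant avoids the double loop, which matters once n reaches the tens of thousands.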
For a probability density function f(x):
- $\mathrm{MD} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x)\, f(y)\, |x - y| \,dx\,dy$
For a cumulative distribution function F(x) with inverse x(F):
- $\mathrm{MD} = \int_{0}^{1} \int_{0}^{1} |x(F_1) - x(F_2)| \,dF_1\,dF_2$
The inverse x(F) may not exist because the cumulative distribution function has jump discontinuities or intervals of constant values. However, the previous formula can still apply by generalizing the definition of x(F):
- x(F1) = inf {y : F(y) ≥ F1}
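As a rough sketch of the generalized inverse for a discrete distribution, the code below builds x(F) = inf {y : F(y) ≥ F} from a list of values and probabilities, then approximates the double integral over F1 and F2 on a midpoint grid. The function names and the grid size m are assumptions made for illustration only.

```python
from bisect import bisect_left
from itertools import accumulate


def generalized_inverse(values, probs):
    """Return a function x(level) = inf {y : F(y) >= level}."""
    pairs = sorted(zip(values, probs))
    ys = [y for y, _ in pairs]
    cdf = list(accumulate(p for _, p in pairs))

    def x(level):
        # First index whose cumulative probability reaches `level`;
        # the min() guards against rounding that leaves cdf[-1] below 1.
        idx = bisect_left(cdf, level)
        return ys[min(idx, len(ys) - 1)]

    return x


def mean_difference_from_quantiles(x, m=200):
    """Midpoint-rule approximation of the double integral over [0, 1]^2."""
    levels = [(k + 0.5) / m for k in range(m)]
    qs = [x(level) for level in levels]
    return sum(abs(a - b) for a in qs for b in qs) / (m * m)


x = generalized_inverse([1, 2], [0.5, 0.5])
print(x(0.5), x(0.75))                      # 1 2
print(mean_difference_from_quantiles(x))    # 0.5
```

For this two-point distribution the grid splits evenly across the jump in F, so the approximation agrees with the direct value 2 * 0.5 * 0.5 * |2 - 1| = 0.5.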
Relative mean difference
When the probability distribution has a finite and nonzero arithmetic mean, the relative mean difference, sometimes denoted by ∇ or RMD, is defined by:
- $\mathrm{RMD} = \frac{\mathrm{MD}}{\text{arithmetic mean}}$
The relative mean difference quantifies the mean difference in comparison to the size of the mean and is a dimensionless quantity. The relative mean difference is equal to twice the Gini coefficient, which is defined in terms of the Lorenz curve. This relationship gives complementary perspectives on both the relative mean difference and the Gini coefficient, including alternative ways of calculating their values.
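The relationship with the Gini coefficient can be checked numerically. The sketch below computes the relative mean difference from pairwise differences and, separately, the Gini coefficient from the area under a piecewise-linear Lorenz curve. It assumes a finite population of non-negative values in which each value has probability 1/n; the function names are hypothetical.

```python
def relative_mean_difference(ys):
    """Mean difference (each value weighted 1/n) divided by the mean."""
    n = len(ys)
    md = sum(abs(a - b) for a in ys for b in ys) / (n * n)
    return md / (sum(ys) / n)


def gini_from_lorenz(ys):
    """Gini coefficient as 1 - 2 * (area under the Lorenz curve)."""
    ys = sorted(ys)
    n, total = len(ys), sum(ys)
    lorenz = [0.0]          # cumulative share held by the bottom k values
    running = 0.0
    for y in ys:
        running += y
        lorenz.append(running / total)
    # Trapezoid rule over n equal-width steps of the population share.
    area = sum((lorenz[k - 1] + lorenz[k]) / (2 * n) for k in range(1, n + 1))
    return 1.0 - 2.0 * area


data = [1, 2, 3, 4]
print(relative_mean_difference(data))   # 0.5
print(gini_from_lorenz(data))           # approximately 0.25
```

The two routes use different intermediate quantities, which is what makes the factor of two a useful cross-check on either calculation.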
Properties
The mean difference is invariant to translations and negation, and varies proportionally to positive scaling. That is to say, if X is a random variable and c is a constant:
- MD(X+c) = MD(X),
- MD(-X) = MD(X), and
- MD(c X) = |c| MD(X).
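These three properties are easy to confirm on a small data set. The following is a quick numerical check rather than a proof; the data values and the constant c are arbitrary choices, and the helper assumes each value has probability 1/n.

```python
def mean_difference(ys):
    """Average absolute difference over all ordered pairs of values."""
    n = len(ys)
    return sum(abs(a - b) for a in ys for b in ys) / (n * n)


data = [1.0, 2.0, 4.0, 8.0]
c = -3.0

base = mean_difference(data)
translated = mean_difference([y + c for y in data])   # MD(X + c)
negated = mean_difference([-y for y in data])         # MD(-X)
scaled = mean_difference([c * y for y in data])       # MD(c X)

print(base, translated, negated)   # 2.875 2.875 2.875
print(scaled, abs(c) * base)       # 8.625 8.625
```

A negative c is used on purpose: the scaling rule involves |c|, so the sign of the constant does not matter.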
The relative mean difference is invariant to positive scaling, commutes with negation, and varies under translation in proportion to the ratio of the original and translated arithmetic means. That is to say, if X is a random variable and c is a constant:
- RMD(X+c) = RMD(X) * mean(X)/(mean(X)+c) = RMD(X) / (1+c / mean(X)) for c ≠ -mean(X),
- RMD(-X) = -RMD(X), and
- RMD(c X) = RMD(X) for c > 0.
If a random variable has a positive mean, then its relative mean difference will always be greater than or equal to zero. If, additionally, the random variable can only take on values that are greater than or equal to zero, then its relative mean difference will be less than 2.
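The behaviour of the relative mean difference under translation, negation and scaling, and its upper bound for non-negative values, can be illustrated the same way. This is a sketch with arbitrary example data; the helper names are hypothetical and each value is assumed to have probability 1/n.

```python
def mean(ys):
    return sum(ys) / len(ys)


def relative_mean_difference(ys):
    """Mean difference over all ordered pairs, divided by the mean."""
    n = len(ys)
    md = sum(abs(a - b) for a in ys for b in ys) / (n * n)
    return md / mean(ys)


data = [1.0, 2.0, 4.0, 8.0]
rmd = relative_mean_difference(data)
c = 2.0

translated = relative_mean_difference([y + c for y in data])
predicted = rmd * mean(data) / (mean(data) + c)        # translation rule
negated = relative_mean_difference([-y for y in data])
scaled = relative_mean_difference([5.0 * y for y in data])

# All of the total concentrated in a single non-negative value:
# the result approaches, but stays below, the bound of 2.
concentrated = relative_mean_difference([0.0, 0.0, 0.0, 100.0])

print(translated, predicted)   # both approximately 0.5
print(negated, -rmd)           # equal, and negative
print(scaled, rmd)             # equal
print(concentrated)            # 1.5
```

With n values and only one of them nonzero, the result is 2(n-1)/n, which shows how the bound of 2 is approached but never reached in a finite population.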
Compared to standard deviation
Both the standard deviation and the mean difference measure dispersion, that is, how spread out the values of a population or the probabilities of a distribution are. The mean difference is not defined in terms of a specific measure of central tendency, whereas the standard deviation is defined in terms of the deviation from the arithmetic mean. Because the standard deviation squares its differences, it tends to give more weight to larger differences and less weight to smaller differences compared to the mean difference. When the arithmetic mean is finite, the mean difference will also be finite, even when the standard deviation is infinite. See the examples for some specific comparisons.
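The different weighting of large differences shows up when a single extreme value is added to a small data set. The sketch below uses made-up data and population-style formulas (dividing by n and by n squared); both choices are assumptions for illustration.

```python
import math


def mean_difference(ys):
    """Average absolute difference over all ordered pairs of values."""
    n = len(ys)
    return sum(abs(a - b) for a in ys for b in ys) / (n * n)


def standard_deviation(ys):
    """Population standard deviation (divides by n)."""
    n = len(ys)
    mu = sum(ys) / n
    return math.sqrt(sum((y - mu) ** 2 for y in ys) / n)


base = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 50]

md_ratio = mean_difference(with_outlier) / mean_difference(base)
sd_ratio = standard_deviation(with_outlier) / standard_deviation(base)

print(f"MD grows by a factor of {md_ratio:.2f}")   # 10.00
print(f"SD grows by a factor of {sd_ratio:.2f}")   # 13.45
```

Replacing 5 with 50 inflates both measures, but the standard deviation reacts more strongly because the outlier's deviation is squared.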
Sample estimators
For a random sample S from a random variable X, consisting of n values yi, the statistic:
- $\mathrm{MD}(S) = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} |y_i - y_j|}{n(n-1)}$
is a consistent and unbiased estimator of MD(X).
The statistic:
- $\mathrm{RMD}(S) = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} |y_i - y_j|}{(n-1) \sum_{i=1}^{n} y_i}$
is a consistent estimator of RMD(X), but is not, in general, unbiased.
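Unbiasedness can be checked exactly for a small case by enumerating every possible sample. The sketch below assumes the usual pairwise forms of the two statistics (sum of absolute differences divided by n(n-1), and that sum divided by (n-1) times the sample total) and a hypothetical two-point distribution that takes the value 2 with probability p and the value 1 otherwise. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product


def md_estimate(sample):
    """Sum of |y_i - y_j| over ordered pairs, divided by n(n - 1)."""
    n = len(sample)
    total = sum(abs(a - b) for a in sample for b in sample)
    return Fraction(total, n * (n - 1))


def rmd_estimate(sample):
    """Same pairwise sum, divided by (n - 1) times the sample total."""
    n = len(sample)
    total = sum(abs(a - b) for a in sample for b in sample)
    return Fraction(total, (n - 1) * sum(sample))


def expected_value(estimator, p, n):
    """Exact expectation over all 2**n samples of size n."""
    result = Fraction(0)
    for sample in product((1, 2), repeat=n):
        k = sample.count(2)
        prob = p ** k * (1 - p) ** (n - k)
        result += prob * estimator(sample)
    return result


p = Fraction(1, 3)
true_md = 2 * p * (1 - p)            # 4/9
true_rmd = true_md / (1 + p)         # 1/3

print(expected_value(md_estimate, p, 3), true_md)     # 4/9 4/9
print(expected_value(rmd_estimate, p, 3), true_rmd)   # 14/45 1/3
```

For n = 3 the first statistic matches the true mean difference exactly, while the second falls slightly short of the true relative mean difference, in line with the statements above.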
Confidence intervals for RMD(X) can be calculated using bootstrap sampling techniques. There does not exist, in general, an unbiased estimator for RMD(X), in part because of the difficulty of finding an unbiased estimator for multiplying by the inverse of the mean. For example, even where the sample is known to be taken from a random variable X(p) for an unknown p, and X(p) - 1 has the Bernoulli distribution, so that Pr(X(p) = 1) = 1 - p and Pr(X(p) = 2) = p, then:
- RMD(X(p)) = 2p(1-p)/(1+p)
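But the expected value of any estimator R(S) of RMD(X(p)) will be of the form:
- $E(R(S)) = \sum_{i=0}^{n} p^i (1-p)^{n-i} r_i$
where the r_i are constants. This expression is a polynomial in p, whereas RMD(X(p)) is a rational function of p that is not a polynomial. So E(R(S)) can never equal RMD(X(p)) for all p between 0 and 1.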
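Since no unbiased estimator is available, interval estimates for RMD(X) are typically obtained by resampling, as mentioned above. The following is a minimal sketch of a percentile bootstrap interval, one of several bootstrap variants. The function names, the number of resamples, the seed and the sample data are all assumptions made for illustration, and the sample is assumed to contain only positive values so that every resample has a nonzero total.

```python
import random


def rmd_estimate(sample):
    """Pairwise absolute differences divided by (n - 1) * sample total."""
    n = len(sample)
    total = sum(abs(a - b) for a in sample for b in sample)
    return total / ((n - 1) * sum(sample))


def bootstrap_ci(sample, stat, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for `stat` (resampling with replacement)."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    n = len(sample)
    stats = sorted(
        stat([rng.choice(sample) for _ in range(n)]) for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lower = stats[int(alpha * n_boot)]
    upper = stats[min(n_boot - 1, int((1.0 - alpha) * n_boot))]
    return lower, upper


# Hypothetical sample of positive values.
sample = [12, 15, 20, 22, 30, 45, 50, 80]
lower, upper = bootstrap_ci(sample, rmd_estimate)
print(f"point estimate: {rmd_estimate(sample):.3f}")
print(f"95% percentile interval: ({lower:.3f}, {upper:.3f})")
```

With only eight observations the interval will be wide; the percentile method is used here for brevity, and bias-corrected variants may be preferable in practice.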
Examples
[Table of examples not recovered. Its entries covered the continuous uniform, normal, exponential, Pareto, gamma and Bernoulli distributions.]
- † I_z(x, y) is the regularized incomplete Beta function
References
- Xu, Kuan (January 2004). "How Has the Literature on Gini's Index Evolved in the Past 80 Years?". Department of Economics, Dalhousie University. Retrieved on June 1, 2006.
- Gini, Corrado (1912). Variabilità e Mutabilità. Bologna: Tipografia di Paolo Cuppini.
- Gini, Corrado (1921). "Measurement of Inequality and Incomes". The Economic Journal 31: 124-126.
- Chakravarty, S. R. (1990). Ethical Social Index Numbers. New York: Springer-Verlag.
- Mills, Jeffrey A.; Zandvakili, Sourushe (1997). "Statistical Inference via Bootstrapping for Measures of Inequality". Journal of Applied Econometrics 12: 133-150.
- Lomnicki, Z. A. (1952). "The Standard Error of Gini's Mean Difference". Annals of Mathematical Statistics 23: 635-637.
- Nair, U. S. (1936). "Standard Error of Gini's Mean Difference". Biometrika 28: 428-436.
See also |