FACTOID # 3: Andorrans live the longest, four years longer than in neighbouring France and Spain.
 
 Home   Encyclopedia   Statistics   Countries A-Z   Flags   Maps   Education   Forum   FAQ   About 
 
WHAT'S NEW
RECENT ARTICLES
More Recent Articles »
 

FACTS & STATISTICS    Simple view

  1. Select countries to view: (hold down Control key and click to select several)

     

     

    Compare:

     

     

  1. Select fact or statistic: (* = graphable)

     

     

     

  2. (OPTIONAL) Compare to statistic: (both need to be graphable)

     

     

     

  3. View result as:

     

       
(OR) SEARCH ALL encyclopedia, stats & forums:   

Encyclopedia > Rate distortion theory

Rate distortion theory is a major branch of information theory; it addresses the problem of determining the minimal amount of entropy (or information) R that should be communicated over a channel, so that the source (input signal) can be reconstructed at the receiver (output signal) with given distortion D. To meet Wikipedias quality standards, this article or section may require cleanup. ... Ice melting - a classic example of entropy increasing In thermodynamics, thermodynamic entropy (or simply entropy) is an important state function of a thermodynamic system: that is, a property depending only on the current state of the system, independent of how that state came to be achieved. ... Information as a concept bears a diversity of meanings, from everyday usage to technical settings. ...


Rate distortion theory gives theoretical bounds for how much compression can be achieved using lossy data compression methods. Many of the existing audio, speech, image, and video compression techniques have transforms, quantization, and bit-rate allocation procedures that capitalize on the general shape of rate-distortion functions. A lossy data compression method is one where compressing data and then decompressing it retrieves data that may well be different from the original, but is close enough to be useful in some way. ...


Rate distortion theory was created by Claude Shannon in his foundational work on information theory. Claude Elwood Shannon (April 30, 1916 _ February 24, 2001) has been called the father of information theory, and was the founder of practical digital circuit design theory. ...

Contents


Introduction

In rate distortion theory, the rate is usually understood as the number of bits per data sample to be stored or transmitted. The notion of distortion is a subject of on-going discussion. In the most simple case (which is actually used in most cases), the distortion is defined as the variance of the difference between input and output signal (i.e., the mean-squared error of the difference). However, since we know that most lossy compression techniques operate on data that will be perceived by humans (listen to music, watch pictures and video) the distortion measure preferably should include some aspects of human perception. In audio compression perceptual models, and therefore perceptual distortion measures, are well developed and routinely used in compression techniques such as MP3 or Vorbis, but often not easy to include in rate-distortion theory. In image and video compression, the human perception models are less well developed and inclusion is mostly limited to the JPEG and MPEG weighing (quantization, normalization) matrix. This article is about the unit of information. ... The term input has a variety of uses in different fields. ... // Information processing In information processing, output is the process of transmitting information by an object (verb usage). ... A lossy data compression method is one where compressing a file and then decompressing it retrieves a file that may well be different to the original, but is close enough to be useful in some way. ... PSYCHOLOGY In psychology and the cognitive sciences, perception is the process of acquiring, interpreting, selecting, and organizing sensory information. ... MPEG-1 Audio Layer 3, more commonly referred to as MP3, is a popular digital audio encoding and lossy compression format invented and standardized in 1991 by a team of engineers directed by the Fraunhofer Society in Erlangen, Germany. ... Vorbis is an open and free lossy audio compression (codec) project headed by the Xiph. ... In computing, JPEG (pronounced jay-peg) is a commonly used standard method of lossy compression for photographic images. ... The Moving Picture Experts Group or MPEG is a working group of ISO/IEC charged with the development of video and audio encoding standards. ... Generally, quantization is the state of being constrained to a set of discrete values, rather than varying continuously. ... Broadly, normalization (also spelled normalisation) is any process that makes something more normal, which typically means conforming to some regularity or rule, or returning from some state of abnormality. ...


Rate-distortion functions

The functions that relate the rate and distortion are found as the solution of the following minimization problem:

inf_{Q_{Y|X}(y|x)} I_Q(Y;X) mbox{subject to} D_Q le D^*.

Here QY | X(y | x), sometimes called a test channel, is the conditional probability density function (PDF) of the communication channel output (compressed signal) Y for a given input (original signal) X, and IQ(Y ; X) is the mutual information between Y and X defined as This article defines some terms which characterize probability distributions of two or more variables. ... In mathematics, a probability density function (pdf) serves to represent a probability distribution in terms of integrals. ... In probability theory and, in particular, information theory, the mutual information, or transinformation, of two random variables is a quantity that measures the mutual dependence of the two variables. ...

I(Y;X) = H(Y) - H(Y | X)

where H(Y) and H(Y | X) are the entropy of the output signal Y and the conditional entropy of the output signal given the input signal, respectively: The conditional entropy is an entropy measure used in information theory. ...

H(Y) = int_{-infty}^{infty} P_Y (y) log_{2} (P_Y (y)),dy
H(Y|X) = int_{-infty}^{infty} int_{-infty}^{infty} Q_{Y|X}(y|x) P_X (x) log_{2} (Q_{Y|X} (y|x)), dx, dy.

The problem can also be formulated as Distortion-Rate function, where we find the supremum over achievable distortions for given rate constraint. The relevant expression is:

inf_{Q_{Y|X}(y|x)} E[D_Q[X,Y]] mbox{subject to} I_Q(Y;X)leq R

The two formulations lead to functions which are inverses of each other.


The mutual information can be understood as a measure for prior uncertainty the receiver has about the sender's signal (H(Y)), diminished by the uncertainty that is left after receiving information about the sender's signal (H(Y | X)). Of course the decrease in uncertainty is due to the communicated amount of information, which is I(Y; X).


As an example, in case there is no communication at all, then H(Y |X) = H(Y) and I(Y; X) = 0. Alternatively, if the communication channel is perfect and the received signal Y is identical to the signal X at the sender, then H(Y | X) = 0 and I(Y; X) = H(Y) = H(X).


In the definition of the rate-distortion function, DQ and D* are the distortion between X and Y for a given QY | X(y | x) and the prescribed maximum distortion, respectively. When we use the mean squared error as distortion measure, we have (for amplitude-continuous signals): In statistics the mean squared error of an estimator T of an unobservable parameter θ is i. ...

D_Q = int_{-infty}^{infty} int_{-infty}^{infty} P_{X,Y}(x,y) (x-y)^2, dx, dy = int_{-infty}^{infty} int_{-infty}^{infty} Q_{Y|X}(y|x)P_{X}(x) (x-y)^2, dx, dy

As the above equations show, calculating a rate distortion function requires the stochastic description of the input X in terms of the PDF PX(x), and then aims at finding the conditional PDF QY | X(y | x) that minimize rate for a given distortion D*. These definitions can be formulated measure-theoretically to account for discrete and mixed random variables as well.


An analytical solution to this minimization problem is often difficult to obtain except in some instances for which we next offer two of the best known examples. The rate-distortion function of any source is known to obey several fundamental properties, the most important ones being that it is a continuous, monotonically decreasing convex (U) function and thus the shape for the function in the examples is typical (even measured rate-distortion functions in real life tend to have very similar forms). In mathematics, a continuous function is a function in which arbitrarily small changes in the input produce arbitrarily small changes in the output. ... In mathematics, functions between ordered sets are monotonic (or monotone) if they preserve the given order. ... In mathematics, an object is convex if for any pair of points within the object, any point on the straight line segment that joins them is also within the object. ... Look up Function in Wiktionary, the free dictionary The word function may mean: In common parlance, a role of a component in an assembly, or of an element in a systemic aggregate (such as a person within a group). ...


Although analytical solutions to this problem are scarce, the are upper and lower bounds to these functions including the famous Shannon Lower Bound (SLB), which in the case of squared error and memoryless sources, states that for arbitrary sources with finite differential entropy, Analytic philosophy is the dominant philosophical movement of English-speaking countries. ...

R(D) ge h(X) - h(D)

where h(D) is the entropy of a Gaussian random variable with variance D. This lower bound is extensible to sources with memory and other distortion measures. One important feature of the SLB is that it is asymptotically tight in the high distortion regime for a wide class of sources and in some occasions, it actually coincides with the rate-distortion function. Shannon Lower Bounds can generally be found if the distortion between any two numbers can be expressed as a function of the difference between the value of these two numbers.


The Blahut-Arimoto algorithm is an elegant iterative technique for numerically obtaining rate-distortion functions of arbitrary finite input/output alphabet sources and much work has been done to extend it to more general problem instances. Richard Blahut is the chair of the Electrical and Computer Engineering Department at the University of Illinois at Urbana-Champaign best known for his work in information theory (e. ...


Memoryless (independent) Gaussian source

If we assume that PX(x) is Gaussian with variance σ2, and if we assume that successive samples of the signal X are stochastically independent (or, if your like, the source is memoryless, or the signal is uncorrelated), we find the following analytical expression for the rate-distortion function: The normal distribution, also called Gaussian distribution, is an extremely important probability distribution in many fields. ... In probability theory and statistics, the variance of a random variable is a measure of its statistical dispersion, indicating how far from the expected value its values typically are. ... In probability theory, to say that two events are independent intuitively means that knowing whether or not one of them occurs makes it neither more probable nor less probable that the other occurs. ... In probability theory, memorylessness is a property of certain probability distributions: the exponential distributions and the geometric distributions. ...

R(D) = left{ begin{matrix} frac{1}{2}log_2(sigma_x^2/D ), & mbox{if } D le sigma_x^2   0, & mbox{if } D > sigma_x^2 end{matrix} right.

The following figure shows what this function looks like:


Image:Rate distortion function.png Image File history File links Rate_distortion_function. ...


Rate distortion theory tell us that no compression system exists that performs outside the gray area. The closer a practical compression system is to the red (lower) bound, the better it performs. As a general rule, this bound can only be attained by increasing the coding block length parameter. Nevertheless, even at unit blocklengths one can often find good (scalar) quantizers that operate at distances from the rate-distortion function that are practically relevant.


This rate-distortion function holds only for Gaussian memoryless sources. It is known that the Gaussian source is the most "difficult" source to encode: for a given mean square error, it requires the greatest number of bits. The performance of a practical compression system working on—say—images, may well be below the R(D) lower bound shown.


Gaussian source with memory

Performance of practical compression systems

See also

In computer science, data compression or source coding is the process of encoding information using fewer bits, or information units, thanks to specific encoding schemes. ... This article is about white noise as a scientific concept, see also: White Noise (novel), a 1985 novel by Don DeLillo. ...

External links

  • VcDemo Image and Video Compression Learning Tool

  Results from FactBites:
 
Shannon's Communication Theory (391 words)
Shannon's communication channel consisted of a sender (a source of information), a transmission medium (with noise and distortion), and a receiver (whose goal is to reconstruct the sender's messages).
He showed that if the entropy rate, the amount of information you wish to transmit, excceds the channel capacity, then there were unavoidable and uncorrectable errors in the transmission.
In the study of complex systems today, we view deterministic chaotic processes as information sources and use Shannon's entropy rate, as adapted by Kolmogorov and his student Y. Sinai in the late 1950s, to measure how random a chaotic system is.
Rate distortion theory - Wikipedia, the free encyclopedia (1090 words)
Rate distortion theory is a major branch of information theory; it addresses the problem of determining the minimal amount of entropy (or information) R that should be communicated over a channel, so that the source (input signal) can be reconstructed at the receiver (output signal) with given distortion D.
In rate distortion theory, the rate is usually understood as the number of bits per data sample to be stored or transmitted.
In the most simple case (which is actually used in most cases), the distortion is defined as the variance of the difference between input and output signal (i.e., the mean-squared error of the difference).
  More results at FactBites »


 

COMMENTARY     


Share your thoughts, questions and commentary here
Your name
Your comments
Please enter the 5-letter protection code

Want to know more?
Search encyclopedia, statistics and forums:

 


Lesson Plans | Student Area | Student FAQ | Reviews | Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.