Redundancy (information theory)

Redundancy in information theory is the number of bits used to transmit a message minus the number of bits of actual information in the message. Data compression is a way to eliminate unwanted redundancy, while checksums are a way of adding desired redundancy for purposes of error detection when communicating over a noisy channel of limited capacity.
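
As a small illustration of deliberately added redundancy, the sketch below appends a single even-parity bit to a 7-bit message. It assumes the 7 data bits are uniformly random, so each carries one full bit of information; the helper names add_parity and parity_ok are made up for this example and are not from any library.

```python
def add_parity(bits):
    """Append one even-parity bit so the total number of 1s is even."""
    return bits + [sum(bits) % 2]


def parity_ok(codeword):
    """Return True if the codeword still has an even number of 1s."""
    return sum(codeword) % 2 == 0


data = [1, 0, 1, 1, 0, 0, 1]       # 7 information bits (assumed uniformly random)
codeword = add_parity(data)        # 8 bits actually transmitted

# Redundancy = bits transmitted minus bits of actual information.
redundancy_bits = len(codeword) - len(data)

# Flip one bit to simulate channel noise; the parity check now fails.
corrupted = codeword.copy()
corrupted[2] ^= 1

print(redundancy_bits)        # 1
print(parity_ok(codeword))    # True
print(parity_ok(corrupted))   # False
```

One redundant bit is enough to detect any single-bit error here, but not to locate or repair it.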


Quantitative Definition

Recall that the rate of a source of information is (in the most general case)

r = \mathbb{E}\, H(M_t \mid M_{t-1}, M_{t-2}, M_{t-3}, \ldots),

the expected, or average, conditional entropy per message (i.e. per unit time) given all the previous messages generated. It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate, for example, when the source of information is English prose. The rate of a memoryless source is simply H(M), since by definition there is no interdependence of the successive messages of a memoryless source.
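
The two cases above can be sketched numerically. The example assumes base-2 logarithms (bits) and, for the source with memory, a stationary first-order Markov chain, for which conditioning on the whole past reduces to conditioning on the previous message. The probabilities, the transition matrix P, and the function names are illustrative assumptions only.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def markov_rate(P, pi):
    """Rate of a stationary first-order Markov source.

    P[i] is the next-message distribution given current state i, and
    pi is the stationary distribution; the rate is the pi-weighted
    average of the per-state conditional entropies.
    """
    return sum(pi_i * entropy(row) for pi_i, row in zip(pi, P))


# Memoryless source: the rate is simply H(M).
r_memoryless = entropy([0.5, 0.25, 0.25])
print(r_memoryless)  # 1.5

# Two-state source with memory (hypothetical numbers).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = [5 / 6, 1 / 6]          # satisfies pi = pi P for this P
r_markov = markov_rate(P, pi)

# Dependence between successive messages lowers the rate below the
# entropy of the marginal distribution.
print(r_markov < entropy(pi))  # True
```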


The absolute rate of a language or source is simply

R = \log |M|,

the logarithm of the cardinality of the message space, or alphabet. (This formula is sometimes called the Hartley function.) This is the maximum possible rate of information that can be transmitted with that alphabet. (The logarithm should be taken to a base appropriate for the unit of measurement in use.) The absolute rate is equal to the rate if and only if the source is memoryless and has a uniform distribution.
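
A minimal sketch of the absolute rate follows, assuming a 26-letter alphabet as the example message space. The function name absolute_rate and its base parameter are choices made for this illustration. It also checks that a memoryless source with a uniform distribution attains the absolute rate.

```python
import math


def absolute_rate(alphabet_size, base=2):
    """Hartley function: log of the number of possible messages.

    base=2 gives bits, base=math.e gives nats, base=10 gives hartleys.
    """
    return math.log(alphabet_size, base)


def entropy(probs, base=2):
    """Shannon entropy of a discrete distribution in the given base."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)


R_bits = absolute_rate(26)             # roughly 4.70 bits per letter
R_nats = absolute_rate(26, math.e)     # the same quantity in nats

# For a uniform memoryless source the rate equals the absolute rate.
uniform = [1 / 26] * 26
print(abs(entropy(uniform) - R_bits) < 1e-9)  # True
```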


The absolute redundancy can then be defined as

D = R - r,

the difference between the absolute rate and the rate.
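
As a worked example, assume a memoryless source over four messages with probabilities 1/2, 1/4, 1/8, 1/8 and logarithms to base 2. These numbers are chosen for illustration and do not come from the references.

```latex
\begin{align*}
r &= H(M) = \tfrac12 \log_2 2 + \tfrac14 \log_2 4 + \tfrac18 \log_2 8 + \tfrac18 \log_2 8
   = 0.5 + 0.5 + 0.375 + 0.375 = 1.75 \text{ bits}, \\
R &= \log_2 |M| = \log_2 4 = 2 \text{ bits}, \\
D &= R - r = 2 - 1.75 = 0.25 \text{ bits per message}.
\end{align*}
```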


The quantity \frac{D}{R} is called the relative redundancy and gives the maximum possible data compression ratio, when expressed as the percentage by which a file size can be decreased. (When expressed as a ratio of original file size to compressed file size, the quantity R:r gives the maximum compression ratio that can be achieved.) Complementary to the concept of relative redundancy is efficiency, defined as \frac{r}{R}, so that efficiency and relative redundancy sum to 1. A memoryless source with a uniform distribution has zero redundancy (and thus 100% efficiency), and cannot be compressed.
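
The quantities above can be computed together for a memoryless source, as in the following sketch. It assumes base-2 logarithms, an alphabet of at least two messages, and a nonzero rate (otherwise R or r would be zero and the ratios undefined). The function redundancy_stats and its dictionary keys are names invented for this example.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def redundancy_stats(probs):
    """Rate, absolute rate, and redundancy measures of a memoryless source."""
    r = entropy(probs)              # rate (memoryless case: H(M))
    R = math.log2(len(probs))       # absolute rate
    D = R - r                       # absolute redundancy
    return {
        "rate": r,
        "absolute_rate": R,
        "absolute_redundancy": D,
        "relative_redundancy": D / R,      # max fractional size reduction
        "efficiency": r / R,
        "max_compression_ratio": R / r,    # original size : compressed size
    }


skewed = redundancy_stats([0.5, 0.25, 0.125, 0.125])
print(skewed["relative_redundancy"])   # 0.125
print(skewed["efficiency"])            # 0.875

# A uniform memoryless source has zero redundancy and cannot be compressed.
uniform = redundancy_stats([0.25] * 4)
print(uniform["relative_redundancy"])  # 0.0
```

In the skewed case, an ideal code could shrink the data by at most 12.5%, which corresponds to a compression ratio of 2:1.75, or about 1.14:1.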


See also

  • Data compression (source coding)
  • Source coding theorem (Shannon 1948)

References

  • Fazlollah M. Reza. An Introduction to Information Theory. New York: McGraw-Hill 1961. New York: Dover 1994. ISBN 0486682102
  • B. Schneier, Applied Cryptography: Protocols, Algorithms, and Source Code in C. New York: John Wiley & Sons, Inc. 1996. ISBN 0471117090


 

The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.