Encyclopedia > Compressed

In computer science, data compression or source coding is the process of encoding information using fewer bits (or other information units) than an unencoded representation would use, by means of a specific encoding scheme. For example, this article could be encoded with fewer bits if we adopted the convention that the word "compression" is encoded as "CP!". Compression only works when both the sender and the receiver of the message have agreed on the encoding scheme.
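The "CP!" convention above can be sketched in a few lines. This is only an illustration: the `CODEBOOK` dictionary and the `encode`/`decode` helpers are hypothetical names, and the scheme assumes the code "CP!" never occurs in the original text (otherwise decoding would be ambiguous).

```python
# Hypothetical codebook that sender and receiver have agreed on in advance.
CODEBOOK = {"compression": "CP!"}

def encode(text, codebook=CODEBOOK):
    """Replace every codebook word with its shorter code."""
    for word, code in codebook.items():
        text = text.replace(word, code)
    return text

def decode(text, codebook=CODEBOOK):
    """Reverse the substitution; requires the same codebook as the sender."""
    for word, code in codebook.items():
        text = text.replace(code, word)
    return text

message = "data compression makes compression useful"
encoded = encode(message)
print(len(message), "->", len(encoded))
```

A receiver without the same codebook would see "CP!" as meaningless, which is why the agreement between both sides matters.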


A popular encoding scheme, supported on almost all modern operating systems, is the ZIP file format. It can be used to reduce the size of an e-mail attachment, making it faster to transmit. However, both the sender and the receiver must know this format and must use an appropriate encoding and decoding program.
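As a rough sketch of that round trip, Python's standard `zipfile` module can write and read a ZIP archive entirely in memory. The member name `report.txt` and the payload are made up for the example.

```python
import io
import zipfile

# Made-up attachment content with plenty of repetition.
payload = b"quarterly figures unchanged; " * 200

# "Sender": write the payload into an in-memory ZIP archive using DEFLATE.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("report.txt", payload)
archive_bytes = buffer.getvalue()

# "Receiver": open the same bytes and extract the original content.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    restored = archive.read("report.txt")

print(len(payload), "->", len(archive_bytes))
```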


Compression is possible because most real-world data is highly redundant, or is represented in its human-interpretable form in a non-concise way. Compression is important because it reduces the consumption of expensive resources, such as disk space or connection bandwidth. However, compressing and decompressing data requires processing power, which can also be expensive, so many data compression schemes have been designed, each striking a different balance for a different purpose. Some schemes are reversible, so that the original data can be reconstructed exactly (lossless compression), while others accept some loss of data in order to achieve higher compression (lossy compression).
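The role of redundancy can be seen by compressing two inputs of equal length, one repetitive and one random. The sketch below uses the standard `zlib` module; the specific inputs are arbitrary choices for illustration.

```python
import os
import zlib

redundant = b"abc" * 10_000        # highly repetitive input, 30,000 bytes
random_bytes = os.urandom(30_000)  # random input with essentially no redundancy

packed_redundant = zlib.compress(redundant)
packed_random = zlib.compress(random_bytes)

# Lossless: decompression restores the input exactly.
roundtrip_ok = zlib.decompress(packed_redundant) == redundant

print("redundant:", len(redundant), "->", len(packed_redundant))
print("random:   ", len(random_bytes), "->", len(packed_random))
```

The repetitive input shrinks dramatically, while the random input stays at roughly its original size because there is no redundancy to remove.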


Applications

One very simple means of compression is run-length encoding, in which long runs of consecutive identical data values are replaced by a short code giving the data value and the length of the run. This is an example of lossless data compression. Lossless compression is often used to make better use of disk space on office computers, or of the connection bandwidth in a computer network. For symbolic data such as spreadsheets, text and executable programs, losslessness is essential because changing even a single bit cannot be tolerated (except in some limited cases).
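A minimal sketch of run-length encoding on a string follows. The function names and the choice of (value, count) pairs as the output representation are assumptions made for the example; real formats pack the pairs into bytes.

```python
from itertools import groupby

def rle_encode(data):
    """Collapse each run of identical characters into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(data)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original string."""
    return "".join(value * count for value, count in pairs)

pairs = rle_encode("WWWWBBBWW")
print(pairs)
print(rle_decode(pairs))
```

The scheme pays off only when runs are long; on data with few repeats, the pairs can take more space than the input.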


For other kinds of data, such as sounds and pictures, a small loss of quality can be tolerated without losing the essential nature of the data, so lossy data compression methods can be used. These frequently offer a range of compression levels, so that the user can choose between highly compressed data with a noticeable loss of quality and higher-quality data with less compression. In particular, compression of images and sounds can take advantage of the limitations of the human sensory system to compress data in ways that are lossy, but that leave the result nearly indistinguishable from the original to the human eye or ear. Lossy compression is used for media distributed on CD-ROM and DVD, as well as in digital cameras.
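The quality-versus-size trade-off can be illustrated with uniform quantization, one simple lossy step (real codecs combine it with transforms and perceptual models). The sample values, step sizes and helper names below are invented for the sketch.

```python
def quantize(samples, step):
    """Map each sample to the index of the nearest multiple of `step` (lossy)."""
    return [round(s / step) for s in samples]

def dequantize(levels, step):
    """Reconstruct approximate samples from the quantization indices."""
    return [q * step for q in levels]

def max_error(samples, step):
    """Largest absolute difference between original and reconstructed samples."""
    restored = dequantize(quantize(samples, step), step)
    return max(abs(a - b) for a, b in zip(samples, restored))

samples = [12, 13, 15, 200, 198, 201, 57, 60]
for step in (4, 32):  # a larger step means fewer distinct levels and more loss
    levels = quantize(samples, step)
    print(step, len(set(levels)), max_error(samples, step))
```

A coarser step leaves fewer distinct values for a later lossless stage to encode, at the price of a larger reconstruction error (bounded by half the step).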


Compression of sounds is generally called audio compression. Many audio compression methods use psychoacoustics to remove inaudible components of the signal and so make compression more efficient; audio compression of this kind is therefore lossy. The different audio compression standards are listed under audio codecs. Audio compression is used in internet telephony, for example.


Theory

The theoretical background of compression is provided by information theory and algorithmic information theory. Cryptography and coding theory are also closely related. The idea of data compression is deeply connected with statistical inference and particularly with the maximum likelihood principle.
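One concrete quantity from information theory is the Shannon entropy of a symbol distribution, which bounds the average number of bits per symbol that a lossless code can achieve when symbols are treated as independent draws. The sketch below estimates it from symbol frequencies; the function name is an assumption, and the independence model is a simplification that ignores context between symbols.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(data):
    """Empirical Shannon entropy H = -sum(p * log2(p)) over symbol frequencies."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

for text in ("aaaa", "abab", "abcd"):
    print(text, entropy_bits_per_symbol(text))
```

A string made of a single repeated symbol has zero entropy under this model, while four equally likely symbols need two bits each.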


Many data compression systems are best viewed with a four-stage model.


The Lempel-Ziv (LZ) compression methods are the most popular algorithms for lossless storage. DEFLATE is a variation on LZ that is optimized for decompression speed and compression ratio, at the cost of compression that can be slow. DEFLATE is used in PKZIP, gzip and PNG. LZW (Lempel-Ziv-Welch), which is used in GIF images, was patented by Unisys until June 2003; this patent is the main reason for GIF's increasing obsolescence. Also noteworthy are the LZR (LZ-Renau) methods, which serve as the basis of the Zip method.


LZ methods use a table-based compression model in which table entries are substituted for repeated data. For most LZ methods, this table is generated dynamically from earlier data in the input. The table itself is often Huffman encoded (e.g. SHRI, LZX). The best-performing LZ-based coder is currently LZX, even though it is obsolete, and RAR and ACE are now coming close. LZX was purchased by Microsoft, slightly reduced in potency, and used in the CAB format.
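The idea of a table built dynamically from earlier input can be sketched with a textbook version of LZW. This is an illustration only: it emits a plain list of integer codes, lets the table grow without limit, and omits the variable-width bit packing that formats such as GIF use.

```python
def lzw_encode(data):
    """Encode bytes as a list of table indices, growing the table as input is read."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate                 # keep extending the match
        else:
            codes.append(table[current])        # emit code for the longest match
            table[candidate] = len(table)       # learn the new string
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decode(codes):
    """Rebuild the same table while decoding, so it never has to be transmitted."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):                # code refers to the entry being built
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)

data = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_encode(data)
print(len(data), "bytes ->", len(codes), "codes")
```

Because the decoder reconstructs the table from the codes it has already seen, sender and receiver only need to agree on the algorithm and the initial 256-entry table.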


See also

Data compression topics

Compression algorithms

Lossless data compression

Lossy data compression

  • discrete cosine transform (DCT)
    • JPEG (image compression using a block-based discrete cosine transform, then quantization, then Huffman coding)
    • MPEG (a pioneering family of video compression standards that is still in use today; uses the DCT and inter-frame delta coding)
      • MP3 (a part of the MPEG specification for sound and music compression, using subband filtering and the MDCT, followed by Huffman coding)
      • AAC (part of the MPEG specification, using the MDCT and Huffman coding)
    • Ogg Vorbis (an audio codec similar to AAC, designed with a focus on avoiding patent encumbrance)
  • fractal compression
  • wavelet compression
    • JPEG 2000 (image compression using wavelets, then quantization, then arithmetic coding)

References

  • Timothy C. Bell, Ian Witten, John Cleary (1990) Text Compression, Prentice Hall, ISBN 0139119914

External links

  • Data Compression Benchmarks and Tests (http://www.maximumcompression.com/)
  • Compression definition @ eLook Computing (http://www.elook.org/computing/compression.htm)
  • Data Compression - Systematisation by T.Strutz (http://www-nt.e-technik.uni-rostock.de/~ts/Datacompression/compression.html)
  • Public domain article on data compression (http://www.vectorsite.net/ttdcmp1.html)

The Wikipedia article included on this page is licensed under the GFDL.