FACTOID # 125: India’s criminal courts acquitted over a million defendants in 1999, more than the next 48 surveyed countries combined.
 
 Home   Encyclopedia   Statistics   Countries A-Z   Flags   Maps   Education   Forum   FAQ   About 
 
WHAT'S NEW
RECENT ARTICLES
More Recent Articles »
 

SEARCH ALL

FACTS & STATISTICS    Advanced view

Search encyclopedia, statistics and forums:

 

 

(* = Graphable)

 

 


Encyclopedia > Golomb coding

Golomb coding is a form of entropy encoding invented by Solomon W. Golomb that is optimal for alphabets following geometric distributions, that is, when small values are vastly more common than large values. If negative values are present in the input stream then a overlap and interleave scheme is used, where all non-negative inputs are mapped to even numbers (x^prime=2|x|=2x, xge0), and all negative numbers are mapped to odd numbers (). An entropy encoding is a coding scheme that assigns codes to symbols so as to match code lengths with the probabilities of the symbols. ... Solomon W. Golomb Solomon Wolf Golomb (b. ... In probability theory and statistics, the geometric distribution is either of two discrete probability distributions: the probability distribution of the number X of Bernoulli trials needed to get one success, supported on the set { 1, 2, 3, ...}, or the probability distribution of the number Y = X âˆ’ 1 of failures before...


Rice coding is a special case of Golomb coding first described by, and independently invented by, Robert F. Rice. It is equivalent to Golomb coding where the tunable parameter is a power of two. This makes it extremely efficient for use on computers, since the division operation becomes a bitshift operation and the remainder operation becomes a bitmask operation. A BlueGene supercomputer cabinet. ... In computer programming, a bitwise operation operates on one or two bit patterns or binary numerals at the level of their individual bits. ... In computer science, a mask is some data that, along with an operation, are used in order to extract information stored elsewhere. ...


The FLAC audio codec uses Rice coding to represent linear prediction residues. FLAC, an acronym for Free Lossless Audio Codec, is a popular file format for audio data compression. ... Audio compression can mean two things: Audio data compression - in which the amount of data in a recorded waveform is reduced for transmission. ...

Contents

Overview

Golomb coding uses a tunable parameter M to divide an input value into two parts: q, the result of a division by M, and r, the remainder. The quotient is sent in unary coding, followed by the remainder in truncated binary encoding. When M = 1 Golomb coding is equivalent to unary coding. Unary coding is an entropy encoding that represents a number n with n-1 ones followed by a zero. ... Truncated binary encoding is an entropy encoding typically used for uniform probability distributions with a finite alphabet. ...


Image:golomb_rice_code.png Image File history File links Golomb_rice_code. ...


Golomb-Rice codes can be thought of as codes that indicate a number by the position of the bin (q), and the offset within the bin (r). The above figure shows the position q, and offset r for the encoding of integer N using Golomb-Rice parameter M.


Formally, the two parts are given by the following expression, where x is the number being encoded: q = left lfloor frac{left (x-1 right )}{M} right rfloor and r = xqM − 1 The final result looks like: left (q+1 right ) r


Note that r can be of a varying number of bits, and is specifically only b bits for Rice code, and switches between b-1 and b bits for Golomb code (i.e M is not a power of 2): Let b = lceillog_2(M)rceil. If 0 leq r < 2^b-M, then use b-1 bits to encode r. If 2^b-M leq r < M, then use b bits to encode r. Clearly, b = log2(M) if M is a power of 2 and we can encode all values of r with b bits.


The parameter M is a function of the corresponding Bernoulli process, which is parameterized by p = P(X = 0) the probability of success in a given Bernoulli Trial. M and p are related by these inequalities: In probability and statistics, a Bernoulli process is a discrete-time stochastic process consisting of a sequence of independent random variables taking values over two letters. ... In the theory of probability and statistics, a Bernoulli trial is an experiment whose outcome is random and can be either of two possible outcomes, called success and failure. ...



The Golomb code for this distribution is equivalent to the Huffman code for the same probabilities, if it were possible to compute the Huffman code. In computer science, Huffman coding is an entropy encoding algorithm used for data compression that finds the optimal system of encoding strings based on the relative frequency of each character. ...


Simple Algorithm

  1. Fix the parameter M to an integer value.
  2. For N, the number to be encoded, find
    1. quotient = q = int[N/M]
    2. remainder = r = N%M
  3. Generate Codeword
    1. The Code format : <Quotient Code><Remainder Code>, where
    2. Quotient Code
      1. Write quotient in unary coding
    3. Remainder Code
      1. If M is power of 2, code remainder as binary format. So log2(M) bits are needed. (Rice code)
      2. If M is not a power of 2, set b = lceillog_2(M)rceil
        1. Code the first 2bM values of r as plain binary using b-1 bits.
        2. Code the rest of the values of r by coding the number r + 2bM in plain binary representation using b bits.

Unary coding is an entropy encoding that represents a number n with n-1 ones followed by a zero. ...

Example

Set M = 10. Thus b = lceillog_2(10)rceil = 4. The cutoff is 2bM = 16 − 10 = 6

r 0 1 2 3 4 5 6 7 8 9
code 000 001 010 011 100 101 1100 1101 1110 1111
q 0 1 2 3 ...
code 0 10 110 1110 ...

42 would be encoded as qcode(q),rcode(r) = qcode(4),rcode(2) = 11110,010 (the comma isn't necessary, because the 0 at the end of the q code is enough to say when q ends and r begins). Look up forty-two in Wiktionary, the free dictionary. ...




== REQUESTING BETTER EXAMPLE FOR RICE CODE ==





Example code

Encode

 void golombEncode(char* source, char* dest, int M) { IntReader intreader(source); BitWriter bitwriter(dest); int num; int c = 0; while(intreader.hasLeft()) { num = intreader.getInt(); int q = num / M; for (int i = 0 ; i < q; i++) bitwriter.putBit(true); //write q ones bitwriter.putBit(false); //write one zero int v = 1; for (int i = 0 ; i < log2(M); i++) { bitwriter.putBit( v & num ); v = v <<1; } } bitwriter.close(); intreader.close(); } 

Decode

 void golombDecode(char* source, char* dest, int M) { BitReader bitreader(source); BitWriter bitwriter(dest); int q = 0; int nr = 0; while (bitreader.hasLeft()) { while(bitreader.getBit()) q++;//potentially dangerous with malformed files. for (int a = 0; a < log2(M); a++)//read out the sequential log2(M) bits if (bitreader.getBit()) nr += 1 << a; nr += q*M; //add the bits and the multiple of M //now write out the nr. for (int a = 0; a < 32; a++) //assuming that an int has 32 bits. { if (nr & 1 << a) { bitwriter.putBit(true); }else { bitwriter.putBit(false); } } } bitreader.close(); bitwriter.close(); } 

Applications

The FLAC audio codec uses Rice coding to represent linear prediction residues.
Apple's ALAC audio codec uses modfied Rice coding after its Adaptive FIR filter. FLAC, an acronym for Free Lossless Audio Codec, is a popular file format for audio data compression. ... Audio compression can mean two things: Audio data compression - in which the amount of data in a recorded waveform is reduced for transmission. ... Apple Inc. ... Apple Lossless (also known as Apple Lossless Encoder, ALE, or Apple Lossless Audio Codec, ALAC) is an audio codec developed by Apple Computer for lossless encoding of digital music. ... Audio compression can mean two things: Audio data compression - in which the amount of data in a recorded waveform is reduced for transmission. ... A finite impulse response (FIR) filter is a type of a digital filter. ...


The Golomb-Rice coder is used in the entropy coding stage of Rice Algorithm based lossless image codecs. One such experiment yields a compression ratio graph given below. See other entries in this category at the bottom of this page.


Image:Golomb_coded_Rice_Algorithm_experiment_Compression_Ratios.png Image File history File links Golomb_coded_Rice_Algorithm_experiment_Compression_Ratios. ...


References

  • Golomb, S.W. (1966). , Run-length encodings. IEEE Transactions on Information Theory, IT--12(3):399--401
  • R. F. Rice, "Some Practical Universal Noiseless Coding Techniques, " Jet Propulsion Laboratory, Pasadena, California, JPL Publication 79--22, Mar. 1979.
  • Witten, Ian Moffat, Alistair Bell, Timothy. "Managing Gigabytes: Compressing and Indexing Documents and Images." Second Edition. Morgan Kaufmann Publishers, San Francisco CA. 1999 ISBN 1-55860-570-3
  • David Salomon. "Data Compression",ISBN 0-387-95045-1.
  • http://ese.wustl.edu/class/fl06/ese578/GolombCodingNotes.pdf

  Results from FactBites:
 
Golomb-Rice Coding (450 words)
A Golomb code is variable-length code, a bit like Huffman; however, rather than being based on the data, like Huffman, it's based on a simple model of the probability of the values (which are explicitly dealt with as natural numbers, rather than being abstract symbols): small values are more likely than big ones.
A Golomb-Rice code is a Golomb code where the divisor is a power of two, enabling an efficient implementation using shifts and masks rather than division and modulo.
Kiely and Klimesh (in "Generalized Golomb codes and adaptive coding of wavelet-transformed image subbands") generalise the idea of a Golomb code, using the idea of an index function, which maps from integers to an index (which is also an integer).
Golomb coding (246 words)
Golomb coding is a form of entropy coding invented by Solomon W. Golomb that is optimal for alphabets following geometric distributions, that is, when small values are vastly more common than large values.
The quotient is sent in unary coding, followed by the remainder in truncated binary encoding.
It is equivalent to Golomb coding where the tunable parameter is a power of two.
  More results at FactBites »


 

COMMENTARY     


Share your thoughts, questions and commentary here
Your name
Your comments
Please enter the 5-letter protection code

Want to know more?
Search encyclopedia, statistics and forums:

 


Lesson Plans | Student Area | Student FAQ | Reviews | Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.