FACTOID # 149: Norwegians consume more than 15 times as much coffee per person as the Irish.
 
 Home   Encyclopedia   Statistics   Countries A-Z   Flags   Maps   Education   Forum   FAQ   About 
 
WHAT'S NEW
RECENT ARTICLES
More Recent Articles »
 

FACTS & STATISTICS    Simple view

  1. Select countries to view: (hold down Control key and click to select several)

     

     

    Compare:

     

     

  1. Select fact or statistic: (* = graphable)

     

     

     

  2. (OPTIONAL) Compare to statistic: (both need to be graphable)

     

     

     

  3. View result as:

     

       
(OR) SEARCH ALL encyclopedia, stats & forums:   

Encyclopedia > Discrete cosine transform
2d DCT (type II) compared to the DFT. For both transforms, there is the magnitude of the spectrum on left and the histogram on right; both spectra are cropped to 1/4, to zoom the behaviour in the lower frequencies. The DCT concentrates most of the power on the lower frequencies.
2d DCT (type II) compared to the DFT. For both transforms, there is the magnitude of the spectrum on left and the histogram on right; both spectra are cropped to 1/4, to zoom the behaviour in the lower frequencies. The DCT concentrates most of the power on the lower frequencies.

A discrete cosine transform (DCT) expresses a sequence of finitely many data points in terms of a sum of cosine functions oscillating at different frequencies. DCTs are important to numerous applications in science and engineering, from lossy compression of audio and images (where small high-frequency components can be discarded), to spectral methods for the numerical solution of partial differential equations. The use of cosine rather than sine functions is critical in these applications: for compression, it turns out that cosine functions are much more efficient (as explained below, fewer are needed to approximate a typical signal), whereas for differential equations the cosines express a particular choice of boundary conditions. In mathematics, the discrete Fourier transform (DFT), occasionally called the finite Fourier transform, is a transform for Fourier analysis of finite-domain discrete-time signals. ... In mathematics, the trigonometric functions are functions of an angle, important when studying triangles and modeling periodic phenomena. ... For other uses, see Frequency (disambiguation). ... Original Image (lossless PNG, 60. ... This article is about a process which reduces the data rate or file size of digital audio signals. ... Image compression is the application of Data compression on digital images. ... In applied mathematics, Spectral methods are algorithms to solve certain kinds of partial differential equations numerically using some sort of Fast Fourier Transform. ... In mathematics, and in particular analysis, a partial differential equation (PDE) is an equation involving partial derivatives of an unknown function. ... In mathematics, the trigonometric functions are functions of an angle, important when studying triangles and modeling periodic phenomena. ... In the fields of communications, signal processing, and in electrical engineering more generally, a signal is any time-varying quantity. ... In mathematics, boundary conditions are imposed on the solutions of ordinary differential equations and partial differential equations, to fit the solutions to the actual problem. ...


In particular, a DCT is a Fourier-related transform similar to the discrete Fourier transform (DFT), but using only real numbers. DCTs are equivalent to DFTs of roughly twice the length, operating on real data with even symmetry (since the Fourier transform of a real and even function is real and even), where in some variants the input and/or output data are shifted by half a sample. There are eight standard DCT variants, of which four are common. This is a list of linear transformations of functions related to the Fourier transform. ... In mathematics, the discrete Fourier transform (DFT), occasionally called the finite Fourier transform, is a transform for Fourier analysis of finite-domain discrete-time signals. ... In mathematics, the real numbers may be described informally as numbers that can be given by an infinite decimal representation, such as 2. ... In mathematics, even functions and odd functions are functions which satisfy particular symmetry relations, with respect to taking additive inverses. ...


The most common variant of discrete cosine transform is the type-II DCT, which is often called simply "the DCT"; its inverse, the type-III DCT, is correspondingly often called simply "the inverse DCT" or "the IDCT". Two related transforms are the discrete sine transform (DST), which is equivalent to a DFT of real and odd functions, and the modified discrete cosine transform (MDCT), which is based on a DCT of overlapping data. In mathematics, the discrete sine transform (DST) is a Fourier-related transform similar to the discrete Fourier transform (DFT), but with one additional property: If the input consists of only real numbers, so will the output. ... modified discrete cosine transform (MDCT) is a Fourier-related transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last half...

Contents

Applications

The DCT, and in particular the DCT-II, is often used in signal and image processing, especially for lossy data compression, because it has a strong "energy compaction" property (Rao and Yip, 1990): most of the signal information tends to be concentrated in a few low-frequency components of the DCT, approaching the Karhunen-Loève transform (which is optimal in the decorrelation sense) for signals based on certain limits of Markov processes. As explained below, this stems from the boundary conditions implicit in the cosine functions. In statistics, principal components analysis (PCA) is a technique that can be used to simplify a dataset; more formally it is a linear transformation that chooses a new coordinate system for the data set such that the greatest variance by any projection of the data set comes to lie on... It has been suggested that this article or section be merged with Markov property. ...

DCT-II (bottom) compared to the DFT (middle) of an input signal (top).
DCT-II (bottom) compared to the DFT (middle) of an input signal (top).

A related transform, the modified discrete cosine transform, or MDCT (based on the DCT-IV), is used in AAC, Vorbis, WMA, and MP3 audio compression. Image File history File links Example_dft_dct. ... Image File history File links Example_dft_dct. ... In mathematics, the discrete Fourier transform (DFT), occasionally called the finite Fourier transform, is a transform for Fourier analysis of finite-domain discrete-time signals. ... modified discrete cosine transform (MDCT) is a Fourier-related transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last half... Advanced Audio Coding (AAC) is a standardized, lossy compression and encoding scheme for digital audio. ... Vorbis is an open source, lossy audio codec project headed by the Xiph. ... Windows Media Audio (WMA) is an audio data compression technology developed by Microsoft. ... For other uses, see MP3 (disambiguation). ...


DCTs are also widely employed in solving partial differential equations by spectral methods, where the different variants of the DCT correspond to slightly different even/odd boundary conditions at the two ends of the array.


DCTs are also closely related to Chebyshev polynomials, and fast DCT algorithms (below) are used in Chebyshev approximation of arbitrary functions by series of Chebyshev polynomials, for example in Clenshaw–Curtis quadrature. In mathematics the Chebyshev polynomials, named after Pafnuty Chebyshev, are a sequence of orthogonal polynomials which are related to de Moivres formula and which are easily defined recursively, like Fibonacci or Lucas numbers. ... In mathematics, approximation theory is concerned with how functions can best be approximated with simpler functions, and with quantitatively characterising the errors introduced thereby. ...


JPEG

Main article: JPEG: Discrete cosine transform

The DCT is used in JPEG image compression, MJPEG, MPEG, and DV video compression. There, the two-dimensional DCT-II of N times N blocks are computed and the results are quantized and entropy coded. In this case, N is typically 8 and the DCT-II formula is applied to each row and column of the block. The result is an 8 × 8 transform coefficient array in which the (0,0) element (top-left) is the DC (zero-frequency) component and entries with increasing vertical and horizontal index values represent higher vertical and horizontal spatial frequencies. JPG redirects here. ... JPG redirects here. ... Motion JPEG (M-JPEG) is an informal name for multimedia formats where each video frame or interlaced field of a digital video sequence is separately compressed as a JPEG image. ... The Moving Picture Experts Group or MPEG is a working group of ISO/IEC charged with the development of video and audio encoding standards. ... A MiniDV Camcorder For other uses, see DV (disambiguation). ... Quantized signal Digital signal In digital signal processing, quantization is the process of approximating a continuous range of values (or a very large set of possible discrete values) by a relatively-small set of discrete symbols or integer values. ... In information theory an entropy encoding is a data compression scheme that assigns codes to symbols so as to match code lengths with the probabilities of the symbols. ...


Informal overview

Like any Fourier-related transform, discrete cosine transforms (DCTs) express a function or a signal in terms of a sum of sinusoids with different frequencies and amplitudes. Like the discrete Fourier transform (DFT), a DCT operates on a function at a finite number of discrete data points. The obvious distinction between a DCT and a DFT is that the former uses only cosine functions, while the latter uses both cosines and sines (in the form of complex exponentials). However, this visible difference is merely a consequence of a deeper distinction: a DCT implies different boundary conditions than the DFT or other related transforms. Sine redirects here. ... Sine waves of various frequencies; the lower waves have higher frequencies than those above. ... For quantum-mechanical amplitude, see probability amplitude. ... In mathematics, the discrete Fourier transform (DFT), occasionally called the finite Fourier transform, is a transform for Fourier analysis of finite-domain discrete-time signals. ... Eulers formula, named after Leonhard Euler (pronounced oiler), is a mathematical formula in the subfield of complex analysis that shows a deep relationship between the trigonometric functions and the complex exponential function. ... In mathematics, boundary conditions are imposed on the solutions of ordinary differential equations and partial differential equations, to fit the solutions to the actual problem. ...


The Fourier-related transforms that operate on a function over a finite domain, such as the DFT or DCT or a Fourier series, can be thought of as implicitly defining an extension of that function outside the domain. That is, once you write a function f(x) as a sum of sinusoids, you can evaluate that sum at any x, even for x where the original f(x) was not specified. The DFT, like the Fourier series, implies a periodic extension of the original function. A DCT, like a cosine transform, implies an even extension of the original function. In mathematics, the domain of a function is the set of all input values to the function. ... The Fourier series is a mathematical tool used for analyzing periodic functions by decomposing such a function into a weighted sum of much simpler sinusoidal component functions sometimes referred to as normal Fourier modes, or simply modes for short. ... In mathematics, a periodic function is a function that repeats its values after some definite period has been added to its independent variable. ... It has been suggested that this article or section be merged with Sine transform. ... In mathematics, even functions and odd functions are functions which satisfy particular symmetry relations, with respect to taking additive inverses. ...

Illustration of the implicit even/odd extensions of DCT input data, for N=11 data points (red dots), for the four most common types of DCT (types I-IV).
Illustration of the implicit even/odd extensions of DCT input data, for N=11 data points (red dots), for the four most common types of DCT (types I-IV).

However, because DCTs operate on finite, discrete sequences, two issues arise that do not for the continuous cosine transform. First, one has to specify whether the function is even or odd at both the left and right boundaries of the domain (i.e. the min-n and max-n boundaries in the definitions below, respectively). Second, one has to specify around what point the function is even or odd. In particular, consider a sequence abcd of four equally spaced data points, and say that we specify an even left boundary. There are two sensible possibilities: either the data is even about the sample a, in which case the even extension is dcbabcd, or the data is even about the point halfway between a and the previous point, in which case the even extension is dcbaabcd (a is repeated).


These choices lead to all the standard variations of DCTs and also discrete sine transforms (DSTs). Each boundary can be either even or odd (2 choices per boundary) and can be symmetric about a data point or the point halfway between two data points (2 choices per boundary), for a total of 2times2times2times2=16 possibilities. Half of these possibilities, those where the left boundary is even, correspond to the 8 types of DCT; the other half are the 8 types of DST. In mathematics, the discrete sine transform (DST) is a Fourier-related transform similar to the discrete Fourier transform (DFT), but with one additional property: If the input consists of only real numbers, so will the output. ...


These different boundary conditions strongly affect the applications of the transform, and lead to uniquely useful properties for the various DCT types. Most directly, when using Fourier-related transforms to solve partial differential equations by spectral methods, the boundary conditions are directly specified as a part of the problem being solved. Or, for the MDCT (based on the type-IV DCT), the boundary conditions are intimately involved in the MDCT's critical property of time-domain aliasing cancellation. In a more subtle fashion, the boundary conditions are responsible for the "energy compaction" properties that make DCTs useful for image and audio compression, because the boundaries affect the rate of convergence of any Fourier-like series. In mathematics, a partial differential equation (PDE) is a relation involving an unknown function of several independent variables and its partial derivatives with respect to those variables. ... In applied mathematics, Spectral methods are algorithms to solve certain kinds of partial differential equations numerically using some sort of Fast Fourier Transform. ... The modified discrete cosine transform (MDCT) is a Fourier-related transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last...


In particular, it is well known that any discontinuities in a function reduce the rate of convergence of the Fourier series, so that more sinusoids are needed to represent the function with a given accuracy. The same principle governs the usefulness of the DFT and other transforms for signal compression: the smoother a function is, the fewer terms in its DFT or DCT are required to represent it accurately, and the more it can be compressed. (Here, we think of the DFT or DCT as approximations for the Fourier series or cosine series of a function, respectively, in order to talk about its "smoothness.") However, the implicit periodicity of the DFT means that discontinuities usually occur at the boundaries: any random segment of a signal is unlikely to have the same value at both the left and right boundaries. (A similar problem arises for the DST, in which the odd left boundary condition implies a discontinuity for any function that does not happen to be zero at that boundary.) In contrast, a DCT where both boundaries are even always yields a continuous extension at the boundaries (although the slope is generally discontinuous). This is why DCTs, and in particular DCTs of types I, II, V, and VI (the types that have two even boundaries) generally perform better for signal compression than DFTs and DSTs. In practice, a type-II DCT is usually preferred for such applications, in part for reasons of computational convenience. Continuous functions are of utmost importance in mathematics and applications. ... In numerical analysis (a branch of mathematics), the speed at which a convergent sequence approaches its limit is called the rate of convergence. ... The Fourier series is a mathematical tool used for analyzing periodic functions by decomposing such a function into a weighted sum of much simpler sinusoidal component functions sometimes referred to as normal Fourier modes, or simply modes for short. ... The Fourier series is a mathematical tool used for analyzing an arbitrary periodic function by decomposing it into a weighted sum of much simpler sinusoidal component functions sometimes referred to as normal Fourier modes, or simply modes for short. ... This article is about the mathematical term. ...


Formal definition

Formally, the discrete cosine transform is a linear, invertible function F : RN -> RN (where R denotes the set of real numbers), or equivalently an invertible N × N square matrix. There are several variants of the DCT with slightly modified definitions. The N real numbers x0, ..., xN-1 are transformed into the N real numbers X0, ..., XN-1 according to one of the formulas: For other uses, see Linear (disambiguation). ... This article is about functions in mathematics. ... In mathematics, the real numbers may be described informally as numbers that can be given by an infinite decimal representation, such as 2. ... For the square matrix section, see square matrix. ...


DCT-I

X_k = frac{1}{2} (x_0 + (-1)^k x_{N-1}) + sum_{n=1}^{N-2} x_n cos left[frac{pi}{N-1} n k right] quad quad k = 0, dots, N-1.

Some authors further multiply the x0 and xN-1 terms by √2, and correspondingly multiply the X0 and XN-1 terms by 1/√2. This makes the DCT-I matrix orthogonal, if one further multiplies by an overall scale factor of sqrt{2/(N-1)}, but breaks the direct correspondence with a real-even DFT. In matrix theory, a real orthogonal matrix is a square matrix Q whose transpose is its inverse: // Overview An orthogonal matrix is the real specialization of a unitary matrix, and thus always a normal matrix. ...


The DCT-I is exactly equivalent (up to an overall scale factor of 2), to a DFT of 2N − 2 real numbers with even symmetry. For example, a DCT-I of N=5 real numbers abcde is exactly equivalent to a DFT of eight real numbers abcdedcb (even symmetry), divided by two. (In contrast, DCT types II-IV involve a half-sample shift in the equivalent DFT.)


Note, however, that the DCT-I is not defined for N less than 2. (All other DCT types are defined for any positive N.)


Thus, the DCT-I corresponds to the boundary conditions: xn is even around n=0 and even around n=N-1; similarly for Xk.


DCT-II

X_k = sum_{n=0}^{N-1} x_n cos left[frac{pi}{N} left(n+frac{1}{2}right) k right] quad quad k = 0, dots, N-1.

The DCT-II is probably the most commonly used form, and is often simply referred to as "the DCT".


This transform is exactly equivalent (up to an overall scale factor of 2) to a DFT of 4N real inputs of even symmetry where the even-indexed elements are zero. That is, it is half of the DFT of the 4N inputs yn, where y2n = 0, y2n + 1 = xn for 0 leq n < N, and y4Nn = yn for 0 < n < 2N.


Some authors further multiply the X0 term by 1/√2 (see below for the corresponding change in DCT-III). This makes the DCT-II matrix orthogonal, if one further multiplies by an overall scale factor of sqrt{2/N}, but breaks the direct correspondence with a real-even DFT of half-shifted input. In matrix theory, a real orthogonal matrix is a square matrix Q whose transpose is its inverse: // Overview An orthogonal matrix is the real specialization of a unitary matrix, and thus always a normal matrix. ...


The DCT-II implies the boundary conditions: xn is even around n=-1/2 and even around n=N-1/2; Xk is even around k=0 and odd around k=N.


DCT-III

X_k = frac{1}{2} x_0 + sum_{n=1}^{N-1} x_n cos left[frac{pi}{N} n left(k+frac{1}{2}right) right] quad quad k = 0, dots, N-1.

Because it is the inverse of DCT-II (up to a scale factor, see below), this form is sometimes simply referred to as "the inverse DCT" ("IDCT").


Some authors further multiply the x0 term by 1/√2 (see above for the corresponding change in DCT-II), so that the DCT-II and DCT-III are transposes of one another. This makes the DCT-III matrix orthogonal, if one further multiplies by an overall scale factor of sqrt{2/N}, but breaks the direct correspondence with a real-even DFT of half-shifted output. In matrix theory, a real orthogonal matrix is a square matrix Q whose transpose is its inverse: // Overview An orthogonal matrix is the real specialization of a unitary matrix, and thus always a normal matrix. ...


The DCT-III implies the boundary conditions: xn is even around n=0 and odd around n=N; Xk is even around k=-1/2 and even around k=N-1/2.


DCT-IV

X_k = sum_{n=0}^{N-1} x_n cos left[frac{pi}{N} left(n+frac{1}{2}right) left(k+frac{1}{2}right) right] quad quad k = 0, dots, N-1.

The DCT-IV matrix becomes orthogonal if one further multiplies by an overall scale factor of sqrt{2/N}. In matrix theory, a real orthogonal matrix is a square matrix Q whose transpose is its inverse: // Overview An orthogonal matrix is the real specialization of a unitary matrix, and thus always a normal matrix. ...


A variant of the DCT-IV, where data from different transforms are overlapped, is called the modified discrete cosine transform (MDCT). modified discrete cosine transform (MDCT) is a Fourier-related transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last half...


The DCT-IV implies the boundary conditions: xn is even around n=-1/2 and odd around n=N-1/2; similarly for Xk.


DCT V-VIII

DCT types I-IV are equivalent to real-even DFTs of even order (regardless of whether N is even or odd), since the corresponding DFT is of length 2(N−1) (for DCT-I) or 4N (for DCT-II/III) or 8N (for DCT-VIII). In principle, there are actually four additional types of discrete cosine transform (Martucci, 1994), corresponding essentially to real-even DFTs of logically odd order, which have factors of N pm 1/2 in the denominators of the cosine arguments.


Equivalently, DCTs of types I-IV imply boundaries that are even/odd around either a data point for both boundaries or halfway between two data points for both boundaries. DCTs of types V-VIII imply boundaries that even/odd around a data point for one boundary and halfway between two data points for the other boundary.


However, these variants seem to be rarely used in practice. One reason, perhaps, is that FFT algorithms for odd-length DFTs are generally more complicated than FFT algorithms for even-length DFTs (e.g. the simplest radix-2 algorithms are only for even lengths), and this increased intricacy carries over to the DCTs as described below.


(The trivial real-even array, a length-one DFT (odd length) of a single number a, corresponds to a DCT-V of length N=1.)


Inverse transforms

The inverse of DCT-I is DCT-I multiplied by 2/(N-1). The inverse of DCT-IV is DCT-IV multiplied by 2/N. The inverse of DCT-II is DCT-III multiplied by 2/N (and vice versa).


Like for the DFT, the normalization factor in front of these transform definitions is merely a convention and differs between treatments. For example, some authors multiply the transforms by sqrt{2/N} so that the inverse does not require any additional multiplicative factor. Combined with appropriate factors of √2 (see above), this can be used to make the transform matrix orthogonal. In mathematics, the discrete Fourier transform (DFT), occasionally called the finite Fourier transform, is a transform for Fourier analysis of finite-domain discrete-time signals. ... In matrix theory, a real orthogonal matrix is a square matrix Q whose transpose is its inverse: // Overview An orthogonal matrix is the real specialization of a unitary matrix, and thus always a normal matrix. ...


Multidimensional DCTs

Multidimensional variants of the various DCT types follow straightforwardly from the one-dimensional definitions: they are simply a separable product (equivalently, a composition) of DCTs along each dimension.


For example, a two-dimensional DCT-II of an image or a matrix is simply the one-dimensional DCT-II, from above, performed along the rows and then along the columns (or vice versa). That is, the 2d DCT-II is given by the formula (omitting normalization and other scale factors, as above):

X_{k_1,k_2} = sum_{n_1=0}^{N_1-1} sum_{n_2=0}^{N_2-1} x_{n_1,n_2} cos left[frac{pi}{N_1} left(n_1+frac{1}{2}right) k_1 right] cos left[frac{pi}{N_2} left(n_2+frac{1}{2}right) k_2 right] .

Technically, computing a two- (or multi-) dimensional DCT by sequences of one-dimensional DCTs along each dimension is known as a row-column algorithm (after the two-dimensional case). As with multidimensional FFT algorithms, however, there exist other methods to compute the same thing while performing the computations in a different order (i.e. interleaving/combining the algorithms for the different dimensions).`` The Fast Fourier Transform (FFT) is an efficient algorithm to compute the discrete Fourier transform (DFT) and its inverse. ...

Two-dimensional DCT frequencies
Two-dimensional DCT frequencies

The image to the right shows combination of horizontal and vertical frequencies for an 8 x 8 (N1 = N2 = 8) two-dimensional DCT. Each step from left to right and top to bottom is an increase in frequency by 1/2 cycle. For example, moving right one from the top-left square yields a half-cycle increase in the horizontal frequency (goes from white to black). Another move to the right yields two half-cycles (white to black to white). A move down yields two half-cycles horizontally and a half-cycle vertically. The source data (8x8) is transformed to a linear combination of these 64 frequency squares. Image File history File links Dctjpeg. ... Image File history File links Dctjpeg. ... In mathematics, linear combinations are a concept central to linear algebra and related fields of mathematics. ...


Computation

Although the direct application of these formulas would require O(N2) operations, it is possible to compute the same thing with only O(N log N) complexity by factorizing the computation similarly to the fast Fourier transform (FFT). One can also compute DCTs via FFTs combined with O(N) pre- and post-processing steps. The Fast Fourier Transform (FFT) is an efficient algorithm to compute the discrete Fourier transform (DFT) and its inverse. ...


The most efficient algorithms, in principle, are usually those that are specialized directly for the DCT, as opposed to using an ordinary FFT plus O(N) extra operations (see below for an exception). However, even "specialized" DCT algorithms (including all of those that achieve the lowest known arithmetic counts, at least for power-of-two sizes) are typically closely related to FFT algorithms—since DCTs are essentially DFTs of real-even data, one can design a fast DCT algorithm by taking an FFT and eliminating the redundant operations due to this symmetry. This can even be done automatically (Frigo & Johnson, 2005). Algorithms based on the Cooley-Tukey FFT algorithm are most common, but any other FFT algorithm is also applicable. For example, the Winograd FFT algorithm leads to minimal-multiplication algorithms for the DFT, albeit generally at the cost of more additions, and a similar algorithm was proposed by Feig & Winograd (1992) for the DCT. Because the algorithms for DFTs, DCTs, and similar transforms are all so closely related, any improvement in algorithms for one transform will theoretically lead to immediate gains for the other transforms as well (Duhamel & Vetterli, 1990). In mathematics, a power of two is any of the nonnegative integer powers of the number two; in other words, two times itself a certain number of times. ... The Cooley-Tukey algorithm, named after J.W. Cooley and John Tukey, is the most common fast Fourier transform (FFT) algorithm. ...


While DCT algorithms that employ an unmodified FFT often have some theoretical overhead compared to the best specialized DCT algorithms, the former also have a distinct advantage: highly optimized FFT programs are widely available. Thus, in practice, it is often easier to obtain high performance for general lengths N with FFT-based algorithms. (Performance on modern hardware is typically not dominated simply by arithmetic counts, and optimization requires substantial engineering effort.) Specialized DCT algorithms, on the other hand, see widespread use for transforms of small, fixed sizes such as the 8 times 8 DCT-II used in JPEG compression, or the small DCTs (or MDCTs) typically used in audio compression. (Reduced code size may also be a reason to use a specialized DCT for embedded-device applications.) JPG redirects here. ...


In fact, even the DCT algorithms using an ordinary FFT are sometimes equivalent to pruning the redundant operations from a larger FFT of real-symmetric data, and they can even be optimal from the perspective of arithmetic counts. For example, a type-II DCT is equivalent to a DFT of size 4N with real-even symmetry whose even-indexed elements are zero. One of the most common methods for computing this via an FFT (e.g. the method used in FFTPACK and FFTW) is due to Makhoul (1980), and this method in hindsight can be seen as one step of a radix-4 decimation-in-time Cooley-Tukey algorithm applied to the "logical" real-even DFT corresponding to the DCT II. (The radix-4 step reduces the size 4N DFT to four size-N DFTs of real data, two of which are zero and two of which are equal to one another by the even symmetry, hence giving a single size-N FFT of real data plus O(N) butterflies.) Because the even-indexed elements are zero, this radix-4 step is exactly the same as a split-radix step; if the subsequent size-N real-data FFT is also performed by a real-data split-radix algorithm (as in Sorensen et al., 1987), then the resulting algorithm actually matches the lowest published arithmetic count for the power-of-two DCT-II (2Nlog2NN + 2 real-arithmetic operations[1]). So, there is nothing intrinsically bad about computing the DCT via an FFT from an arithmetic perspective—it is sometimes merely a question of whether the corresponding FFT algorithm is optimal. (As a practical matter, the function-call overhead in invoking a separate FFT routine might be significant for small N, but this is an implementation rather than an algorithmic question since it can be solved by unrolling/inlining.) FFTW, for Fastest Fourier Transform in the West, is a software library for computing discrete Fourier transforms (DFTs), developed by Matteo Frigo and Steven G. Johnson at the MIT Laboratory for Computer Science. ... Data-flow diagram connecting the inputs x (left) to the outputs y that depend on them (right) for a butterfly step of a radix-2 Cooley-Tukey FFT. This diagram resembles a butterfly (as in the Morpho butterfly shown for comparison), hence the name. ... The split-radix FFT is a fast Fourier transform (FFT) algorithm for computing the discrete Fourier transform (DFT), and was first described in an obscure paper by R. Yavne (1968) and subsequently rediscovered simultaneously by various authors in 1984. ...


Notes

  1. ^ The precise count of real arithmetic operations, and in particular the count of real multiplications, depends somewhat on the scaling of the transform definition. The 2Nlog2NN + 2 count is for the DCT-II definition shown here; two multiplications can be saved if the transform is scaled by an overall sqrt2 factor. Additional multiplications can be saved if one permits the outputs of the transform to be rescaled individually, as was shown by Arai et al. (1988) for the size-8 case used in JPEG.

References

  • N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete Cosine Transform", IEEE Trans. Computers, 90-93, Jan 1974.
  • Y. Arai, T. Agui, and M. Nakajima, "A fast DCT-SQ scheme for images," Trans. IEICE 71 (11), 1095–1097 (1988).
  • P. Duhamel and M. Vetterli, "Fast Fourier transforms: a tutorial review and a state of the art," Signal Processing 19, 259–299 (1990).
  • E. Feig, S. Winograd. "Fast algorithms for the discrete cosine transform," IEEE Transactions on Signal Processing 40 (9), 2174-2193 (1992).
  • Matteo Frigo and Steven G. Johnson: FFTW, http://www.fftw.org/. A free (GPL) C library that can compute fast DCTs (types I-IV) in one or more dimensions, of arbitrary size. Also M. Frigo and S. G. Johnson, "The Design and Implementation of FFTW3," Proceedings of the IEEE 93 (2), 216–231 (2005).
  • John Makhoul, "A fast cosine transform in one and two dimensions," IEEE Trans. Acoust. Speech Sig. Proc. 28 (1), 27-34 (1980).
  • S. A. Martucci, "Symmetric convolution and the discrete sine and cosine transforms," IEEE Trans. Sig. Processing SP-42, 1038-1051 (1994).
  • A. V. Oppenheim, R. W. Schafer, and J. R. Buck, Discrete-Time Signal Processing, second edition (Prentice-Hall, New Jersey, 1999).
  • K. R. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications (Academic Press, Boston, 1990).
  • H. V. Sorensen, D. L. Jones, M. T. Heideman, and C. S. Burrus, "Real-valued fast Fourier transform algorithms," IEEE Trans. Acoust. Speech Sig. Processing ASSP-35, 849–863 (1987).

Image File history File links This is a lossless scalable vector image. ... GPL redirects here. ... K. R. Rao is a full professor of electrical engineering at The University of Texas at Arlington, UT Arlington. ...

See also

  • JPEG - Contains an easier to understand example of DCT transformation

JPG redirects here. ...

External links

PlanetMath is a free, collaborative, online mathematics encyclopedia. ... Source coding redirects here. ... Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. ... Not to be confused with information technology, information science, or informatics. ... Claude Shannon In information theory, the Shannon entropy or information entropy is a measure of the uncertainty associated with a random variable. ... In computer science, the Kolmogorov complexity (also known as descriptive complexity, Kolmogorov-Chaitin complexity, stochastic complexity, algorithmic entropy, or program-size complexity) of an object such as a piece of text is a measure of the computational resources needed to specify the object. ... Redundancy in information theory is the number of bits used to transmit a message minus the number of bits of actual information in the message. ... In information theory an entropy encoding is a data compression scheme that assigns codes to symbols so as to match code lengths with the probabilities of the symbols. ... In computer science and information theory, Huffman coding is an entropy encoding algorithm used for lossless data compression. ... Adaptive Huffman coding is an adaptive coding technique based on Huffman coding, building the code as the symbols are being transmitted, having no initial knowledge of source distribution, that allows one-pass encoding and adaptation to changing conditions in data. ... The introduction to this article provides insufficient context for those unfamiliar with the subject matter. ... In the field of data compression, Shannon-Fano coding is a technique for constructing a prefix code based on a set of symbols and their probabilities (estimated or measured). ... Range encoding is a form of arithmetic coding, a data compression method, that is believed to be free from arithmetic coding related patents. ... Golomb coding is a form of entropy encoding invented by Solomon W. Golomb that is optimal for alphabets following geometric distributions, that is, when small values are vastly more common than large values. ... An Exponential-Golomb code (or just Exp-Golomb code) of order is a type of universal code, parameterized by a whole number . ... Fibonacci, Elias Gamma, and Elias Delta vs binary coding Rice with k=2,3,4,5,8,16 vs binary In data compression, a universal code for integers is a prefix code that maps the positive integers onto binary codewords, with the additional property that whatever the true probability distribution... Elias gamma code is a universal code encoding positive integers. ... The Fibonacci code is a universal code which encodes positive integers into binary code words. ... A dictionary coder, also sometimes known as a substitution coder, is any of a number of data compression algorithms which operate by searching for matches between the text to be compressed and a set of strings contained in a data structure (called the dictionary) maintained by the encoder. ... Run-length encoding (RLE) is a very simple form of data compression in which runs of data (that is, sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original run. ... LZ77 and LZ78 are the names for the two lossless data compression algorithms published in papers by Abraham Lempel and Jacob Ziv in 1977 and 1978. ... LZW (Lempel-Ziv-Welch) is a lossless data compression algorithm. ... LZWL is a syllable-based variant of the character-based LZW compression algorithm. ... Lempel-Ziv-Oberhumer (LZO) is a data compression algorithm that is focused on decompression speed. ... DEFLATE is a lossless data compression algorithm that uses a combination of the LZ77 algorithm and Huffman coding. ... LZX is the name of an LZ77 family compression algorithm. ... LZJB is the name for the lossless data compression algorithm invented by Jeff Bonwick to compress crash dumps and data in ZFS. LZJB source code Categories: | ... The context tree weighting method (CTW) is a lossless compression and prediction algorithm by . ... The Burrows-Wheeler transform (BWT, also called block-sorting compression), is an algorithm used in data compression techniques such as bzip2. ... PPM is an adaptive statistical data compression technique based on context modeling and prediction. ... Dynamic Markov Compression (DMC) is a lossless data compression algorithm developed by Gordon Cormack and Nigel Horspool [1]. It uses predictive arithmetic coding similar to prediction by partial matching (PPM), except that the input is predicted one bit at a time (rather than one byte at a time). ... Audio compression is a form of data compression designed to reduce the size of audio files. ... Acoustics is the interdisciplinary sciences that always deals with the study of sound, ultrasound and infrasound (all mechanical waves in gases, liquids, and solids). ... In mathematics and, in particular, functional analysis, convolution is a mathematical operator which takes two functions f and g and produces a third function that in a sense represents the amount of overlap between f and a reversed and translated version of g. ... In signal processing, sampling is the reduction of a continuous signal to a discrete signal. ... The Nyquist–Shannon sampling theorem is a fundamental result in the field of information theory, in particular telecommunications and signal processing. ... An audio codec is a computer program that compresses/decompresses digital audio data according to a given audio file format or streaming audio format. ... It has been suggested that this article or section be merged with Code Excited Linear Prediction. ... Log Area Ratios (LAR) can be used to represent Reflection Coefficients (another from for Linear Prediction Coefficients) for transmission over a channel. ... Line Spectral Pairs (LSP) are used to represent Linear Prediction Coefficients (LPC) for transmission over a channel. ... Warped Linear Predictive Coding (Warped LPC or WLPC) is a variant of Linear predictive coding in which the spectral representation of the system is modified, for example by replacing the unit delays used in an LPC implementation with first-order allpass filters. ... Code Excited Linear Prediction (CELP) is a speech coding algorithm originally proposed by M.R. Schroeder and B.S. Atal in 1985. ... Algebraic Code Excited Linear Prediction or ACELP is a speech encoding algorithm where a limited set of pulses is distributed as excitation to linear prediction filter. ... Graph of μ-law & A-law algorithms An a-law algorithm is a standard companding algorithm, used in European digital communications systems to optimize, modify, the dynamic range of an analog signal for digitizing. ... In telecommunication, a mu-law algorithm (μ-law) is a standard analog signal compression or companding algorithm, used in digital communications systems of the North American and Japanese digital hierarchies, to optimize (in other words, modify) the dynamic range of an audio analog signal prior to digitizing. ... modified discrete cosine transform (MDCT) is a Fourier-related transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last half... In mathematics, the Fourier transform is a certain linear operator that maps functions to other functions. ... Psychoacoustics is the study of subjective human perception of sounds. ... Audio level compression, also called dynamic range compression, volume compression, compression, limiting, or DRC (often seen in DVD player settings) is a process that manipulates the dynamic range of an audio signal. ... Speech coding is the compression of speech (into a code) for transmission with speech codecs that use audio signal processing and speech processing techniques. ... Sub-band coding is any form of transform coding that breaks a signal into a number of different frequency bands and encodes each one independently. ... Image compression is the application of Data compression on digital images. ... A comparison of different color spaces. ... This article is about the picture element. ... In digital image processing, chroma subsampling is the use of lower resolution for the colour (chroma) information in an image than for the brightness (intensity or luma) information. ... A compression artifact (or artefact) is the result of an aggressive data compression scheme applied to an image, audio, or video that discards some data which is determined by an algorithm to be of lesser importance to the overall content but which is nonetheless discernible and objectionable to the user. ... Run-length encoding (RLE) is a very simple form of data compression in which runs of data (that is, sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original run. ... Fractal compression is a lossy compression method used to compress images using fractals. ... Wavelet compression is a form of data compression well suited for image compression (sometimes also video compression and audio compression). ... Set Partitioning in Hierarchical Trees (SPIHT) is an image compression algorithm that exploits the inherent similarities across subbands in a wavelet decomposition of an image. ... In statistics, principal components analysis (PCA) is a technique that can be used to simplify a dataset; more formally it is a linear transformation that chooses a new coordinate system for the data set such that the greatest variance by any projection of the data set comes to lie on... In telecommunications and computing, bit rate (sometimes written bitrate) is the frequency at which bits are passing a given (physical or metaphorical) point. It is quantified using the bit per second (bit/s) unit. ... In order to intuitively test the effects of an image-processing algorithm on a natural picture a number of test images are in common use in the image-processing field. ... The phrase peak signal-to-noise ratio, often abbreviated PSNR, is an engineering term for the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation. ... Quantization, involved in image processing. ... Video compression refers to making a digital video signal use less data, without noticeably reducing the quality of the picture. ... For other uses, see Video (disambiguation). ... It has been suggested that video frame be merged into this article or section. ... The three major picture types found in typical video compression designs are I(ntra) pictures, P(redicted) pictures, and B(i-predictive) pictures (or B(i-directional) pictures). ... Video quality is a characteristic of video passed through a video processing system. ... A video codec is a device or software that enables video compression and or decompression for digital video. ... The introduction to this article provides insufficient context for those unfamiliar with the subject matter. ... Quantized signal Digital signal In digital signal processing, quantization is the process of approximating a continuous range of values (or a very large set of possible discrete values) by a relatively-small set of discrete symbols or integer values. ... A video codec is a device or software that enables video compression and or decompression for digital video. ... Rate distortion theory is the branch of information theory addressing the problem of determining the minimal amount of entropy (or information) R that should be communicated over a channel such that the source (input signal) can be reconstructed at the receiver (output signal) with given distortion D. As such, rate... Constant bit rate (CBR) is a term used in telecommunications, relating to the quality of service. ... Average bit rate refers to the average amount of data transferred per second. ... Variable bit rate (VBR) is a term used in telecommunications and computing that relates to sound or video quality. ... A timeline of events related to information theory, data compression, error correcting codes and related subjects. ...

  Results from FactBites:
 
PlanetMath: discrete cosine transform (282 words)
The discrete cosine transforms (DCT) are a family of similar transforms closely related to the discrete sine transform and the discrete Fourier transform.
This is version 14 of discrete cosine transform, born on 2002-01-13, modified 2007-08-22.
DCT for 2 dimension, summation underscript, n=1 by mbutterfield on 2005-11-01 21:19:55
Discrete cosine transform signal processor - Patent 3920974 (1682 words)
The discrete cosine transform may be interpreted as a discrete Fourier transform of a symmetrized version of the image data block.
This invention implements a discrete cosine transform of length N using only filters with 2N-1 taps, thus either reducing the filter length required or permitting a longer block to be transformed with filters of a given length.
The discrete cosine transform of an input signal may be computed in prior art by symmetrizing the input signal, storing it in a memory, and computing the discrete Fourier transform of the resultant signal.
  More results at FactBites »


 

COMMENTARY     


Share your thoughts, questions and commentary here
Your name
Your comments
Please enter the 5-letter protection code

Want to know more?
Search encyclopedia, statistics and forums:

 


Lesson Plans | Student Area | Student FAQ | Reviews | Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.