FACTOID # 36: Women are flooding into the workforce in many Muslim countries.
 
 Home   Encyclopedia   Statistics   Countries A-Z   Flags   Maps   Education   Forum   FAQ   About 
 
WHAT'S NEW
RECENT ARTICLES
More Recent Articles »
 

FACTS & STATISTICS    Simple view

  1. Select countries to view: (hold down Control key and click to select several)

     

     

    Compare:

     

     

  1. Select fact or statistic: (* = graphable)

     

     

     

  2. (OPTIONAL) Compare to statistic: (both need to be graphable)

     

     

     

  3. View result as:

     

       
(OR) SEARCH ALL encyclopedia, stats & forums:   

Encyclopedia > Modified discrete cosine transform

modified discrete cosine transform (MDCT) is a Fourier-related transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next block. This overlapping, in addition to the energy-compaction qualities of the DCT, makes the MDCT especially attractive for signal compression applications, since it helps to avoid artifacts stemming from the block boundaries. Thus, an MDCT is employed in MP3, AC-3, Ogg Vorbis, and AAC for audio compression, for example. This is a list of linear transformations of functions related to the Fourier transform. ... 2-D DCT compared to the DFT The discrete cosine transform (DCT) is a Fourier-related transform similar to the discrete Fourier transform (DFT), but using only real numbers. ... A compression artifact (or artefact) is the result of an aggressive data compression scheme applied to an image, audio, or video that discards some data which is determined by an algorithm to be of lesser importance to the overall content but which is nonetheless discernible and objectionable to the user. ... MPEG-1 Audio Layer 3, more commonly referred to as MP3, is a popular audio encoding format. ... Description Dolby Digital is the trademark for Dolby Laboratories AC-3 audio coding system. ... This page is about the audio compression codec. ... Advanced Audio Coding (AAC) is a standardized, lossy compression and encoding scheme for digital audio. ... Audio compression can mean two things: Audio data compression - in which the amount of data in a recorded waveform is reduced for transmission. ...


(There also exists an analogous transform, the MDST, based on the discrete sine transform, as well as other, rarely used, forms of the MDCT based on different types of DCT.) In mathematics, the discrete sine transform (DST) is a Fourier-related transform similar to the discrete Fourier transform (DFT), but with one additional property: If the input consists of only real numbers, so will the output. ...


In MP3, the MDCT is not applied to the audio signal directly, but rather to the output of a 32-band polyphase quadrature filter (PQF) bank. The output of this MDCT is postprocessed by an alias reduction formula to reduce the typical aliasing of the PQF filter bank. Such a combination of a filter bank with an MDCT is called a hybrid filter bank or a subband MDCT. AAC, on the other hand, normally uses a pure MDCT; only the (rarely used) MPEG-4 AAC-SSR variant (by Sony) uses a four-band PQF bank followed by an MDCT. ATRAC uses stacked quadrature mirror filters (QMF) followed by an MDCT. A polyphase quadrature filter, or PQF is a filter bank, which splits an input signal into a given number N (mostly a power of 2) of equidistant sub-bands. ... MPEG-4 AAC-SSR or MPEG-4 Advanced Audio Coding - Scalable Sample Rate was introduced by Sony to the MPEG-4 standard. ... Sony Corporation ) is a Japanese multinational corporation and one of the worlds largest media conglomerates with revenue of $68. ... ATRAC (Adaptive TRansform Acoustic Coding) is a family of proprietary audio compression algorithms used to store information on MiniDiscs and other Sony-branded audio players. ... In digital signal processing, a quadrature mirror filter is a filter bank which splits an input signal into two bands which are usually then subsampled by a factor of 2. ...

Contents

Definition

As a lapped transform, the MDCT is a bit unusual compared to other Fourier-related transforms in that it has half as many outputs as inputs (instead of the same number). In particular, it is a linear function F : R2N -> RN (where R denotes the set of real numbers). The 2N real numbers x0, ..., x2N-1 are transformed into the N real numbers X0, ..., XN-1 according to the formula: The word linear comes from the Latin word linearis, which means created by lines. ... Graph of example function, The mathematical concept of a function expresses the intuitive idea of deterministic dependence between two quantities, one of which is viewed as primary (the independent variable, argument of the function, or its input) and the other as secondary (the value of the function, or output). A... In mathematics, the real numbers may be described informally as numbers that can be given by an infinite decimal representation, such as 2. ...

X_k = sum_{n=0}^{2N-1} x_n cos left[frac{pi}{N} left(n+frac{1}{2}+frac{N}{2}right) left(k+frac{1}{2}right) right]

(The normalization coefficient in front of this transform, here unity, is an arbitrary convention and differs between treatments. Only the product of the normalizations of the MDCT and the IMDCT, below, is constrained.)


Inverse transform

The inverse MDCT is known as the IMDCT. Because there are different numbers of inputs and outputs, at first glance it might seem that the MDCT should not be invertible. However, perfect invertibility is achieved by adding the overlapped IMDCTs of subsequent overlapping blocks, causing the errors to cancel and the original data to be retrieved; this technique is known as time-domain aliasing cancellation (TDAC).


The IMDCT transforms N real numbers X0, ..., XN-1 into 2N real numbers y0, ..., y2N-1 according to the formula:

y_n = frac{1}{N} sum_{k=0}^{N-1} X_k cos left[frac{pi}{N} left(n+frac{1}{2}+frac{N}{2}right) left(k+frac{1}{2}right) right]

(Like for the DCT-IV, an orthogonal transform, the inverse has the same form as the forward transform.)


In the case of a windowed MDCT with the usual window normalization (see below), the normalization coefficient in front of the IMDCT should be multiplied by 2 (i.e., becoming 2/N).


Computation

Although the direct application of the MDCT formula would require O(N2) operations, it is possible to compute the same thing with only O(N log N) complexity by recursively factorizing the computation, as in the fast Fourier transform (FFT). One can also compute MDCTs via other transforms, typically a DFT (FFT) or a DCT, combined with O(N) pre- and post-processing steps. Also, as described below, any algorithm for the DCT-IV immediately provides a method to compute the MDCT and IMDCT of even size. The Fast Fourier Transform (FFT) is an efficient algorithm to compute the discrete Fourier transform (DFT) and its inverse. ...


Window functions

In typical signal-compression applications, the transform properties are further improved by using a window function wn (n = 0, ..., 2N-1) that is multiplied with xn and yn in the MDCT and IMDCT formulas, above, in order to avoid discontinuities at the n = 0 and 2N boundaries by making the function go smoothly to zero at those points. (That is, we window the data before the MDCT and after the IMDCT.) In principle, x and y could have different window functions, and the window function could also change from one block to the next (especially for the case where data blocks of different sizes are combined), but for simplicity we consider the common case of identical window functions for equal-sized blocks. In signal processing, a window function (or apodization function) is a function that is zero-valued outside of some chosen interval. ...


The transform remains invertible (that is, TDAC works), for a symmetric window wn = w2N-1-n, as long as w satisfies the Princen-Bradley condition:

w_n^2 + w_{n + N}^2 = 1.

Various different window functions are common, e.g.

w_n = sin left[frac{pi}{2N} left(n+frac{1}{2}right) right]

for MP3 and MPEG-2 AAC, and

w_n = sin left( frac{pi}{2} sin^2 left[frac{pi}{2N} left(n+frac{1}{2}right) right] right)

for Vorbis. AC-3 uses a Kaiser-Bessel derived (KBD) window, and MPEG-4 AAC can also use a KBD window. The Kaiser window is a nearly optimal window function wk used for digital signal processing, and is defined by the formula: Kaiser window function for n=100 and α= 0. ...


Note that windows applied to the MDCT are different from windows used for other types of signal analysis, since they must fulfill the Princen-Bradley condition. One of the reasons for this difference is that MDCT windows are applied twice, for both the MDCT (analysis) and the IMDCT (synthesis).


Relationship to DCT-IV and Origin of TDAC

As can be seen by inspection of the definitions, for even N the MDCT is essentially equivalent to a DCT-IV, where the input is shifted by N/2 and two N-blocks of data are transformed at once. By examining this equivalence more carefully, important properties like TDAC can be easily derived.


In order to define the precise relationship to the DCT-IV, one must realize that the DCT-IV corresponds to alternating even/odd boundary conditions: even at its left boundary (around n=–1/2), odd at its right boundary (around n=N–1/2), and so on (instead of periodic boundaries as for a DFT). This follows from the identities cos[(k+1/2)(-n-1+1/2)π/N] = +cos[(k+1/2)(n+1/2)π/N] and cos[(k+1/2)(2N-n-1+1/2)π/N] = -cos[(k+1/2)(n+1/2)π/N]. Thus, if its inputs are an array x of length N, we can imagine extending this array to (x, –xR, –x, xR, ...) and so on, where xR denotes x in reverse order. In mathematics, the discrete Fourier transform (DFT), occasionally called the finite Fourier transform, is a transform for Fourier analysis of finite-domain discrete-time signals. ...


Consider an MDCT with 2n inputs and n outputs, where we divide the inputs into four blocks (a, b, c, d) each of size N/2. If we shift these by N/2 (from the +N/2 term in the MDCT definition), then (b, c, d) extend past the end of the n DCT-IV inputs, so we must "fold" them back according to the boundary conditions described above.

Thus, the MDCT of 2N inputs (a, b, c, d) is exactly equivalent to a DCT-IV of the N inputs: (–cRd, abR), where R denotes reversal as above.

(In this way, any algorithm to compute the DCT-IV can be trivially applied to the MDCT.)


Similarly, the IMDCT formula above is precisely 1/2 of the (self) inverse DCT-IV, where the output is shifted by N/2 and extended (via the boundary conditions) to a length 2N. The inverse DCT-IV would simply give back the inputs (–cRd, abR) from above. When this is shifted and extended via the boundary conditions, one obtains:

IMDCT(MDCT(a, b, c, d)) = (abR, baR, c+dR, cR+d) / 2.

(Half of the IMDCT outputs are thus redundant.)


One can now understand how TDAC works. Suppose that one computes the MDCT of the subsequent, 50% overlapped, 2N block (c, d, e, f). The IMDCT will then yield, analogous to the above: (cdR, dcR, e+fR, eR+f) / 2. When this is added with the previous IMDCT result in the overlapping half, the reversed terms cancel and one obtains simply (c, d), recovering the original data.


Origin of TDAC

The origin of the term "time-domain aliasing cancellation" is now clear. The use of input data that extend beyond the boundaries of the logical DCT-IV causes the data to be aliased in exactly the same way that frequencies beyond the Nyquist frequency are aliased to lower frequencies, except that this aliasing occurs in the time domain instead of the frequency domain. Hence the combinations cdR and so on, which have precisely the right signs for the combinations to cancel when they are added. The Nyquist frequency, named after Harry Nyquist or the Nyquist–Shannon sampling theorem, is half the sampling frequency of a discrete signal processing system. ... Properly sampled image of brick wall. ...


For odd N (which are rarely used in practice), N/2 is not an integer so the MDCT is not simply a shift permutation of a DCT-IV. In this case, the additional shift by half a sample means that the MDCT/IMDCT becomes equivalent to the DCT-III/II, and the analysis is analogous to the above.


TDAC for the windowed MDCT

Above, the TDAC property was proved for the ordinary MDCT, showing that adding IMDCTs of subsequent blocks in their overlapping half recovers the original data. The derivation of this inverse property for the windowed MDCT is only slightly more complicated.


Recall from above that when (a,b,c,d) and (c,d,e,f) are MDCTed, IMDCTed, and added in their overlapping half, we obtain (c + dR,cR + d) / 2 + (cdR,dcR) / 2 = (c,d), the original data.


Now we suppose that we multiply both the MDCT inputs and the IMDCT outputs by a window function of length 2N. As above, we assume a symmetric window function, which is therefore of the form (w,z,zR,wR) where w and z are length-N/2 vectors and R denotes reversal as before. Then the Princen-Bradley condition can be written: w^2 + z_R^2 = (1,1,ldots), with the multiplications and additions performed elementwise, or equivalently w_R^2 + z^2 = (1,1,ldots) (reversing w and z).


Therefore, instead of MDCTing (a,b,c,d), we now MDCT (wa,zb,zRc,wRd) (with all multiplications performed elementwise). When this is IMDCTed and multiplied again by the window function, the last-N half becomes:

(z_R, w_R) cdot (z_R c+w d_R, z c_R + w_R d) = (z_R^2 c+wz_R d_R, w_R z c_R + w_R^2 d).

(Note that we no longer have the multiplication by 1/2, because the IMDCT normalization differs by a factor of 2 in the windowed case.)


Similarly, the windowed MDCT and IMDCT of (c,d,e,f) yields, in its first-N half:

(w,z) cdot (w c - z_R d_R, z d - w_R c_R) = (w^2 c - w z_R d_R, z^2 d - w_R z c_R).

When we add these two halves together, we obtain:

(z_R^2 c+wz_R d_R, w_R z c_R + w_R^2 d) + (w^2 c - w z_R d_R, z^2 d - w_R z c_R)
= left( [z_R^2 + w^2] c + [wz_R - wz_R] d_R, [ w_R^2 + z^2] d + [w_R z - w_R z] c_R right) = (c, d)

recovering the original data.


References

  • Henrique S. Malvar, Signal Processing with Lapped Transforms (Artech House: Norwood MA, 1992).
  • John P. Princen and Alan B. Bradley, "Analysis/synthesis filter bank design based on time domain aliasing cancellation," IEEE Trans. Acoust. Speech Sig. Proc. ASSP-34 (5), 1153-1161 (1986).
  • A. W. Johnson and A. B. Bradley, "Adaptive transform coding incorporating time domain aliasing cancellation," Speech Comm. 6, 299-308 (1987).
  • For algorithms, see e.g.:
    • Chi-Min Liu and Wen-Chieh Lee, "A unified fast algorithm for cosine modulated filterbanks in current audio standards", J. Audio Engineering 47 (12), 1061-1075 (1999).
    • V. Britanak and K. R. Rao, "A new fast algorithm for the unified forward and inverse MDCT/MDST computation," Signal Processing 82, 433-459 (2002)
    • Vladimir Nikolajevic and Gerhard Fettweis, "Computation of forward and inverse MDCT using Clenshaw's recurrence formula," IEEE Trans. Sig. Proc. 51 (5), 1439-1444 (2003)
    • Che-Hong Chen, Bin-Da Liu, and Jar-Ferr Yang, "Recursive architectures for realizing modified discrete cosine transform and its inverse," IEEE Trans. Circuits Syst. II: Analog Dig. Sig. Proc. 50 (1), 38-45 (2003)
    • ...and references thereof.


 

COMMENTARY     


Share your thoughts, questions and commentary here
Your name
Your comments
Please enter the 5-letter protection code

Want to know more?
Search encyclopedia, statistics and forums:

 


Lesson Plans | Student Area | Student FAQ | Reviews | Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.