Linear predictive coding

Linear predictive coding (LPC) is a tool used mostly in audio signal processing and speech processing for representing the spectral envelope of a digital speech signal in compressed form, using the information of a linear predictive model. It is one of the most powerful speech analysis techniques and one of the most useful methods for encoding good-quality speech at a low bit rate, and it provides highly accurate estimates of speech parameters.

Overview

Main article: source-filter model of speech production

LPC starts with the assumption that a speech signal is produced by a buzzer at the end of a tube (voiced sounds), with occasional added hissing and popping sounds (sibilants and plosive sounds). Although apparently crude, this model is actually a close approximation to the reality of speech production. The glottis (the space between the vocal cords) produces the buzz, which is characterized by its intensity (loudness) and frequency (pitch). The vocal tract (the throat and mouth) forms the tube, which is characterized by its resonances, called formants. Hisses and pops are generated by the action of the tongue, lips and throat during sibilants and plosives.


LPC analyzes the speech signal by estimating the formants, removing their effects from the speech signal, and estimating the intensity and frequency of the remaining buzz. The process of removing the formants is called inverse filtering, and the remaining signal after the subtraction of the filtered modeled signal is called the residue.
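
The analysis step can be sketched in code. The example below is a minimal illustration, not a reference implementation: it assumes the autocorrelation method with the Levinson-Durbin recursion (one common way to estimate the predictor, not named in the text above), and the function names `lpc_coefficients` and `inverse_filter` are hypothetical. The coefficient convention is A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p, so the residue is e[n] = x[n] + a[1]x[n-1] + ... + a[p]x[n-p].

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Estimate inverse-filter coefficients a[0..order] (a[0] == 1) using
    the autocorrelation method and the Levinson-Durbin recursion.
    Assumes the frame is not silent (its energy r[0] must be non-zero)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # prediction error energy of the order-0 predictor
    for i in range(1, order + 1):
        # Correlation of the current prediction error with lag i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient for stage i
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def inverse_filter(frame, a):
    """Remove the modeled resonances: e[n] = sum_k a[k] * x[n-k].
    Samples before the start of the frame are taken to be zero."""
    frame = np.asarray(frame, dtype=float)
    return np.convolve(frame, a)[:len(frame)]

# Illustrative frame: a resonant tone plus a little noise (made-up data).
rng = np.random.default_rng(1)
t = np.arange(400)
demo_frame = np.sin(0.3 * t) + 0.05 * rng.standard_normal(len(t))
demo_a, demo_err = lpc_coefficients(demo_frame, order=4)
demo_residue = inverse_filter(demo_frame, demo_a)
print("signal energy :", float(np.sum(demo_frame ** 2)))
print("residue energy:", float(np.sum(demo_residue ** 2)))
```

For a signal that the model fits well, the residue carries much less energy than the original frame, which is what makes it cheaper to store or transmit.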


The numbers which describe the intensity and frequency of the buzz, the formants, and the residue signal, can be stored or transmitted somewhere else. LPC synthesizes the speech signal by reversing the process: use the buzz parameters and the residue to create a source signal, use the formants to create a filter (which represents the tube), and run the source through the filter, resulting in speech.
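
The synthesis step can be sketched the same way. The example below is a minimal illustration under stated assumptions: the source is either a plain impulse train (a stand-in for the buzz) or a stored residue, the filter is the all-pole filter 1/A(z) with A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p, and the names `buzz` and `synthesize` as well as the example coefficients are hypothetical.

```python
import numpy as np

def buzz(num_samples, pitch_period, gain=1.0):
    """Crude voiced source: one impulse every `pitch_period` samples."""
    src = np.zeros(num_samples)
    src[::pitch_period] = gain
    return src

def synthesize(a, source):
    """Run the source through the all-pole filter 1/A(z):
    y[n] = source[n] - a[1]*y[n-1] - ... - a[p]*y[n-p]."""
    order = len(a) - 1
    out = np.zeros(len(source))
    for n in range(len(source)):
        acc = source[n]
        for k in range(1, min(order, n) + 1):
            acc -= a[k] * out[n - k]
        out[n] = acc
    return out

# Made-up coefficients describing a single resonance (a stable filter).
example_a = np.array([1.0, -1.3, 0.7])
voiced = synthesize(example_a, buzz(200, pitch_period=50))
print(voiced[:5])
```

If the source is the exact residue produced by inverse filtering with the same coefficients, the output reproduces the original frame; replacing the residue with a simple buzz trades fidelity for a much smaller description.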


Because speech signals vary with time, this process is done on short chunks of the speech signal, which are called frames; generally 30 to 50 frames per second give intelligible speech with good compression.
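
As a small illustration of framing, the sketch below cuts a signal into equal, non-overlapping frames at a chosen frame rate. The function name `split_frames`, the 8000 Hz sample rate and the choice to drop a trailing partial frame are assumptions made for the example; practical coders often use overlapping, windowed frames instead.

```python
import numpy as np

def split_frames(signal, sample_rate, frames_per_second):
    """Return a 2-D array with one non-overlapping frame per row.
    A trailing partial frame, if any, is dropped."""
    signal = np.asarray(signal, dtype=float)
    frame_len = sample_rate // frames_per_second
    num_frames = len(signal) // frame_len
    return signal[:num_frames * frame_len].reshape(num_frames, frame_len)

# One second of made-up signal at an assumed 8000 Hz sample rate.
one_second = np.zeros(8000)
frames = split_frames(one_second, sample_rate=8000, frames_per_second=50)
print(frames.shape)  # (50, 160): 50 frames of 160 samples each
```

Each row would then be analyzed separately, giving one set of formant and buzz parameters per frame.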


Early history of LPC

According to Robert M. Gray of Stanford University, the first ideas leading to LPC started in 1966, when S. Saito and F. Itakura of NTT described an approach to automatic phoneme discrimination that involved the first maximum likelihood approach to speech coding. In 1967, John Burg outlined the maximum entropy approach. In 1969, Itakura and Saito introduced partial correlation; in May, Glen Culler proposed realtime speech encoding; and B. S. Atal presented an LPC speech coder at the Annual Meeting of the Acoustical Society of America. In 1971, realtime LPC using 16-bit LPC hardware was demonstrated by Philco-Ford; four units were sold.


In 1972, Bob Kahn of ARPA, with Jim Forgie (Lincoln Laboratory, LL) and Dave Walden (BBN Technologies), started the first developments in packetized speech, which would eventually lead to Voice over IP technology. In 1973, according to Lincoln Laboratory informal history, the first realtime 2400 bit/s LPC was implemented by Ed Hofstetter. In 1974, the first realtime two-way LPC packet speech communication was accomplished over the ARPANET at 3500 bit/s between Culler-Harrison and Lincoln Laboratories. In 1976, the first LPC conference took place over the ARPANET using the Network Voice Protocol, between Culler-Harrison, ISI, SRI, and LL at 3500 bit/s. Finally, in 1978, Vishwanath et al. of BBN developed the first variable-rate LPC algorithm.


LPC coefficient representations

LPC is frequently used for transmitting spectral envelope information, and as such it has to be tolerant of transmission errors. Transmitting the filter coefficients directly (see linear prediction for the definition of the coefficients) is undesirable, since they are very sensitive to errors. In other words, a very small error can distort the whole spectrum, or worse, a small error might make the prediction filter unstable.


There are more advanced representations such as log area ratios (LAR), line spectral pairs (LSP) decomposition and reflection coefficients. Of these, LSP decomposition in particular has gained popularity, since it ensures stability of the predictor, and spectral errors are local for small coefficient deviations.
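
The sensitivity of direct coefficients, and the appeal of the alternative representations, can be illustrated with a short sketch. It relies on the standard result that the synthesis filter 1/A(z) is stable exactly when every reflection coefficient has magnitude below 1. The function names are hypothetical, the sign convention of the reflection coefficients follows the step-up recursion written in the code, and the LAR definition log((1 + k)/(1 - k)) is one of several conventions in use; LSP conversion is not shown.

```python
import numpy as np

def reflection_to_predictor(k):
    """Step-up recursion: build a[0..p] (a[0] == 1) from reflection
    coefficients k[0..p-1], with A(z) = sum_j a[j] z^-j."""
    a = np.array([1.0])
    for ki in k:
        ext = np.append(a, 0.0)
        a = ext + ki * ext[::-1]
    return a

def is_stable(a):
    """True if all poles of 1/A(z) lie strictly inside the unit circle."""
    return bool(np.all(np.abs(np.roots(a)) < 1.0))

def to_lar(k):
    """Log area ratios (assumed convention): g = log((1 + k) / (1 - k))."""
    k = np.asarray(k, dtype=float)
    return np.log((1.0 + k) / (1.0 - k))

def from_lar(g):
    """Inverse mapping: k = tanh(g / 2), which always satisfies |k| < 1."""
    return np.tanh(np.asarray(g, dtype=float) / 2.0)

# A small error in a direct coefficient can push a pole outside the circle.
print(is_stable([1.0, -1.9, 0.95]))  # True
print(is_stable([1.0, -1.9, 1.01]))  # False

# The same kind of error applied to log area ratios keeps the filter stable.
k = np.array([0.5, -0.3, 0.8])
noisy_k = from_lar(to_lar(k) + 0.5)
print(is_stable(reflection_to_predictor(noisy_k)))  # True
```

Because every finite log area ratio maps back to a reflection coefficient of magnitude below 1, quantization or channel errors in that domain change the spectrum but cannot make the filter unstable.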


Applications

LPC is generally used for speech analysis and resynthesis. It is used as a form of voice compression by phone companies, for example in the GSM standard. It is also used for secure wireless communication, where voice must be digitized, encrypted and sent over a narrow voice channel.


LPC synthesis can be used to construct vocoders in which musical instruments serve as the excitation signal for the time-varying filter estimated from a singer's speech. This is somewhat popular in electronic music. Paul Lansky made the well-known computer music piece notjustmoreidlechatter using linear predictive coding.[1] A 10th-order LPC was used in the popular 1980s Speak & Spell educational toy.


The waveform ROM in digital sample-based music synthesizers made by Yamaha Corporation is compressed using an LPC algorithm.


Fixed linear predictors of order 0 to 4 are used in the FLAC audio codec, which also supports adaptive LPC predictors of higher order.
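
As an illustration of low-order prediction in a lossless codec, the sketch below applies fixed polynomial predictors of order 0 to 4, whose coefficients are the alternating binomial coefficients. It is a simplified model of the idea only: the function name `fixed_residual` and the return layout are hypothetical, and the actual FLAC bitstream, residual coding and order selection are not shown.

```python
# Fixed predictor coefficients; entry j multiplies the sample x[n - 1 - j].
FIXED_COEFFS = {
    0: [],
    1: [1],
    2: [2, -1],
    3: [3, -3, 1],
    4: [4, -6, 4, -1],
}

def fixed_residual(samples, order):
    """Return (warmup, residual): the first `order` samples are kept
    verbatim, and every later sample is replaced by its prediction error."""
    x = [int(s) for s in samples]
    coeffs = FIXED_COEFFS[order]
    residual = []
    for n in range(order, len(x)):
        pred = sum(c * x[n - 1 - j] for j, c in enumerate(coeffs))
        residual.append(x[n] - pred)
    return x[:order], residual

# A quadratic sequence is predicted exactly by the order-3 predictor.
print(fixed_residual([0, 1, 4, 9, 16, 25], order=3))  # ([0, 1, 4], [0, 0, 0])
```

Small residual values need fewer bits than the original samples, and because the arithmetic is exact on integers, the decoder can rebuild the signal without loss.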


See also

Warped linear predictive coding
Akaike information criterion
Audio compression
FS-1015
FS-1016

The Wikipedia article included on this page is licensed under the GFDL.