Signal processing: Cepstrum Domain

Speech can be modeled as an excitation sequence convolved with the impulse response of the vocal system model. It is often desirable to eliminate one of the components so that the other may be used in a recognition algorithm. The cepstrum is a common transform that can be used to separate the excitation signal (which contains the phones and the pitch) from the transfer function (which contains the voice quality). These two components are convolved in the time domain, but convolution in the time domain becomes multiplication in the frequency domain. With X(w), G(w), and H(w) denoting the spectra of the speech signal, the excitation, and the vocal system respectively, this can be written as:

X(w) = G(w) H(w)

When the logarithm of the magnitude of both sides is taken, the product becomes a sum:

log |X(w)| = log |G(w)| + log |H(w)|

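Taking the IDFT of both sides of the above equation yields the cepstrum. Because the IDFT is linear, the excitation and transfer-function terms remain additive. The independent variable of the result is called “quefrency”, which is the x-axis of the cepstrum domain.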
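The two identities used so far (a product in the frequency domain, a sum after the log) can be checked numerically. The snippet below is only a sketch with made-up data: `g` and `h` are random stand-ins for an excitation and an impulse response, not real speech, and NumPy is an assumed choice of library.

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.standard_normal(64)   # hypothetical excitation sequence
h = rng.standard_normal(16)   # hypothetical vocal-system impulse response

x = np.convolve(g, h)         # linear convolution, length 64 + 16 - 1
n_fft = len(x)                # long enough that zero-padding avoids circular wrap-around

G = np.fft.fft(g, n_fft)
H = np.fft.fft(h, n_fft)
X = np.fft.fft(x, n_fft)

# Convolution in time becomes multiplication in frequency.
product_matches = np.allclose(X, G * H)

# The log magnitude turns that product into a sum.
log_sum_matches = np.allclose(
    np.log(np.abs(X)), np.log(np.abs(G)) + np.log(np.abs(H))
)

print(product_matches, log_sum_matches)
```

The DFT length is set to the length of the convolved signal so that the DFT product corresponds to linear rather than circular convolution.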

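This process is better understood as a chain of blocks: speech signal, DFT, log magnitude, IDFT, and finally a lifter. A lifter (a filter applied in the quefrency domain) is used to separate the high-quefrency part (excitation) from the low-quefrency part (transfer function).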
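The whole chain can be sketched in a few lines of NumPy. This is an illustrative sketch, not a reference implementation: the function names `real_cepstrum` and `lifter`, the cutoff of 30 samples, and the synthetic signal (a decaying pulse train with a period of 80 samples passed through a damped resonance) are all assumptions made for the example.

```python
import numpy as np

def real_cepstrum(x, n_fft=None):
    """Real cepstrum: IDFT of the log magnitude of the DFT."""
    spectrum = np.fft.fft(x, n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small offset guards against log(0)
    return np.fft.ifft(log_mag).real

def lifter(cepstrum, cutoff):
    """Split a cepstrum into low- and high-quefrency parts at `cutoff` samples."""
    n = len(cepstrum)
    low_mask = np.zeros(n)
    low_mask[:cutoff] = 1.0
    low_mask[n - cutoff + 1:] = 1.0  # mirrored half: the real cepstrum is symmetric
    return cepstrum * low_mask, cepstrum * (1.0 - low_mask)

# Hypothetical excitation: decaying pulses every 80 samples.
period = 80
g = np.zeros(800)
g[::period] = 0.9 ** np.arange(10)

# Hypothetical vocal-system impulse response: a damped resonance.
n = np.arange(64)
h = 0.8 ** n * np.cos(0.25 * np.pi * n)

x = np.convolve(g, h)
n_fft = 2048
cep = real_cepstrum(x, n_fft)

cutoff = 30  # assumed boundary between "low" and "high" quefrency
low, high = lifter(cep, cutoff)

# The strongest high-quefrency peak marks the excitation period.
pitch_period = cutoff + int(np.argmax(high[cutoff:n_fft // 2]))
print(pitch_period)
```

In this sketch the low-quefrency part holds the slowly varying spectral envelope (transfer function), while the high-quefrency part holds the periodic excitation. The cutoff has to sit below the shortest pitch period expected in the signal.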


Thursday, September 10, 2015
