Spectral analysis shows that different timbres in speech signals correspond to
different energy distributions over frequencies. Therefore, we usually perform
FFT to obtain the magnitude frequency response of each frame. When we perform
FFT on a frame, we assume that the signal within a frame is periodic, and
continuous when wrapping around. If this is not the case, we can still perform
FFT, but the discontinuity at the frame's first and last points is likely to
introduce undesirable effects in the frequency response. To deal with this
problem, we have two strategies:
1. Multiply each frame by a Hamming window to increase its continuity at the first and last points.
2. Take a frame of variable size such that it always contains an integer number of fundamental periods of the speech signal.
The second strategy encounters
difficulty in practice since the identification of the fundamental period is
not a trivial problem. Moreover, unvoiced sounds do not have a fundamental
period at all. Consequently, we usually adopt the first strategy and multiply
the frame by a Hamming window before performing FFT.
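
As a rough illustration of both strategies, here is a minimal Python sketch using NumPy. The sampling rate, frame size, and test tones are illustrative assumptions and do not come from the text above; `magnitude_spectrum` and `far_leakage` are hypothetical helper names. It compares a tone that does not wrap around continuously (10.5 cycles per frame), with and without a Hamming window, against a tone that fits exactly 10 cycles in the frame.

```python
import numpy as np

def magnitude_spectrum(frame, use_hamming=True):
    """Return the one-sided magnitude spectrum of a single frame."""
    frame = np.asarray(frame, dtype=float)
    if use_hamming:
        # np.hamming(N)[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1)), n = 0..N-1
        frame = frame * np.hamming(len(frame))
    return np.abs(np.fft.rfft(frame))

def far_leakage(spectrum, start_bin=100):
    """Largest magnitude at or above start_bin, relative to the spectrum's peak."""
    return spectrum[start_bin:].max() / spectrum.max()

# Illustrative values only (assumptions, not taken from the text)
fs = 16000          # sampling rate in Hz
N = 512             # frame size in samples
n = np.arange(N)

# 10.5 cycles per frame: the first and last points do not join smoothly
broken = np.sin(2 * np.pi * (10.5 * fs / N) * n / fs)
# Exactly 10 cycles per frame: the frame wraps around continuously
whole = np.sin(2 * np.pi * (10.0 * fs / N) * n / fs)

# Strategy 1: Hamming window versus no window on the discontinuous frame
leak_rect = far_leakage(magnitude_spectrum(broken, use_hamming=False))
leak_hamm = far_leakage(magnitude_spectrum(broken, use_hamming=True))

# Strategy 2: an integer number of periods needs no window at all
leak_whole = far_leakage(magnitude_spectrum(whole, use_hamming=False))

print("no window, 10.5 cycles:", leak_rect)
print("Hamming,   10.5 cycles:", leak_hamm)
print("no window, 10 cycles:  ", leak_whole)
```

In this sketch the energy far from the tone's peak should be clearly lower with the Hamming window than without it, and close to zero when the frame holds a whole number of periods. The trade-off is that the window widens the main peak somewhat, which is usually acceptable for speech features.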