Showing posts with label Signal Processing. Show all posts

Thursday, September 10, 2015

LPC (Linear Predictive Coding)


Linear predictive coding (LPC) is one of the most important methods for speech analysis because it provides an estimate of the poles of the vocal tract transfer function, and hence of the formant frequencies produced by the vocal tract. LPC analyzes the speech signal by estimating the formants, removing their effects from the speech signal, and estimating the intensity and frequency of the remaining buzz. The process of removing the formants is called inverse filtering, and the remaining signal is called the residue. The basic idea behind LPC is that each sample can be approximated as a linear combination of a few past samples. Linear prediction provides a robust, reliable, and accurate way of estimating these parameters, and the computation involved is considerably less than that of cepstrum analysis.
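
The idea of predicting each sample from a few past samples, and then inverse filtering to obtain the residue, can be sketched in a few lines of MATLAB. This is only an illustration under stated assumptions: the "speech" is a synthetic second-order all-pole signal driven by a single impulse, the prediction order p = 2 is chosen to match it, and the coefficients are found with the autocorrelation method (one common choice, not necessarily the only one).

```matlab
% Synthetic stand-in for a speech frame (assumption): impulse response of
% the all-pole system x(k) = 1.3*x(k-1) - 0.8*x(k-2) + e(k)
n = 400;
e = [1; zeros(n-1, 1)];              % excitation: a single impulse
x = filter(1, [1, -1.3, 0.8], e);    % all-pole "vocal tract" output

p = 2;                               % prediction order (assumed)

% Autocorrelation of the frame for lags 0..p
r = zeros(p+1, 1);
for lag = 0:p
    r(lag+1) = sum(x(1:n-lag) .* x(1+lag:n));
end

% Solve the normal equations R*a = r for the predictor coefficients,
% so that x(k) is approximated by a(1)*x(k-1) + ... + a(p)*x(k-p)
a = toeplitz(r(1:p)) \ r(2:p+1);

% Inverse filtering: subtract the prediction to obtain the residue
residue = filter([1; -a]', 1, x);

disp(a');                            % estimated predictor coefficients
```

For this synthetic signal the estimated coefficients come out very close to the 1.3 and -0.8 used to generate it, and the residue is essentially the original impulse. Real speech would need a windowed frame and a higher order.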


Liftering


Liftering is the quefrency-domain counterpart of filtering in the frequency domain: a desired quefrency region is selected for analysis by multiplying the whole cepstrum by a rectangular window at the desired position. Two types of liftering are used, low-time liftering and high-time liftering. Low-time liftering extracts the vocal tract characteristics, which occupy the low-quefrency part of the cepstrum, and high-time liftering extracts the excitation characteristics of the speech frame being analyzed.
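
A minimal sketch of low-time and high-time liftering follows. The frame is a synthetic two-tone signal standing in for speech, and the cutoff Lc = 20 samples, the frame length, and the sampling rate are all assumed values chosen only for illustration.

```matlab
fs = 8000;                           % sampling frequency (assumed)
N  = 256;                            % frame length (assumed)
n  = (0:N-1)';
frame = sin(2*pi*200*n/fs) + 0.5*sin(2*pi*1200*n/fs);  % stand-in for speech

w = 0.54 - 0.46*cos(2*pi*n/(N-1));   % Hamming window, written out by hand
spectrum = abs(fft(frame .* w));
c = real(ifft(log(spectrum + eps))); % real cepstrum of the frame

Lc = 20;                             % liftering cutoff in samples (assumed)

% Rectangular low-time lifter; the mirrored part keeps the lifter
% symmetric, matching the symmetry of the real cepstrum
low_lifter = zeros(N, 1);
low_lifter(1:Lc) = 1;
low_lifter(N-Lc+2:N) = 1;
high_lifter = 1 - low_lifter;        % complementary high-time lifter

c_low  = c .* low_lifter;            % vocal tract part
c_high = c .* high_lifter;           % excitation part

% Smoothed spectral envelope recovered from the low-time part
envelope = exp(real(fft(c_low)));
```

Because the two lifters are complementary, c_low + c_high gives back the original cepstrum.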

Thursday, September 3, 2015

Frame Blocking of speech signal


In this step, the continuous speech signal is blocked into frames of N samples, with adjacent frames separated by M samples (M < N). The first frame consists of the first N samples. The second frame begins M samples after the first frame and overlaps it by N - M samples. Similarly, the third frame begins 2M samples after the first frame (or M samples after the second frame) and overlaps the first frame by N - 2M samples. This process continues until all the speech is accounted for within one or more frames. Frame blocking is done because, when examined over a sufficiently short period of time, the characteristics of the speech signal are fairly stationary. Over longer periods, however, the signal characteristics change to reflect the different speech sounds being spoken. Overlapping frames are used to limit information loss at the frame boundaries and to maintain correlation between adjacent frames.
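
The blocking scheme above can be sketched as follows. The signal is just the numbers 1 to 20 so the overlap is easy to see, and N = 8 and M = 3 are arbitrary illustrative values. Samples left over after the last full frame are dropped here, which is one possible convention (zero-padding the last frame is another).

```matlab
x = (1:20)';                         % stand-in signal (assumption)
N = 8;                               % frame length in samples
M = 3;                               % shift between adjacent frames, M < N

num_frames = floor((length(x) - N) / M) + 1;
frames = zeros(N, num_frames);       % one frame per column

for k = 1:num_frames
    start = (k-1)*M + 1;             % frame k starts (k-1)*M samples in
    frames(:, k) = x(start : start + N - 1);
end

disp(frames);
```

Adjacent columns share N - M = 5 samples: the last five entries of one frame are the first five of the next.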

Tuesday, September 1, 2015

What is white noise?

White noise is a type of noise that is produced by combining sounds of all different frequencies together at equal intensity. If you took all of the imaginable tones that a human can hear and combined them at the same level, you would have white noise.
The adjective "white" is used to describe this type of noise because of the way white light works. White light is light that is made up of all of the different colors (frequencies) of light combined together. In the same way, white noise is a combination of all of the different frequencies of sound. You can think of white noise as about 20,000 tones, spanning the audible range, all playing at the same time. Because white noise contains all frequencies, it is frequently used to mask other sounds.
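
As a rough check of the "all frequencies at once" picture, the sketch below generates Gaussian white noise with randn and compares the average power in the lower and upper halves of its spectrum. The number of samples is an arbitrary choice, and Gaussian noise is only one kind of white noise.

```matlab
N = 2^14;                            % number of samples (assumed)
x = randn(N, 1);                     % Gaussian white noise

P = abs(fft(x)).^2 / N;              % periodogram (power per frequency bin)
half = P(2:N/2);                     % positive frequencies, without DC and Nyquist

mid = floor(numel(half) / 2);
low_power  = mean(half(1:mid));      % average power in the lower half of the band
high_power = mean(half(mid+1:end));  % average power in the upper half of the band

ratio = low_power / high_power;
fprintf('Low-band / high-band power ratio: %.3f\n', ratio);
```

The ratio changes from run to run but stays close to 1, which is what a flat ("white") spectrum predicts.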


Spectral and Temporal Features in Speech signal

There are two types of features of a speech signal:
  • The temporal features (time domain features), which are simple to extract and have an easy physical interpretation, such as the energy of the signal, zero crossing rate, maximum amplitude, minimum energy, etc.
  • The spectral features (frequency based features), which are obtained by converting the time based signal into the frequency domain using the Fourier Transform, such as fundamental frequency, frequency components, spectral centroid, spectral flux, spectral density, spectral roll-off, etc. These features can be used to identify the notes, pitch, rhythm, and melody.
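
To make the two feature types concrete, here is a small sketch that computes two temporal features (energy and zero crossing rate) and one spectral feature (spectral centroid) for a single frame. The frame is a synthetic 500 Hz tone, and the sampling rate and frame length are assumed values.

```matlab
fs = 8000;                           % sampling frequency (assumed)
N  = 320;                            % 40 ms frame at 8 kHz
t  = (0:N-1)' / fs;
x  = sin(2*pi*500*t + 0.3);          % stand-in frame: 500 Hz tone

% Temporal features
energy = sum(x.^2);                  % frame energy
s = sign(x);
crossings = sum(s(1:end-1) .* s(2:end) < 0);   % number of sign changes
zcr = crossings / ((N-1) / fs);      % zero crossings per second
f_from_zcr = zcr / 2;                % a pure tone crosses zero twice per cycle

% Spectral feature: magnitude-weighted mean frequency
X = abs(fft(x));
X = X(1:N/2+1);                      % keep 0 .. fs/2
f = (0:N/2)' * fs / N;               % frequency of each bin in Hz
centroid = sum(f .* X) / sum(X);

fprintf('energy = %.2f, zcr = %.0f per second, centroid = %.1f Hz\n', ...
        energy, zcr, centroid);
```

For a pure tone both the zero crossing rate and the spectral centroid point back to the tone frequency. For real speech they diverge and carry different information.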

The most successful spectral features used in speech are (i) Mel frequency cepstral coefficients (MFCC) and (ii) Perceptual Linear Prediction (PLP) features. The basilar membrane in the inner ear analyzes the frequency content of the speech we hear, and this analysis can be modeled by a bank of constant-Q band-pass filters. The auditory system also has critical bands, which give rise to the phenomenon of masking, where one strong tone or burst can mask a weaker tone within the same critical band. Both MFCC and PLP capture these characteristics of the auditory system in some way, so, surprising as it may seem, the same features give reasonably good performance for speech recognition, speaker recognition, language identification, and even accent identification. However, these spectral features are not very robust to noise.

On the other hand, some of the time domain (temporal) features such as plosion index and maximum correlation coefficient are relatively more robust to noise.

Wednesday, August 12, 2015

Remove silent region in audio signal

[ip, fs] = audioread('so1.wav');   % fs is the sampling frequency stored in the file
ip = ip(:, 1);                     % use the first channel if the file is stereo
% plot(ip);

% step 1 - break the signal into frames of 0.04 seconds
frame_duration = 0.04;
frame_len = round(frame_duration * fs);   % samples per frame, must be an integer
N = length(ip);
num_frames = floor(N / frame_len);

new_sig = zeros(N, 1);
count = 0;
for k = 1 : num_frames
    % extract one frame of speech
    frame = ip((k-1)*frame_len + 1 : frame_len*k);

    % step 2 - identify non-silent frames: peak absolute amplitude above 0.03
    max_val = max(abs(frame));
    if max_val > 0.03
        count = count + 1;
        new_sig((count-1)*frame_len + 1 : frame_len*count) = frame;
    end
end

% drop the unused tail, keeping every sample of the last retained frame
new_sig(frame_len*count + 1 : end) = [];
plot(new_sig);

Filters

  • Low Pass filter
  • High Pass filter
  • Narrow Band Pass filter
  • Wide Band Pass filter
  • Notch filter
  • Band Reject filter
 

Friday, August 7, 2015

dir() function

dir(name) 

attribute_name = dir(name)

  • dir(name) lists the files and folders that match the string name. When name is a folder, dir lists the contents of that folder.
  • attribute_name = dir(name) returns a structure array describing the matches, with fields such as name, date, bytes, and isdir.
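
A short usage sketch of the second form: the pattern '*.wav' is only an example, and the loop simply prints nothing if the current folder has no matching files.

```matlab
listing = dir('*.wav');              % example pattern (assumption)

for k = 1:numel(listing)
    % each element describes one matching file
    fprintf('%s  (%d bytes)\n', listing(k).name, listing(k).bytes);
end

fprintf('%d file(s) found.\n', numel(listing));
```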

 

disp() function

disp(x)

disp(x) displays the contents of x without printing the variable name.

input() function

result = input(prompt)

str = input(prompt, 's')


  • result = input(prompt) displays the prompt string on the screen, waits for input from the keyboard, evaluates any expression in the input, and returns the result.
  • str = input(prompt, 's') returns the entered text as a MATLAB string, without evaluating it as an expression.

length() function

num = length(array)

length(array) returns the length of the largest dimension of the array or matrix. For a non-empty array this is the same as max(size(array)).


% Example
A = [101, 20;
    10, 24;
    11, 7];

%Function
ANSWER = length(A);

%Display result 
disp('Result: ');
disp(ANSWER);

Result :

3

zeros() function

X = zeros(n)

X = zeros(m, n)

  • zeros(n) returns an n-by-n matrix of zeros
  • zeros(m, n) returns an m-by-n matrix of zeros

ones() function



X = ones(n) 

X = ones(m, n)

  • ones(n) returns an n-by-n matrix of ones
  • ones(m, n) returns an m-by-n matrix of ones
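
A quick sketch of both functions in use. The sizes are arbitrary, and the buffer at the end shows the typical use of zeros for preallocating an array before a loop.

```matlab
Z = zeros(2, 3);                     % 2-by-3 matrix of zeros
O = ones(3);                         % 3-by-3 matrix of ones

% Typical use: preallocate a column vector, then fill it in a loop
buffer = zeros(5, 1);
for k = 1:5
    buffer(k) = k^2;
end

disp(size(Z));
disp(size(O));
disp(buffer');
```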


Recording sound from Microphone



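The MATLAB command "wavrecord" reads audio signals directly from the microphone. Note that wavrecord and wavplay work only on Windows in older MATLAB releases; newer releases have removed them in favor of audiorecorder and sound. The command format is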

y = wavrecord(n, fs);



fs = 16000;      % Sampling rate
duration = 2;    % Recording duration in seconds
fprintf('Press any key to start %g seconds of recording...', duration); pause;
fprintf('Recording...');
y = wavrecord(duration*fs, fs);   % duration*fs is the total number of sample points
fprintf('Finished recording.\n');
fprintf('Press any key to play the recording...'); pause;
fprintf('\n');
wavplay(y, fs);

Thursday, August 6, 2015

Band Pass Filter


There are applications where a particular band, or spread, of frequencies needs to be filtered from a wider range of mixed signals. Filter circuits can be designed to accomplish this task by combining the properties of low-pass and high-pass filters into a single filter. The result is called a band-pass filter. Creating a band-pass filter from a low-pass and a high-pass filter can be illustrated using a block diagram:
[Block diagram: signal input → low-pass filter → high-pass filter → signal output]
The main function of such a filter in a transmitter is to limit the bandwidth of the output signal to the minimum necessary to convey data at the desired speed and in the desired form. In a receiver, a band-pass filter allows signals within a selected range of frequencies to be heard or decoded, while preventing signals at unwanted frequencies from getting through.
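
The low-pass plus high-pass construction can be tried digitally with two simple one-pole filters in cascade. Everything specific here is an assumption made for illustration: the 8 kHz sampling rate, the 1500 Hz and 300 Hz corner frequencies, the one-pole designs, and the three test tones. A practical band-pass filter would normally use a sharper design.

```matlab
fs = 8000;                           % sampling frequency (assumed)

% One-pole low-pass, corner near 1500 Hz (assumed)
p_lp = exp(-2*pi*1500/fs);
b_lp = 1 - p_lp;
a_lp = [1, -p_lp];

% One-pole high-pass, corner near 300 Hz (assumed)
p_hp = exp(-2*pi*300/fs);
b_hp = (1 + p_hp)/2 * [1, -1];
a_hp = [1, -p_hp];

t = (0:fs-1)' / fs;                  % one second of samples
tones = [50, 700, 3900];             % below, inside, and above the pass band
gain = zeros(size(tones));

for i = 1:numel(tones)
    x = sin(2*pi*tones(i)*t);
    y = filter(b_hp, a_hp, filter(b_lp, a_lp, x));  % low-pass, then high-pass
    steady = y(fs/2:end);            % skip the start-up transient
    gain(i) = sqrt(2 * mean(steady.^2));            % output amplitude per unit input
end

disp(gain);
```

The middle tone comes through with the largest gain, while the tones below and above the band are attenuated by the high-pass and low-pass stages respectively.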


There are basically two types of band-pass filters: wide band-pass and narrow band-pass filters.