
Indian Institute of Technology, INDORE

EE202: SIGNALS AND SYSTEMS

Instructor: Dr. R. B. Pachori.


Group No.: 19
Group Members (Name, Roll No.):
1. Aditi Kanjolia, 1200202
2. Keerthana Sravanthi, 1200313
Email ID: ee1200202@iiti.ac.in

Contents

Problem Statement and Objective
Introduction
MATLAB Code
Implementation
Bibliography

PROBLEM STATEMENT AND OBJECTIVE

CLASSIFICATION OF VOICED AND UNVOICED SPEECH SIGNALS USING THE FOURIER TRANSFORM

Introduction

Speech is an acoustic signal produced by the speech production system. From our
understanding of signals and systems, the characteristics of a system depend on its
design. A linear time-invariant system is completely characterized by its impulse
response. However, the nature of the response depends on the type of input excitation
applied to the system. The same holds for the production of speech.
Based on the input excitation, speech production can be broadly categorized into three
cases: the input excitation is nearly periodic, the input excitation is random and
noise-like, or there is no excitation to the system. Accordingly, the speech signal can
be broadly divided into three regions: voiced speech, unvoiced speech and silence.
Our aim is to classify speech as voiced or unvoiced.
Voiced sounds consist of a fundamental frequency and its harmonic components, produced by
the vocal cords (vocal folds). The vocal tract modifies this excitation signal, causing formant
(pole) and sometimes anti-formant (zero) frequencies. In purely unvoiced sounds, there
is no fundamental frequency in the excitation signal and therefore no harmonic structure.
Instead, the airflow is forced through a vocal tract constriction, which can occur in several
places between the glottis and the mouth. Some sounds are produced with a complete stoppage
of airflow followed by a sudden release, producing an impulsive turbulent excitation often
followed by a more protracted turbulent excitation. Unvoiced sounds are also usually quieter
and less steady than voiced ones.
Voiced sounds, e.g., a, b, are essentially due to vibrations of the vocal cords and are
oscillatory. Therefore, over short periods of time, they are well modelled by sums of
sinusoids. This makes the short-time Fourier transform a useful tool for speech processing.
Unvoiced sounds, such as s and sh, are more noise-like, as shown in the figure below. They
have a wide-band spectrum.

Figure- Distinction between voiced and unvoiced speech.
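
The contrast described above can be reproduced with synthetic stand-ins for the two kinds of sound. The following is a minimal sketch, not recorded speech: the fundamental frequency `f0`, the number of harmonics and the "share of energy in the 10 strongest bins" measure are all assumptions chosen for illustration. A sum of harmonics concentrates its energy in a few spectral lines, whereas white noise spreads it over the whole band.

```matlab
% Illustrative only: synthetic stand-ins, not recorded speech.
Fs = 11025;                 % sampling frequency in Hz (same value the report uses)
t  = (0:Fs-1)' / Fs;        % one second of sample times
f0 = 120;                   % assumed fundamental frequency in Hz (hypothetical)

% "Voiced-like" signal: fundamental plus harmonics with decaying amplitude.
voiced = zeros(size(t));
for k = 1:10
    voiced = voiced + (1/k) * sin(2*pi*k*f0*t);
end

% "Unvoiced-like" signal: white Gaussian noise, which has a flat spectrum.
rng(0);                     % fixed seed so the run is repeatable
unvoiced = randn(size(t));

% Magnitude spectra, keeping only the bins from 0 Hz up to Fs/2.
nHalf = floor(Fs/2) + 1;
V = abs(fft(voiced));    V = V(1:nHalf);
U = abs(fft(unvoiced));  U = U(1:nHalf);

% Share of the spectral energy carried by the 10 strongest bins.
Vs = sort(V.^2, 'descend');
Us = sort(U.^2, 'descend');
voicedTop10   = sum(Vs(1:10)) / sum(Vs);
unvoicedTop10 = sum(Us(1:10)) / sum(Us);
fprintf('Top-10-bin energy share: voiced %.3f, unvoiced %.3f\n', ...
        voicedTop10, unvoicedTop10);
```

Real voiced speech is only nearly periodic, so its spectral lines are broader than in this idealised case.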


For many speech applications, it is important to distinguish between voiced and unvoiced
speech. There are many ways of doing this. We use a basic method based on the concept of
formants and on the Fourier transform.

Formants

Wikipedia defines formants as "the spectral peaks of the sound spectrum of the voice". In
speech science and phonetics, formant is also used to mean an acoustic resonance of the
human vocal tract. It is often measured as an amplitude peak in the frequency spectrum of
the sound, though in vowels spoken with a high fundamental frequency, as in a female or
child voice, the frequency of the resonance may lie between the widely spaced harmonics
and hence no peak is visible.

Fourier Transform

The Fourier transform, named after Joseph Fourier, is a mathematical transformation
used to convert signals between the time domain and the frequency domain. It has
many applications in physics and engineering.
The Fourier transform decomposes a function into sinusoidal basis functions, each of
which is a complex exponential of a different frequency. It therefore gives us a way of
viewing a function as a superposition of simple sinusoids.
The Fourier series shows how to rewrite a periodic function as a sum of sinusoids.
The Fourier transform extends this idea to non-periodic functions.
The Fourier transform of a function g(t) is defined by:

$$G(f) = \int_{-\infty}^{\infty} g(t)\, e^{-i 2\pi f t}\, dt$$    [Equation 1]

The result is a function of the frequency f. G(f) describes how strongly the frequency f
is present in g(t), in both amplitude and phase, and |G(f)|^2 gives the energy density at
that frequency. G(f) is often called the spectrum of g. In addition, g can be
obtained from G via the inverse Fourier transform:
$$g(t) = \int_{-\infty}^{\infty} G(f)\, e^{i 2\pi f t}\, df$$    [Equation 2]
Equation [2] states that we can obtain the original function g(t) from the function G(f) via
the inverse Fourier transform. As a result, g(t) and G(f) form a Fourier pair: they are distinct
representations of the same underlying entity. We can write this equivalence using the
following notation:
$$g(t) \overset{\mathcal{F}}{\longleftrightarrow} G(f)$$    [Equation 3]
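
The Fourier pair relationship has a discrete counterpart that can be checked numerically with MATLAB's `fft` and `ifft`. The sketch below is only an illustration under stated assumptions: the test signal `g` (a decaying 440 Hz cosine) and its length are arbitrary choices, and the discrete Fourier transform stands in for the continuous integrals. It checks that the inverse transform recovers the signal and that the energy is the same in both domains (Parseval's relation for the DFT).

```matlab
% Discrete illustration of a Fourier pair: g <-> G.
Fs = 11025;                              % sampling frequency in Hz
t  = (0:1023)' / Fs;                     % 1024 sample times (arbitrary length)
g  = exp(-200*t) .* cos(2*pi*440*t);     % arbitrary test signal (hypothetical)

G    = fft(g);                           % forward transform: time -> frequency
gRec = real(ifft(G));                    % inverse transform; real() discards
                                         % any rounding-level imaginary part

% Largest sample-wise difference between the original and recovered signal.
roundTripError = max(abs(g - gRec));

% Parseval's relation for the DFT: sum |g|^2 = (1/N) * sum |G|^2.
energyTime = sum(abs(g).^2);
energyFreq = sum(abs(G).^2) / numel(g);

fprintf('Round-trip error: %.2e\n', roundTripError);
fprintf('Energy (time) %.6f, energy (frequency) %.6f\n', energyTime, energyFreq);
```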

The table below gives a few examples of letters with their classification. Their phonetic
transcriptions are given in parentheses.

Voiced                      Unvoiced
b   book (bʊk)              p   please (pliz)
v   vanilla (vənIlə)        f   five (faIv)
th  they (ðeI)              th  thirty (θɝti)
d   dish (dIʃ)              t   ten (tɛn)
z   zero (z…)               s   sir (s…)
g   genre (ʒɑnrə)           sh  she (ʃi)

MATLAB CODE

We use MATLAB for the experiment. We record some sounds using the wavrecord command,
compute the fast Fourier transform of each recording using the fft command, and then
classify each one as a voiced or unvoiced speech signal.

The MATLAB code is as follows:

>> Fs = 11025; % Setting the sampling frequency (Hz)
>> y = wavrecord(Fs, Fs, 'int16'); % Recording Fs samples (one second) of sound
>> figure, plot(y) % Plotting the signal in the time domain
>> figure, plot(abs(fft(double(y)))) % Plotting the magnitude spectrum in the frequency domain
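
The spectrum plot above is drawn against the FFT bin index and shows both halves of the spectrum. The sketch below shows one way to label the axis in hertz and keep only the bins from 0 Hz to Fs/2. It rests on assumptions: a synthetic 500 Hz tone stands in for the recording so that it runs without a microphone, and the commented `audiorecorder` lines are a possible substitute on MATLAB installations where `wavrecord` is not available.

```matlab
Fs = 11025;                         % sampling frequency in Hz, as in the report

% In the report, y comes from wavrecord(Fs, Fs, 'int16'). Where wavrecord is
% not available, audiorecorder is one possible substitute (assumption):
%   rec = audiorecorder(Fs, 16, 1);      % 16-bit, one channel
%   recordblocking(rec, 1);              % record for one second
%   y = getaudiodata(rec, 'int16');
% A synthetic 500 Hz tone stands in for the recording here.
y = int16(10000 * sin(2*pi*500*(0:Fs-1)'/Fs));

x = double(y);                      % fft needs floating-point input
N = numel(x);
X = abs(fft(x)) / N;                % magnitude spectrum, scaled by N
half = 1:floor(N/2)+1;              % bins from 0 Hz up to Fs/2
f = (half - 1)' * Fs / N;           % frequency of each kept bin in Hz

% Frequency of the strongest spectral component.
[~, iPeak] = max(X(half));
peakHz = f(iPeak);
fprintf('Strongest component at %.1f Hz\n', peakHz);

figure, plot(f, X(half))
xlabel('Frequency (Hz)'), ylabel('|Y(f)| / N')
```

With a one-second recording at Fs samples per second, N equals Fs, so adjacent bins are 1 Hz apart.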

IMPLEMENTATION

The above code was run on recordings of some vowels and consonants (A, P, B, S, Z, T and D).
The results are shown below:
A

Figure P speech signal in time domain.


Figure P speech signal in frequency domain.


B

Figure B speech signal in time domain.


Figure B speech signal in frequency domain.

Figure S speech signal in time domain.


Figure S speech signal in frequency domain.


Z

Figure Z speech signal in time domain.


Figure Z speech signal in frequency domain.

Figure T speech signal in time domain.


Figure T speech signal in frequency domain.

Figure D speech signal in time domain.


Figure D speech signal in frequency domain.
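
The classification here is made by inspecting the plots. As a rough sketch of how a similar decision might be automated, the code below labels a signal as voiced when most of its spectral energy lies at low frequencies. This is an assumption for illustration and not the method used in this report: the cutoff `cutoffHz`, the decision `threshold` and the synthetic test signals are all hypothetical, and real recordings would need these values tuned.

```matlab
Fs        = 11025;   % sampling frequency in Hz, as in the report
cutoffHz  = 1000;    % hypothetical boundary of the "low" band
threshold = 0.5;     % hypothetical decision threshold on the energy fraction

% Synthetic stand-ins for recordings (not real speech).
t = (0:Fs-1)' / Fs;
voicedLike = zeros(size(t));
for k = 1:10
    voicedLike = voicedLike + (1/k) * sin(2*pi*k*120*t);   % harmonics of 120 Hz
end
rng(0);                                  % fixed seed so the run is repeatable
unvoicedLike = randn(size(t));           % white noise: wide-band spectrum

signals   = {voicedLike, unvoicedLike};
labels    = cell(size(signals));
fractions = zeros(size(signals));

for n = 1:numel(signals)
    x = double(signals{n}(:));           % column vector of doubles
    N = numel(x);
    P = abs(fft(x)).^2;                  % energy per FFT bin
    half = 1:floor(N/2)+1;               % bins from 0 Hz up to Fs/2
    f = (half - 1)' * Fs / N;            % bin frequencies in Hz
    P = P(half);

    % Fraction of the spectral energy at or below the cutoff frequency.
    fractions(n) = sum(P(f <= cutoffHz)) / sum(P);

    if fractions(n) > threshold
        labels{n} = 'voiced';
    else
        labels{n} = 'unvoiced';
    end
    fprintf('Signal %d: low-band fraction %.3f -> %s\n', n, fractions(n), labels{n});
end
```

Other features, such as the zero-crossing rate and signal energy used in the Bachu et al. reference, could be substituted for the low-band energy fraction.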

BIBLIOGRAPHY

Signals and Systems, Oppenheim and Willsky.
Signals and Systems Using MATLAB, Luis F. Chaparro.
Separation of Voiced and Unvoiced using Zero crossing rate and Energy of the Speech Signal, Bachu R.G., Kopparthi S., Adapa B., Barkana B.D.
Web sources: Wikipedia, Saakshat Lab, IITG.

