IPU M.Tech Dissertation

(a) Title
(b) Introduction of the proposed research work
(c) Objectives of the proposed research work
(d) Software & Hardware Tools to be used
(e) References
(f) GANTT Chart for the proposed dissertation work

PPT: Introduction, Review, Objectives, References
Introduction of the proposed research work

Speech is the most basic means of human communication. Research in speech has made significant progress in speech synthesizers, speech transmission systems and automatic speech recognition (ASR), which is based on the voice. Speech recognition is the translation of spoken words into text. It is also known as "automatic speech recognition", "ASR", "computer speech recognition" or "speech to text". There are two types of speech recognition: speaker dependent and speaker independent. Speaker-independent software is designed to recognize anyone's voice. The speech recognition system basically consists of four modules: speech signal acquisition (input), feature extraction and modeling (speech processing), pattern matching, and output. Speech processing techniques include filter bank analysis, Linear Predictive Coding (LPC) analysis and Mel cepstral analysis.
Speech recognition systems can be separated into classes according to the type of utterances they can recognize.

1.1.1 Isolated Word
Isolated word recognizers usually require each utterance to have quiet on both sides of the sample window. They accept a single word or a single utterance at a time. Such systems have "Listen" and "Non-Listen" states.

1.1.2 Connected Word
Connected word systems are similar to isolated word systems but allow separate utterances to be run together with a minimum pause between them.

1.1.3 Continuous Speech
Continuous speech recognizers allow the user to speak almost naturally while the computer determines the content. Recognizers with continuous speech capabilities are some of the most difficult to create because they must use special methods to determine utterance boundaries.

1.1.4 Spontaneous Speech
At a basic level, spontaneous speech can be thought of as speech that is natural sounding and not rehearsed. An ASR system with spontaneous speech ability should be able to handle a variety of natural speech features.

Automatic Speech Recognition System Classification
Automatic speech recognition systems can be classified as shown in the figure "Speech Processing Classification".

A speaker recognition system may be viewed as working in four stages.
ABSTRACT
In the speech recognition system, the input signal is obtained from the user. This input signal, which is analog in nature, is captured through the microphone. It is then passed through an analog-to-digital converter to obtain the digital form of the signal. This process is generally called sampling. The digitized voice signals are then stored in WAV format by the Windows 7 operating system.
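The conversion described above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library, not the dissertation's MATLAB code. The 8000 Hz sample rate, the 16-bit sample width, the 440 Hz test tone and the function names are all assumptions made for the example.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # samples per second (assumed, not stated in the text)


def sample_and_quantize(signal, duration_s, rate=SAMPLE_RATE):
    """Sample a continuous-time function and quantize it to 16-bit integers."""
    count = int(duration_s * rate)
    samples = []
    for i in range(count):
        t = i / rate                              # sampling instant in seconds
        value = max(-1.0, min(1.0, signal(t)))    # clip to the converter range
        samples.append(int(round(value * 32767)))  # quantize to 16 bits
    return samples


def write_wav(samples, target, rate=SAMPLE_RATE):
    """Store mono 16-bit PCM samples in WAV format (file name or file object)."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


def tone(t):
    """Stand-in for the analog microphone signal: a 440 Hz sine wave."""
    return 0.5 * math.sin(2 * math.pi * 440 * t)


pcm = sample_and_quantize(tone, 0.01)
buffer = io.BytesIO()   # in-memory target; a path such as "word.wav" also works
write_wav(pcm, buffer)
```

In a real system the samples would come from the sound card rather than from a function, but the stored WAV data has the same layout.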
The digitized signal is then processed by the system to extract the fundamental or formant frequencies from the signal. For this, the signal is passed through a set of 16 band-pass filters known as a filter bank. These band-pass filters are implemented using built-in MATLAB functions. The filter bank divides the signal into 16 frequency bands, and the frequency corresponding to the maximum intensity in each band is determined.
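A rough sketch of the band-wise peak search follows. It is written in Python with the standard library only and swaps the band-pass filters for a plain discrete Fourier transform whose bins are grouped into 16 equal-width bands, so it illustrates the idea rather than the MATLAB filter bank itself. The frame length, sample rate and function names are assumptions.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitude of the DFT for bins 0 .. N/2 - 1 (direct O(N^2) evaluation)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc))
    return spectrum


def band_peak_frequencies(samples, rate, bands=16):
    """Return, for each band, the frequency (Hz) of the strongest DFT bin."""
    spectrum = magnitude_spectrum(samples)
    bins_per_band = len(spectrum) // bands
    peaks = []
    for b in range(bands):
        start = b * bins_per_band
        stop = start + bins_per_band
        strongest = max(range(start, stop), key=lambda i: spectrum[i])
        peaks.append(strongest * rate / len(samples))  # bin index to Hz
    return peaks


RATE = 8000   # assumed sample rate in Hz
N = 256       # assumed frame length in samples
frame = [math.sin(2 * math.pi * 1000 * t / RATE) for t in range(N)]
peaks = band_peak_frequencies(frame, RATE)
```

With these assumed values each band is 250 Hz wide, so the 1000 Hz test tone falls in the fifth band (index 4). Bands that contain no signal energy still report a peak, which is only numerical noise, so a practical system would also keep the peak magnitude.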
The extracted frequencies, or formants, are then matched against a pre-recorded set of frequencies known as templates to determine the spoken word. Matching is done by taking the point-to-point difference between the outputs of the 16 filters for the spoken word and for each stored template, and selecting the template with the minimum distance.
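The matching rule described above can be sketched as follows, in Python. The text does not say how the point-to-point differences are combined, so summing their absolute values is an assumption. The template words and the four-value feature vectors are hypothetical; the described system would use 16 values per word.

```python
def distance(features, template):
    """Sum of absolute point-to-point differences between two feature vectors."""
    if len(features) != len(template):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(features, template))


def closest_template(features, templates):
    """Return the word whose stored template is nearest to the features."""
    return min(templates, key=lambda word: distance(features, templates[word]))


# Hypothetical templates: peak frequencies in Hz, shortened to four bands.
templates = {
    "open": [300, 900, 2100, 3200],
    "close": [450, 1200, 2500, 3400],
}
spoken = [310, 880, 2150, 3180]
best = closest_template(spoken, templates)
```

Here the distance to "open" is 100 and the distance to "close" is 1030, so "open" is selected.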
Introduction
Speech is the most basic means of human communication. It is the way people interact with one another. Research in speech has made significant progress in speech synthesizers, speech transmission systems and automatic speech recognition (ASR), which is based on the voice.
Speech recognition draws on many fields, including physiology, psychology, linguistics, computer science and signal processing, and is even related to a person's body language. Its ultimate goal is to achieve natural language communication between man and machine. It has a very close relationship with acoustics, phonetics, linguistics, information theory, pattern recognition theory and neurobiology. Speech recognition automatically identifies and understands human spoken language through speech signal processing and pattern recognition, comparing the characteristics of the input voice signal with the voice templates stored in the computer. Researchers eventually broke through three major obstacles: large vocabulary, continuous speech and speaker-independent (non-specific speaker) recognition. The poor adaptability of speech recognition systems is mainly reflected in their dependence on the environment. Progress in speech recognition in noisy environments is very difficult because people's pronunciation varies greatly in noise: a louder voice, a slower speech rate, and changes in pitch and formants. This is known as the Lombard effect.
The broader task is the recognition of all information in human speech: transformation of speech into text, processing of meaning (understanding) and modality, recognition of prosody, recognition of language, speaker, emotion, intention and speech style, and recognition of the speaker's state of health. This is presented in Figure
Speech recognition systems available in today's market are simple speech-to-text transformation systems. Examples include voice dialing, control of household items, cars and robots, and various dictation systems. You can now talk to your TV, tablet, PC, phone or car and get more done. These speech-to-text systems are spreading in the mobile world because small keyboards are difficult to use.
Application
Speech recognition is potentially very useful and represents a big market. It has many different applications, which include the following:

Office or business systems: dictation, translation, data entry onto forms, database management and control.
Manufacturing: eyes-free, hands-free monitoring of processes; quality control for manufacturing processes.
Telephone or telecommunication: speech recognition services could cut through the menu hierarchy and remove the need for a series of touch-tone button presses.
Medical: creating and editing medical reports by voice.
Other: voice-controlled games and toys, cars, operating system tasks.

A typical application is command and control: activating the system using speech (an isolated word, a phrase or a connected sequence of words).

Speech recognition depends on many different context variations and environmental conditions. Ideally, speech recognition should be speaker independent and accept natural language and an unrestricted lexicon [4]. In a real scenario this ideal has to be relaxed, considering that human hearing is much more complex than signal processing techniques. Some issues to be taken into account are:

Background noise: fans, computers, machinery running.
Speech interference: TV, radio, background conversation.
Sound reflections due to the room geometry.
Non-stationary events: door slams, irregular road noise, car horns.
Signal degradation: microphone and transmission system distortions.
Unknown words: improper English grammar, unfamiliar accent, out-of-vocabulary words.
Unusual circumstances: stressed speaker.
Speaker sound artifacts: lip smacks, heavy breathing, mouth clicks and pops.

Background noise, speech interference and acoustic reflections can be reduced by choosing a proper microphone placed close to the speaker and pointed at the desired source. Non-stationary noise is difficult to handle and will probably lead to a bad trial that must be repeated.
The system consists of four modules. Each module performs a particular task of the system.
2.1 Input Module

This module takes the input from the user. The input is in the form of an analog speech command. This input is digitized by the microphone chip before being sent to the system. Thus the system receives a digitized speech signal as input.
2.2 Processing Module

Digitized speech signal obtained as input is processed by this module to extract the formant frequencies. These formant frequencies are used to recognize the word spoken, as these are different for different phonemes.
2.3 Pattern Matching Module

Formant frequencies extracted by the processing module are used by the pattern matching module to determine the spoken word.
2.4 Output Module

Once the word has been determined, the output module generates the appropriate operating system command. The action taken by the operating system therefore depends on the audio command spoken by the user.
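One simple way to picture the output module is a lookup table from recognized words to operating system commands. The sketch below is in Python; the words, the command names and the function name are hypothetical, since the text does not list the supported commands. It only looks the command up and does not run anything.

```python
# Hypothetical mapping from recognized words to operating system commands.
COMMANDS = {
    "notepad": ["notepad.exe"],
    "calculator": ["calc.exe"],
}


def command_for(word):
    """Return the command for a recognized word, or None if it is unknown."""
    return COMMANDS.get(word)


selected = command_for("notepad")
unknown = command_for("browser")
```

A returned command list could then be passed to a launcher such as Python's subprocess.run, while a None result would leave the operating system untouched.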
Fig 2.1 Basic structure of speech recognition systems
(Block diagram. Blocks shown: Input Speech Signal, Feature Extraction, Similarity, Reference Stored Template Model, Decision Box, Threshold, Word Recognized.)
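The decision box with its threshold can be sketched as a rejection rule. This Python example assumes that similarity is measured as a distance, where lower means more similar, and that a word is accepted only when its distance does not exceed the threshold. The threshold values and the function name are illustrative, not taken from the figure.

```python
def decide(distances, threshold):
    """Return the best-matching word, or None when no template is close enough.

    distances maps each template word to its distance from the input features.
    """
    word = min(distances, key=distances.get)
    if distances[word] <= threshold:
        return word
    return None  # reject: the speaker should repeat the word


scores = {"open": 100, "close": 1030}   # hypothetical distances
accepted = decide(scores, threshold=200)
rejected = decide(scores, threshold=50)
```

With a threshold of 200 the word "open" is accepted. With a threshold of 50 no template is close enough and the input is rejected.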
7.1 MATLAB
MATLAB (matrix laboratory) is a numerical computing environment and fourth-generation programming language. Developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, Java, and Fortran. MATLAB has millions of users across industry and academia. They come from various backgrounds of engineering, science, and economics. MATLAB is widely used in academic and research institutions as well as industrial enterprises.
(To do: list the MATLAB packages used.)
Variables

Variables are defined using the assignment operator, =. MATLAB is a weakly typed language because types are implicitly converted. It is also dynamically typed: variables can be assigned without declaring their type (except if they are to be treated as symbolic objects), and their type can change. Values can come from constants, from computation involving the values of other variables, or from the output of a function.
For example:

>> x = 17
x = 17
>> x = 'hat'
x = hat
>> y = x + 0
y = 104 97 116
>> x = [3*4, pi/2]
x = 12.0000 1.5708
>> y = 3*sin(x)
y = -1.6097 3.0000
