
Artificial Intelligence

Presented by:

A. Sowmya, annam.sowmya@gmail.com, III/IV B.TECH
Ch. Sushma, sushma195@yahoo.com, III/IV B.TECH

R.V.R&J.C COLLEGE OF ENGINEERING


CHOWDAVARAM

GUNTUR

Abstract

DEFINITION:
It is the science and engineering of making intelligent machines, especially intelligent computer programs.

APPLICATIONS:
• Game Playing
• Speech Recognition
• Understanding Natural Language
• Computer Vision
• Expert Systems
• Robotics

SPEECH RECOGNITION:

Artificial intelligence involves two basic ideas. First, it involves studying the thought processes of human beings. Second, it deals with representing those processes via machines (like computers, robots, etc.).

One of the main benefits of a speech recognition system is that it lets the user do other work simultaneously. The user can concentrate on observation and manual operations, and still control the machinery by voice input commands.

A number of algorithms for speech enhancement have been proposed. These include the following:

1. Spectral subtraction of DFT coefficients
2. MMSE techniques to estimate the DFT coefficients of corrupted speech
3. Spectral equalization to compensate for convolutional distortions
4. Spectral subtraction and spectral equalization.

CONCLUSION:
This speaker recognition technology serves many purposes. It helps physically challenged people, who can do their work using this technology without pushing any buttons. This ASR technology is also used in military weapons and in research centers. Nowadays this technology is also used by CID officers to track down criminal activity.
INDEX

Concepts

1. INTRODUCTION
2. DEFINITION
3. HISTORY
4. FOUNDATION
5. SPEAKER INDEPENDENCY
6. ENVIRONMENTAL INFLUENCE
7. SPEAKER SPECIFIC FEATURES
8. SPEECH RECOGNITION
9. APPLICATIONS
10. GOAL
11. CONCLUSION
12. BIBLIOGRAPHY

Artificial Intelligence For Speech Recognition

Introduction:

Artificial intelligence involves two basic ideas. First, it involves studying the thought processes of human beings. Second, it deals with representing those processes via machines (like computers, robots, etc.).

AI is the behavior of a machine which, if performed by a human being, would be called intelligent. It makes machines smarter and more useful, and is less expensive than natural intelligence.

Natural language processing (NLP) refers to artificial intelligence methods of communicating with a computer in a natural language like English. The main objective of an NLP program is to understand input and initiate action.

Definition:

It is the science and engineering of making intelligent machines, especially intelligent computer programs.

AI means Artificial Intelligence. “Intelligence”, however, cannot be defined, but AI can be described as the branch of computer science dealing with the simulation of intelligent behavior by machines.

History:

• Work started soon after World War II.
• The name was coined in 1956.
• Several other names have been proposed:
  • Complex Information Processing
  • Heuristic Programming
  • Machine Intelligence
  • Computational Rationality

Foundation:

• Philosophy (428 B.C.-present)
• Mathematics (c. 800-present)
• Economics (1776-present)
• Neuroscience (1861-present)
• Psychology (1879-present)
• Computer Engineering (1940-present)
• Control theory and cybernetics (1948-present)
• Linguistics (1957-present)

Speaker independency:
The speech quality varies from person to person. It is therefore difficult to build an electronic system that recognizes everyone’s voice. By limiting the system to the voice of a single person, the system becomes not only simpler but also more reliable. The computer must be trained to the voice of that particular individual. Such a system is called a speaker-dependent system.

Speaker independent systems can be used by anybody, and can recognize any voice, even though the characteristics vary widely from one speaker to another. Most of these systems are costly and complex. Also, these have very limited vocabularies.

It is important to consider the environment in which the speech recognition system has to work. The grammar used by the speaker and accepted by the system, noise level, noise type, position of the microphone, and speed and manner of the user’s speech are some factors that may affect the quality of speech recognition.

Environmental influence:

Real applications demand that the performance of the recognition system be unaffected by changes in the environment. However, it is a fact that when a system is trained and tested under different conditions, the recognition rate drops unacceptably. We need to be concerned about the variability present when different microphones are used in training and testing, and specifically during development of procedures. Such care can significantly improve the accuracy of recognition systems that use desktop microphones.

Acoustical distortions can degrade the accuracy of recognition systems. Obstacles to robustness include additive noise from machinery, competing talkers, reverberation from surface reflections in a room, and spectral shaping by microphones and the vocal tracts of individual speakers. These sources of distortions fall into two complementary classes: additive noise and distortions resulting from the convolution of the speech signal with an unknown linear system.

A number of algorithms for speech enhancement have been proposed. These include the following:
1. Spectral subtraction of DFT coefficients
2. MMSE techniques to estimate the DFT coefficients of corrupted speech
3. Spectral equalization to compensate for convolutional distortions
4. Spectral subtraction and spectral equalization.

Although relatively successful, all these methods depend on the assumption of independence of the spectral estimates across frequencies. Improved performance can be obtained with an MMSE estimator in which correlation among frequencies is modeled explicitly.

Speaker-specific features:

Speaker identity correlates with the physiological and behavioral characteristics of the speaker. These characteristics exist both in the vocal tract characteristics and in the voice source characteristics, as also in the dynamic features spanning several segments.

The most common short-term spectral measurements currently used are the cepstral coefficients derived from Linear Predictive Coding (LPC) and their regression coefficients. A spectral envelope reconstructed from a truncated set of cepstral coefficients is much smoother than one reconstructed from LPC coefficients. Therefore, it provides a more stable representation from one repetition to another of a particular speaker’s utterances.

As for the regression coefficients, typically the first and second order coefficients are extracted at every frame period to represent the spectral dynamics. These coefficients are derivatives of the time function of the cepstral coefficients and are called the delta and delta-delta cepstral coefficients respectively.

Figure 3: Speaker specific features (block diagram: speaker, speech recognition device, and applications including display, dictating, commands to computers, and input to other CBISs, robots, and expert systems; NLP understanding; dialog with user)
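To make the first enhancement method listed earlier under Environmental influence (spectral subtraction of DFT coefficients) more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method of any particular system: the noise is assumed to be stationary, its magnitude spectrum is estimated by averaging noise-only frames, the noisy phase is reused unchanged, and the names dft, idft, average_noise_magnitude, spectral_subtract and the floor parameter are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one frame of samples."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT; returns the real part of each time-domain sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def average_noise_magnitude(noise_frames):
    """Estimate the noise magnitude per DFT bin by averaging noise-only frames."""
    magnitudes = [[abs(c) for c in dft(frame)] for frame in noise_frames]
    return [sum(column) / len(magnitudes) for column in zip(*magnitudes)]


def spectral_subtract(noisy_frame, noise_magnitude, floor=0.0):
    """Subtract the estimated noise magnitude from each DFT coefficient.

    The phase of the noisy signal is kept, and each magnitude is clamped
    at `floor` so that it never becomes negative (assumed design choice).
    """
    cleaned = []
    for coeff, noise in zip(dft(noisy_frame), noise_magnitude):
        magnitude = max(abs(coeff) - noise, floor)
        cleaned.append(cmath.rect(magnitude, cmath.phase(coeff)))
    return idft(cleaned)
```

In use, one would call average_noise_magnitude on a few frames recorded while nobody is speaking and then pass each noisy speech frame through spectral_subtract. Because each frequency bin is treated on its own, the sketch also shows the independence assumption that the text criticizes.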
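The first and second order regression coefficients described in this section can be sketched as follows. This is a minimal illustration, assuming the widely used least-squares regression formula over a window of `width` frames on each side, with the first and last frames repeated at the boundaries. The source does not give a formula, and the function name delta and the width parameter are hypothetical.

```python
def delta(frames, width=2):
    """First-order regression (delta) coefficients for a list of feature vectors.

    For frame t and each coefficient c, the assumed formula is
        sum_{k=1..width} k * (c[t+k] - c[t-k]) / (2 * sum_{k=1..width} k*k)
    Frames beyond either end are replaced by the nearest existing frame.
    """
    if not frames:
        return []
    denom = 2 * sum(k * k for k in range(1, width + 1))
    last = len(frames) - 1
    result = []
    for t in range(len(frames)):
        vector = []
        for d in range(len(frames[t])):
            num = 0.0
            for k in range(1, width + 1):
                ahead = frames[min(t + k, last)][d]
                behind = frames[max(t - k, 0)][d]
                num += k * (ahead - behind)
            vector.append(num / denom)
        result.append(vector)
    return result


# Delta-delta coefficients are obtained by applying the same regression
# to the delta coefficients themselves.
coefficients = [[float(t), 2.0 * t] for t in range(10)]  # toy per-frame features
delta_coeffs = delta(coefficients)
delta_delta_coeffs = delta(delta_coeffs)
```

For a feature that grows linearly over time, the delta coefficients away from the edges equal the slope and the delta-delta coefficients there are zero, which matches their reading as first and second time derivatives.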

Speech Recognition:

The user communicates with the application through the appropriate input device, i.e., a microphone. The Recognizer converts the analog signal into a digital signal for the speech processing. A stream of text is generated after the processing. This source-language text becomes input to the Translation Engine, which converts it to the target language text.
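Once the speech has been digitized and reduced to per-frame features, a speaker-dependent recognizer of the kind shown in Figure 4 compares the input with stored templates. The sketch below is a hedged illustration of that search and pattern matching step. It uses dynamic time warping (DTW), a technique the source does not name, so that a word spoken faster or slower can still match its template. The function names and the toy one-dimensional features are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    inf = float("inf")
    # cost[i][j] is the best accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            # Euclidean distance between the two frames being aligned.
            local = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b: one frame of b covers more of a
                cost[i][j - 1],      # stretch a: one frame of a covers more of b
                cost[i - 1][j - 1],  # advance both sequences together
            )
    return cost[len(a)][len(b)]


def recognize(utterance, templates):
    """Return the word whose stored template is closest to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Toy templates recorded from one speaker (hypothetical feature values).
templates = {
    "start": [[0.0], [1.0], [2.0], [1.0]],
    "stop": [[2.0], [2.0], [0.0], [0.0]],
}
slow_start = [[0.0], [0.0], [1.0], [2.0], [2.0], [1.0]]  # "start" spoken slowly
word = recognize(slow_start, templates)
```

Training such a system amounts to recording one or more templates per word from the intended user, which is why it is tied to that speaker's voice.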

Salient Features:

• Input Modes
  • Through Speech Engine
  • Through soft copy
• Interactive Graphical User Interface
• Format Retention
• Fast and standard translation
• Interactive Preprocessing tool
  • Spell checker
  • Phrase marker
  • Proper noun, date and other package specific identifier
• Input Format
  • .txt, .doc, .rtf
• User friendly selection of multiple output
• Online thesaurus for selection of contextually appropriate synonym
• Online word addition, grammar creation and updating facility
• Personal account creation and inbox management

Applications:

One of the main benefits of a speech recognition system is that it lets the user do other work simultaneously. The user can concentrate on observation and manual operations, and still control the machinery by voice input commands.

Another major application of speech processing is in military operations. Voice control of weapons is an example. With reliable speech recognition equipment, pilots can give commands and information to the computers by simply speaking into their microphones; they don’t have to use their hands for this purpose.

Another good example is a radiologist scanning hundreds of X-rays, ultrasonograms, and CT scans and simultaneously dictating conclusions to a speech recognition system connected to word processors. The radiologist can focus his attention on the images rather than writing the text.

Voice recognition could also be used on computers for making airline and hotel reservations. A user simply states his needs to make a reservation, cancel a reservation, or make enquiries about schedules.
Figure 4: Speaker-dependent word recognizer (block diagram: input circuits, band-pass filters (BPF) each feeding an analog-to-digital converter (ADC), RAM holding digitized speech, templates, search and pattern matching program, CPU, output circuits)

Ultimate Goal:

The ultimate goal of Artificial Intelligence is to build a person, or, more humbly, an animal.

Conclusion:

This speaker recognition technology serves many purposes. It helps physically challenged people, who can do their work using this technology without pushing any buttons. This ASR technology is also used in military weapons and in research centers. Nowadays this technology is also used by CID officers to track down criminal activity.

Bibliography:

www.google.co.in/Artificial intelligence for speech recognition
www.google.com
www.howstuffworks.com
www.ieeexplore.ieee.org
