

Ranveer Singh, M.Tech (1st year), G.L.A. University



Ranveer Singh, 9/28/2011

Language is man's most important means of communication, and speech is its primary medium. The study of speech draws on researchers from the many disciplines that contribute to our understanding of its production, perception, processing, learning and use. As the world has moved towards computers, the style of communication has changed with it, and we now need to communicate with machines. Speech recognition is central to communication between humans and computers.


Speech recognition is the process of converting a speech signal to a sequence of words in the form of digital data, by means of an algorithm implemented as a computer program. Speech recognition applications that have emerged over the last few years include voice dialing (e.g., "Call home"), call routing (e.g., "I would like to make a collect call"), simple data entry (e.g., entering a credit card number), preparation of structured documents (e.g., a radiology report), domotic (home) appliance control, and content-based spoken audio search (e.g., finding a podcast where particular words were spoken).


Speech recognition technologies are of particular interest for their support of direct communication between humans and computers, through a communication mode that humans commonly use among themselves and at which they are highly skilled.
Rudnicky, Hauptmann, and Lee

http://starbase.cs.trincoll.edu/~ram/cpsc352/


What was the first success story of speech recognition?

Radio Rex, in the 1920s, was the first success story in the field of speech recognition.

www.stanford.edu/class/linguist236/lec1.pdf


1936 - AT&T's Bell Labs started studying speech recognition (later research was funded by DARPA, which was founded in 1958)
1974 - Optical character recognition
1975 - Text-to-speech synthesis (Kurzweil Reading Machine)
1978 - Speak & Spell toy released by Texas Instruments
1980 - Xerox started producing the reading machine TextBridge
1997 - Dragon Systems produced the first continuous speech recognition product

http://starbase.cs.trincoll.edu

Some examples of speech recognition software


Dragon NaturallySpeaking 8
Developed by ScanSoft
Considered the leader in speech technology

IBM ViaVoice 10
Developed by IBM
Marketed (recently) by ScanSoft

Microsoft Speech
Developed by Microsoft
Included with Office XP and Office 2003
Considered to be cumbersome


What Can S-R Software Do?

Play back dictation and hear text read aloud
Switch between applications using voice commands
Manage e-mail
Create additional user voice files
Browse the Web

Dictation
Command and control
Telephony
Medical/disabilities

"Fundamentals of Speech Recognition". L. Rabiner & B. Juang. 1993


Ease of use
Robust performance
Automatic learning of new words and sounds
Grammar for spoken language
Control of synthesized voice quality
Integrated learning for speech recognition and synthesis

B. S. Atal. Speech recognition in 2001: New research directions. Proc. Natl. Acad. Sci. USA, Vol. 92, pp. 10046-10051, Oct 1995


Integrated conversational applications
No specialized language expertise
Technology independence

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.




Audio server presents raw digitized audio to the speech recognizer
Swiftus parses the word list to produce a set of feature-value pairs
Discourse manager maintains a stack of information about the current conversation
Discourse manager and application respond to the user by sending a text string to the text-to-speech manager

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
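
The flow on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions, not the SpeechActs implementation: the function names (`recognize`, `parse_to_features`, `handle_utterance`), the `DiscourseManager` class and the toy vocabulary are all hypothetical, and the recognizer is stubbed out to return a fixed word list.

```python
# Hypothetical sketch of the slide's pipeline; none of these names come from SpeechActs.

def recognize(raw_audio: bytes) -> list[str]:
    """Stand-in for the speech recognizer: a real one would decode the audio."""
    return ["read", "my", "new", "mail"]


def parse_to_features(words: list[str]) -> dict[str, str]:
    """Stand-in for the parsing step: turn a word list into feature-value pairs."""
    actions = {"read": "read", "send": "send"}
    objects = {"mail": "mail", "calendar": "calendar"}
    features: dict[str, str] = {}
    for word in words:
        if word in actions:
            features["action"] = actions[word]
        elif word in objects:
            features["object"] = objects[word]
    return features


class DiscourseManager:
    """Keeps a stack of information about the current conversation."""

    def __init__(self) -> None:
        self.stack: list[dict[str, str]] = []

    def respond(self, features: dict[str, str]) -> str:
        # Remember what was said, then build the text to be spoken back.
        self.stack.append(features)
        action = features.get("action")
        obj = features.get("object")
        if action and obj:
            return f"OK, I will {action} your {obj}."
        return "Sorry, I did not understand that."


def handle_utterance(raw_audio: bytes, manager: DiscourseManager) -> str:
    words = recognize(raw_audio)         # digitized audio -> word list
    features = parse_to_features(words)  # word list -> feature-value pairs
    return manager.respond(features)     # text string for the text-to-speech manager


manager = DiscourseManager()
reply = handle_utterance(b"\x00\x01", manager)
print(reply)
```

Keeping each stage behind its own function mirrors the separation on the slide: the recognizer, the parser and the discourse manager can each be replaced without touching the others.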


Continuous-speech recognizers require grammars that specify every possible utterance a user could say to the application
The recognizer grammar should closely synchronize with the Swiftus semantic grammar
Solved by inventing the Unified Grammar

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.
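
One way to picture the idea of a unified grammar is a single rule set from which both the recognizer's utterance list and the semantic mapping are generated, so the two cannot drift apart. The sketch below is a toy illustration only: the rule format, the `UNIFIED_GRAMMAR` table and both `compile_*` functions are assumptions made for this example and do not reflect the actual Unified Grammar notation.

```python
from itertools import product

# Hypothetical unified grammar: each rule lists the alternative words per slot
# and the feature-value pairs that any matching utterance should produce.
UNIFIED_GRAMMAR = [
    {
        "slots": [["read", "open"], ["my"], ["mail", "messages"]],
        "features": {"action": "read", "object": "mail"},
    },
    {
        "slots": [["show"], ["my"], ["calendar", "schedule"]],
        "features": {"action": "show", "object": "calendar"},
    },
]


def compile_recognizer_grammar(grammar):
    """Enumerate every utterance the recognizer must be able to accept."""
    utterances = []
    for rule in grammar:
        for words in product(*rule["slots"]):
            utterances.append(" ".join(words))
    return utterances


def compile_semantic_grammar(grammar):
    """Map each utterance to the feature-value pairs it stands for."""
    table = {}
    for rule in grammar:
        for words in product(*rule["slots"]):
            table[" ".join(words)] = rule["features"]
    return table


recognizer_grammar = compile_recognizer_grammar(UNIFIED_GRAMMAR)
semantic_grammar = compile_semantic_grammar(UNIFIED_GRAMMAR)
print(recognizer_grammar)
print(semantic_grammar["open my messages"])
```

Because both outputs come from the same rules, every utterance the recognizer accepts has a semantic interpretation, which is the synchronization the slide asks for.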


Semantic representation generated in real time to facilitate conversation
Accurate understanding
Tolerance of misrecognized words
Wide variation among applications
Ease of use

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.


[Flow chart: speech recognition process]
Voice I/P -> voice digitization (1011011) -> speech characteristics analysis -> search/matching against database -> results
Match? Yes: process match. No: process no match.
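
The flow chart on this slide (digitize, analyze, search a database, then branch on match or no match) can be sketched as below. Everything here is an assumption made for illustration: the two toy features (energy and zero-crossing rate), the Euclidean distance, the threshold of 0.1 and the synthetic `tone` signals are hypothetical stand-ins, not the method the slide describes.

```python
import math


def digitize(samples, levels=128):
    """Quantize samples in [-1.0, 1.0] to 7-bit integer codes (voice digitization)."""
    return [round((s + 1.0) / 2.0 * (levels - 1)) for s in samples]


def analyze(codes, levels=128):
    """Toy characteristics analysis: normalized energy and zero-crossing rate."""
    mid = (levels - 1) / 2.0
    centered = [c - mid for c in codes]
    energy = sum(x * x for x in centered) / len(centered) / (mid * mid)
    crossings = sum(1 for a, b in zip(centered, centered[1:]) if (a < 0) != (b < 0))
    return [energy, crossings / (len(centered) - 1)]


def search(features, database, threshold=0.1):
    """Return the closest database entry, or None if nothing is close enough."""
    best_label, best_dist = None, float("inf")
    for label, template in database.items():
        dist = math.dist(features, template)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None


def recognize(samples, database):
    features = analyze(digitize(samples))
    label = search(features, database)
    if label is None:
        return "no match"        # "process no match" branch
    return f"match: {label}"     # "process match" branch


def tone(cycles, n=64, amplitude=0.8):
    """Synthetic test signal standing in for recorded speech."""
    return [amplitude * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]


# Hypothetical database built from two reference signals.
DATABASE = {
    "low": analyze(digitize(tone(2))),
    "high": analyze(digitize(tone(10))),
}

print(recognize(tone(2), DATABASE))
print(recognize([0.0] * 64, DATABASE))
```

The threshold is what produces the "no" branch: without it, the search would always return the nearest entry, even for silence.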


[Figure: coding and storing of speech spectrum]
Speech I/P -> vocoder (DSP) -> amplitude spectrum -> 1011010 -> memory (flash memory)
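
As a rough sketch of coding and storing a speech spectrum, the example below computes an amplitude spectrum for one short frame and quantizes each value to a 7-bit code. It makes several assumptions that are not on the slide: a naive discrete Fourier transform stands in for the vocoder's analysis, the 7-bit width is chosen only to resemble the slide's bit pattern, and a plain string plays the role of the flash memory.

```python
import cmath
import math


def amplitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (an FFT would be used in practice)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(total) / n)
    return spectrum


def encode_spectrum(spectrum, bits=7):
    """Quantize each amplitude, relative to the peak, to a fixed-width binary code."""
    peak = max(spectrum) or 1.0  # avoid dividing by zero on silence
    top = (1 << bits) - 1
    return [format(round(a / peak * top), f"0{bits}b") for a in spectrum]


# One 32-sample frame containing a single sinusoid at bin 4.
frame = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
codes = encode_spectrum(amplitude_spectrum(frame))
memory = "".join(codes)  # stand-in for the stored bit stream
print(codes[4], len(memory))
```

Scaling by the frame's peak keeps the codes within the fixed bit width, at the cost of losing the absolute level unless the peak is stored as well.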

Conversational pacing
Explicit error corrections
Define the functional boundaries of an application

Paul Martin, Fredrick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich. SpeechActs: A Spoken Language Framework, IEEE Computer, Vol. 29, Number 7, July 1996.


Voice recognition is the identification or verification of an individual's identity using speech as the identifying characteristic, that is, by its auditory and vocal characteristics. An individual's speech spectrum has the form shown here.
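
The difference between identification and verification can be sketched as follows. This is a minimal illustration under stated assumptions: the enrolled feature vectors are made-up numbers, and cosine similarity with a threshold of 0.95 is just one plausible comparison, not a method taken from the slides.

```python
import math

# Hypothetical enrolled voiceprints: speaker -> feature vector (made-up numbers).
ENROLLED = {
    "alice": [0.9, 0.2, 0.4],
    "bob": [0.1, 0.8, 0.6],
}


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def identify(sample, enrolled):
    """Identification: pick the enrolled speaker whose voiceprint is most similar."""
    return max(enrolled, key=lambda name: cosine_similarity(sample, enrolled[name]))


def verify(sample, claimed, enrolled, threshold=0.95):
    """Verification: accept or reject one claimed identity."""
    return cosine_similarity(sample, enrolled[claimed]) >= threshold


sample = [0.85, 0.25, 0.35]
print(identify(sample, ENROLLED))
print(verify(sample, "alice", ENROLLED), verify(sample, "bob", ENROLLED))
```

Identification compares the sample against everyone enrolled, while verification compares it against a single claimed speaker and needs a threshold to say no.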

Medical transcription, mainly in radiology and pathology
First use of speech recognition in the field of radiology in 1981
Mean accuracy rate of reading pathology reports using IBM ViaVoice Pro software: 93.6%, compared to human transcription at 99.6%

M. Al-Aynati, K. Chorneyko. Comparison of Voice-automated Transcription and Human Transcription in General Pathology Reports. Arch Pathol Lab Med. 2003;127:721-725
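
To put the two accuracy figures in perspective, it helps to turn them into error rates. The short calculation below assumes the percentages are counted per word and uses a hypothetical 1000-word report; neither assumption comes from the cited study.

```python
from decimal import Decimal

speech_accuracy = Decimal("93.6")  # IBM ViaVoice Pro, from the slide
human_accuracy = Decimal("99.6")   # human transcription, from the slide

speech_error = Decimal("100") - speech_accuracy  # error rate in percent
human_error = Decimal("100") - human_accuracy
ratio = speech_error / human_error

# Assumption: accuracy is per word, so errors scale with report length.
words = 1000  # hypothetical report length
speech_mistakes = speech_error / 100 * words
human_mistakes = human_error / 100 * words

print(speech_error, human_error, ratio)
print(speech_mistakes, human_mistakes)
```

Under these assumptions a six-point gap in accuracy corresponds to sixteen times as many errors, which is why small differences near 100% matter for transcription work.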


13% used voice recognition
16% discontinued using voice recognition
21% believed chairside computer use could be improved with better voice recognition
Automatic speech recognition will be the way to go!

T. Schleyer et al. (unpublished data). Chairside Computer Use in Clinical Dentistry


LIMITATIONS OF S.R.S
The task has been viewed as one of de-sensitising recognisers to variability. It is not entirely clear that this idea adequately models the parallel process in human speech perception.


MERITS OF S.R.S
The uses of speech technology are wide ranging. Most effort at the moment centers on providing voice input and output for information systems, for example over the telephone network. The idea is to make information available to those who do not want to face a keyboard and screen, or cannot.


