
SPEECH RECOGNITION

1. ABSTRACT
This document defines a set of evaluation criteria and test methods for speech recognition systems used in vehicles. The FONIX product was found to be the easiest and quickest system available on the market today for developing speech recognition software. The evaluation of this product in vehicles and noisy environments, and its accuracy and suitability for control under various conditions, are also included in this document for testing purposes. The effects of engine noise, interference by turbulent air outside the car, interference by the sounds from the car's radio/entertainment system, and interference by the sounds of the car's windshield wipers are all considered separately. Recognition accuracy was compared using a variety of different road routines, noisy environments and languages. Testing in the "ideal" non-noisy environment of a quiet room has also been performed for comparison.

DEPT OF ECE, PVKKIT Page -

2. INTRODUCTION


In this report we concentrate on speech recognition programs that are human-computer interactive. When software evaluators observe humans testing such software programs, they gain valuable insights into technological problems and barriers that they might never witness otherwise. Testing speech recognition products for universal usability is an important step before considering the product to be a viable solution for its customers later. This document concerns Speech Recognition accuracy in the automobile, which is a critical factor in the development of hands-free human-machine interactive devices. There are two separate issues that we want to test: word recognition accuracy and software friendliness. Major factors that impede recognition accuracy in the automobile include noise sources such as tire and wind noise while the vehicle is in motion, engine noise, noises produced by the car radio/entertainment systems, fans, windshield wipers, horn, turn signals, heater, A/C, temperature sets, cruise control speed setting, headlight, emergency flashers, and others listed below.

But what is speech recognition? Speech recognition works like this. You speak into a microphone and the computer transforms the sound of your words into text to be used by your word processor or other applications available on your computer. The computer may repeat what you just said or it may give you a prompt for what you are expected to say next. This is the central promise of interactive speech recognition. Early speech recognition programs made you speak in staccato fashion, insisting that you leave a gap between every two words. You also had to correct any errors virtually as soon as they happened, which meant that you had to concentrate so hard on the software that you often forgot what you were trying to say.

The new voice recognition systems are certainly much easier to use. You can speak at a normal pace without leaving distinct pauses between words. However, you cannot really use "natural speech" as claimed by the manufacturers. You must speak clearly, as you do when you speak to a Dictaphone or when you leave someone a telephone message. Remember, the computer is relying solely on your spoken words. It cannot interpret your tone or inflection, and it cannot interpret your gestures and facial expressions, which are part of everyday human communication. Some of the systems also look at whole phrases, not just the individual words you speak. They try to get information from the context of your speech to help work out the correct interpretation. This is how they can (sometimes) work out what you mean when there are several words that sound similar (such as "to," "too" and "two"). [1]

There are several speech recognition software programs available on the market now. These include Dragon Naturally Speaking [6], IBM ViaVoice Millennium Pro [7], Philips FreeSpeech [8] and Fonix Embedded Speech [9] from Fonix Company. Unfortunately, with the other speech recognition products, you cannot just start speaking into the microphone. There are some preliminaries. Every person's voice is different, so the systems need information about the way you speak before they turn you loose on the task of dictation. This is called "Training." You are asked by the systems to read some sentences and letters (in Philips FreeSpeech) or some more interesting extracts from books (in Dragon Naturally Speaking and IBM ViaVoice Millennium). This process is not required in Fonix Embedded Speech. Training takes as little as 5 minutes (Dragon Naturally Speaking) or 10 minutes (IBM ViaVoice Millennium), up to 45 minutes (Philips FreeSpeech), and 0 minutes (Fonix Embedded Speech).

Overall, to me Fonix Embedded Speech was the easiest and quickest to set up, based on the facts listed in its attractive specification and benefits [11]. IBM ViaVoice Millennium was not far behind [10], and probably had better introductory resources, such as a video on CD. However, its training process was at times frustrating, as the marking of errors seemed to lag considerably behind the user's speech. This seemed to cause the program to identify multiple errors and slowed the process down. All the products have online help available on their websites. The Fonix Embedded Speech product also has introductory training resources available online for software developers who want to learn about the software before they decide to purchase it.

The goal of this project is to define a set of evaluation criteria and test methods for interactive voice recognition systems, such as the Fonix product, used in vehicles. These are set out in the details of testing requirements below.

3. DETAILS OF TESTING REQUIREMENTS


1. Define a set of evaluation criteria and test methods for interactive voice recognition systems when used in a vehicle (an automobile with the radio turned "on", for example). The evaluator will need to think about which factors affect the quality of speech recognition.
2. Evaluation of the selected system (FONIX) in these environments:
   - Accuracy under various conditions:
     - Environmental
     - Speaker
     - Speaker location in vehicle
   - Suitability for control of:
     - Entertainment Systems (Radio, CD, etc.)
     - Environmental Controls (Heater, AC, Temperature Set, etc.)
     - Critical Controls (Cruise Control Speed Setting, Headlight, emergency flashers, etc.)
   - Noise environment:
     - Train station and busy commute streets
     - Siren sounds from emergency vehicles
   The evaluator will need to select and test various words for each.


4.PRODUCT EVALUATION
The selected system (Fonix Embedded Speech) was chosen for this test of accuracy in automobiles because it was found to be the easiest and quickest system available on the market today for software developers to build a speech recognition application with, and for end-users to use [14]. Five adults with five different accents were involved in the tests (1 English-speaking man, 1 Indian-speaking man, 1 Chinese-speaking woman, 1 French-speaking woman and 1 Vietnamese-speaking man). The purpose of having these people test the software was to see how well it performs with different voices. A Hewlett-Packard Pavilion zt1145 notebook (P-III 1.2 GHz, 512 MB RAM) was used to test the product. The testers varied the sequence in which they performed the tests, so that different results could be obtained each time the testers spoke. Most of the tests that were conducted at night, in cold weather, or outside were completed by the author of this document, the Vietnamese-speaking tester.

The Andrea Array DA-400 Microphone from Andreaelectronics.com was used to test the product in the car. The DA-400 is a directional microphone that rejects sound coming from the rear or side. The optimum distance from the microphone to the tester is 18 inches [13], so it is well suited to placement on the driver's visor or on the dash in front of the driver, where his or her voice can be picked up accurately.

I built several small test programs in different categories (such as ordering pizzas, cat and dog, map directions and phone dialer) with various words (approximately 20-100 words), using the Fonix Embedded Speech software. The direction program (quite a difficult one, with a total of over 100 words, numbers and commands) was then used and scored for errors. In the first test, the testers forgot that they needed to speak more slowly than normal while testing the software, so we ended up with some errors in which the software could not detect their commands or sentences. I then corrected the errors from the first test by having the testers speak more slowly, change their tone or change the environment, and then ran a second test.

Each tester uttered a set of programs with different words each time at three speeds, 0 (with engine idling), 30 and 50 M.P.H., and under the following six conditions in the vehicle:
- Baseline (windows up; fans, radio, and windshield wipers off)
- Driver's window down
- Fans on (first heater on and then A/C on)
- Radio/Entertainment system on
- Windshield wipers on
- Cruise control speed setting on
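As a rough illustration of how the combinations described above add up, the sketch below enumerates every speed, condition and tester combination. It is not part of the original test harness; the names `SPEEDS_MPH`, `CONDITIONS`, `TESTERS` and `build_test_matrix` are hypothetical, and the speed values simply follow the paragraph above.

```python
from itertools import product

# Assumed values taken from the paragraph above; 0 means engine idling.
SPEEDS_MPH = [0, 30, 50]
CONDITIONS = [
    "baseline",
    "driver window down",
    "fans on",
    "radio/entertainment on",
    "windshield wipers on",
    "cruise control on",
]
TESTERS = ["English", "Indian", "Chinese", "French", "Vietnamese"]


def build_test_matrix(speeds, conditions, testers):
    """Return one dict per (speed, condition, tester) combination."""
    return [
        {"speed_mph": speed, "condition": condition, "tester": tester}
        for speed, condition, tester in product(speeds, conditions, testers)
    ]


matrix = build_test_matrix(SPEEDS_MPH, CONDITIONS, TESTERS)
print(len(matrix))  # 3 speeds * 6 conditions * 5 testers = 90
```

Listing the combinations up front makes it easy to check that no speed or condition was skipped for any tester before the error averages are computed.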

The testers in turn used the following programs to test the product. For the purposes of this document, each program is referred to by number.

Samples / Name in this report / Examples of Spoken Narrative

Sample: Dialer number
Name in this report: Program 1
Spoken narrative: Hello! What number would you like to dial? Is that correct? Dialing. Thank you for using our Dialer software.

Sample: Cat and Dog animal
Name in this report: Program 2
Spoken narrative: Hello! Would you like to tell me about your favorite animal? Now, please tell me about your favorite animal. Is your animal a ___? So, now I know your favorite animal is ___. Thank you for using our Cat and Dog software. (The blanks are the place for the software to repeat the words of animals that you have just spoken.)

Sample: Order pizza
Name in this report: Program 3
Spoken narrative: Hello! What order number would you like? Is order Number ___ correct? Please give us 15 minutes to prepare your pizza for you. Thank you for using our Order Pizza Software. (The blank is the place for the software to repeat the numbers or words that you have just spoken.)

Sample: Direction1
Name in this report: Program 4
Spoken narrative: Hello! What is your destination? Is ___ correct? What is your origin? Is ___ correct? The direction from ___ to ___ is ___. Would you like to get another direction or cancel? Thank you for using our Direction software. (The blanks are the place for the software to repeat what you have just spoken and what it is supposed to say for the direction.)

Sample: Direction2
Name in this report: Program 5
Spoken narrative: Works exactly like the Direction1 software. The only difference is that users can use the unlimited words feature of the software to make their choices for origins and destinations.

The purpose of these programs is to test the software product on a varied choice of words. The intention is to count how many words the software product can recognize from the users, and how this number changes when the software operates in different environments. Please see the table above for the words and commands that were used in the programs to test the system. Each of the programs that I developed was used 10 times by each of my testers to test the Fonix product. Each run took place at a different location and in a different environment, and this was repeated until the Fonix product had been tested in every possible circumstance. The error average for this report was then calculated by taking the mean over all test runs.
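The averaging step can be sketched as follows. This is only an illustration of taking the mean error count over repeated runs; the function name `average_errors` and the error counts in `example_runs` are made up for the example and are not measurements from this report.

```python
from statistics import mean


def average_errors(runs):
    """Map each (tester, program) key to the mean error count over its runs."""
    return {key: mean(counts) for key, counts in runs.items()}


# Hypothetical error counts for 10 runs per tester/program pair.
example_runs = {
    ("Tester 1", "Program 1"): [2, 1, 0, 1, 1, 2, 0, 1, 1, 1],
    ("Tester 2", "Program 1"): [3, 2, 2, 4, 1, 3, 2, 2, 3, 3],
}

averages = average_errors(example_runs)
for (tester, program), value in averages.items():
    print(tester, program, value)
```

Keeping the raw per-run counts and averaging at the end, rather than recording only the averages, also makes it possible to look at the spread between runs later.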


5. OUTCOMES
5.1. Mistakes made by human and the system
The training process improves the accuracy of speech recognition systems quite quickly if the end-users or the software developers take the time to correct mistakes. This is important. The voice model created after your 'enrollment' is constantly changed and updated as you correct misinterpretations made by the systems. This is how the programs "learn." If you do this properly, the accuracy you obtain will improve. If you don't, the accuracy will deteriorate.

There are three types of corrections you need to make when you are modifying text. The first is when you cough or get tongue-tied, and the word comes out nothing like what you intended. The Speech Recognition systems make an honest (if sometimes humorous) attempt to translate your jumble. The systems always repeat the word you just said, so this is also the way to detect errors made by the systems. To fix such an error, select the word and say it again properly in place of the mistake, or just delete the word and start over.

The second circumstance is when you simply change your mind. You said "this" but you now want to say "therefore." You can make these changes at any time because the Speech Recognition system has not made a mistake. You simply change the word (by typing or by voice). In both of these first two cases, the Speech Recognition software has not made a mistake.

In the third type of correction the software gets it wrong. If you say "this" and the system interprets the word as "dish," then you need to go through a correction procedure to enable the system to learn from its mistake. You cannot just backspace and try again, tempting though that might be. Why is this? The reason is that modern Speech Recognition is not based on individual words, but on small sounds within the words. Information gained about the way you say "th" will be generalized to other words with a "th." Thus if the system misinterprets the word "this," it does not mean the error is restricted to that one word. If you do not correct the error, other "th" words might be affected. If you do correct the error, then the new information is also spread to other words, improving accuracy. Your voice model will always be getting better or worse. This is something to think through thoroughly.

In each of the programs mentioned above except Fonix Embedded Speech, once you enter the correction process you are offered a list of alternatives in a correction window. If the word you said is in the list, you simply select the number next to it (by mouse or voice). If it is not there, you spell the word by keyboard or voice. New words are added to the active vocabulary automatically. You are automatically asked to train the new word in IBM Via Voice Speech; it is an option in Dragon Naturally Speaking and Philips FreeSpeech. If you are dictating a word that is unlikely to be in the program's vocabulary, you can enter a spell mode and spell the word out. This feature is available in all of the programs. Because the Fonix Embedded Speech program does not require the training process, it is not covered here. Of the other systems, I found that IBM Via Voice Speech had the best correction system.

The types of mistakes made by the Fonix Embedded Speech program are quite interesting. In my tests the program nearly always got words like "erroneously" or "spokesperson" correct. These larger words are distinctive and therefore easier for the system to recognize. But it had problems distinguishing "or" and "all," and it nearly always interpreted "or ordered" as just "ordered," perhaps assuming a stutter at the start of the word. I was disappointed that in very few tests was the phrase "two invoices" interpreted correctly, in spite of the plural noun. I almost always got "to invoices." This also tells users that they must pronounce words correctly, or the system will keep producing the same error.
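One common way to count mistakes such as "two invoices" coming out as "to invoices" is a word-level edit distance between what was said and what was recognized. The sketch below is an assumption about how such counting could be done, not the scoring procedure used in this report; the function name `word_errors` is hypothetical.

```python
def word_errors(reference, hypothesis):
    """Minimum number of word substitutions, insertions and deletions
    needed to turn the reference phrase into the recognized phrase."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # dp[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # match or substitution
            )
    return dp[-1][-1]


print(word_errors("two invoices", "to invoices"))  # 1 (one substitution)
print(word_errors("or ordered", "ordered"))        # 1 (one deletion)
```

Both misrecognitions described above count as a single word error under this measure, even though one is a substitution and the other a dropped word.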

5.2. Baseline (windows up, fans, radio/entertainment system, and windshield wipers off)
Since the goal in collecting the data was to make it as realistic as possible, the testing conditions were somewhat variable and reflected what an untrained population of testers or circumstances might produce. For instance, I had to pre-record the siren sounds of emergency vehicles such as police vehicles, fire trucks and ambulances, save them onto a VHS tape, make several duplicates and play them in a closed-door room with the volume loud enough to sound like an emergency vehicle passing by, instead of sitting in a car for hours waiting for one. Doing it this way not only saved me time in testing the Fonix product but also gave more accurate results. In my experience, there is also not enough time to test the software product even if we are lucky enough to catch an emergency vehicle passing by. It takes only 2-3 seconds for the emergency vehicle to pass, but it takes a lot longer than 2-3 seconds to test the Fonix product.

With the windows up, and the fans, radio, and windshield wipers off, we had an almost quiet environment inside the car at 0 M.P.H. (the engine idling). Each tester took a turn testing the Fonix product with several of the programs listed above while the engine was running in the background. The noise coming from the engine at first bothered the programs when we used the built-in microphone on the notebook. However, the errors were reduced to almost none when we plugged the Array Microphone back in. The errors this time were caused only by the accents of the testers. Below are the graphs showing the errors I noted with and without the Array Microphone. The errors on the graphs are averages over the test runs of the programs.

[Graph: With Array Microphone. Errors (0 to 2) by Program (1 to 3) for the American, Indian, Chinese, French and Vietnamese testers.]

[Graph: Without Array Microphone. Errors (0 to 6) by Program (1 to 3) for the American, Indian, Chinese, French and Vietnamese testers.]

FIG: 5.2 With and without array microphone

I then decided to have my testers test the product while the car was running at two different speeds, 30 and 55 M.P.H., under the same baseline conditions, since these are the standard speeds on streets in the United States. The built-in microphone on the notebook was useless again this time because the notebook was placed too far away from the tester, who was behind the wheel. The noise from the engine was louder and the testers (who were also the drivers) had to pay attention to the road, so we decided to use only the Array Microphone from then on to test the Fonix product. With the Array Microphone located on the driver's visor, we were quite surprised by the results that we obtained after several sequences of testing. The error percentage was very low, under 20%, compared to what I had predicted before we made the tests. However, we had to turn up the volume of the microphone on the notebook while we were driving on freeways at 55 M.P.H.; this was due to the noise coming from the tires under the car as well as the noise from the engine. Despite this, I was very pleased with the results.

The car's entertainment system was an issue when we turned its volume to the maximum level. We could hardly hear each other talking, and to test the product we had to put the Array Microphone very close to the testers' mouths in order for the microphone to detect their words. The product could not respond because almost nobody could say anything clearly enough while they screamed into the microphone with the music playing loudly in the background. This was just our experiment to see how the Fonix product reacts in a circumstance like that. In reality, however, no one would try to use a Speech Recognition device with the entertainment system turned to maximum. Anyone who tries will simply fail to use the device or program. Other than that, we had no problem testing the product with the volume of the entertainment system at its normal level. Out of curiosity, we even rolled the window on the driver's side down and left the music playing in the background. This combination of noises did not have a big effect on our testing results. The testers only had to speak their commands and words a little louder than their normal speech.

5.3. Driver's window down (noises from streets, tires, and other vehicles)
With the window on the driver's side rolled down, we had completely different results for our test process. Not only did the noises from the tires and engine bother the software, but the noise from the streets and from other vehicles passing by really did trouble it. The volume of the microphone on the notebook had to be turned up almost to maximum, and the testers almost screamed their responses and commands in order to test the software when we were on freeways. Sometimes we found it impossible to test the product, especially at the busiest commute times of the day. However, we had fewer disturbances when we were on city streets, or at night, even on freeways, when the streets were not crowded and the freeways were empty.

To eliminate the problems we had on the freeways during daylight, I used the tape of siren sounds from emergency vehicles and played it in a closed-door room to get more accurate results, as mentioned above. With the same technique, I also created a noisy scene in the same location, just like the one we found on the streets at the busiest commute times, where all kinds of noises existed. Thanks to these "creative techniques", my testers and I did not have to spend time outside; we could stay inside the closed-door room and still obtain the same results with even more accuracy. The graphs below show the error percentages obtained from testing the Fonix product in the car with the window on the driver's side rolled down, in daylight versus at night. The car was in motion at speeds of 30 and 55 M.P.H. at various locations. The error percentages are averages over the test runs of the product.

[Graph: Daylight. Errors (0 to 8) by Program (1 to 5) for the American, Indian, Chinese, French and Vietnamese testers.]

[Graph: Night time. Errors (0 to 4) by Program (1 to 5) for the American, Indian, Chinese, French and Vietnamese testers.]

Figure 5.3. Daylight time vs. night time

6. Environmental Controls (Fans, Heater, A/C, and Windshield wipers on)


We had two sets of testing in this category. The software product was tested with the heater or A/C on, and then it was tested with the windshield wipers on and a water hose pouring water on the car heavily enough to make it seem like it was raining outside.

First, the noises obtained with the heater on were the noises coming from the fans and the engine. We did not have any problems testing the product while the engine was idling. Only a few problems arose when the car was in motion. The noises from the fans and the engine when the car was in motion caused a very small number of problems. My guess was that my testers would cause most of the problems this time. They would be distracted by the temperature and the noise from the fans in the car. As a result, sometimes they could not hear the questions the programs asked, and they often forgot what they needed to say in response as well. What happened was just what I predicted, because the result was quite different when we turned the heater off and turned the A/C on. With the same type of noise coming from the fans and the engine, but the temperature in the car cooler, my testers were a lot calmer. They did not make the same mistakes that they made when the heater was on. This confirmed my prediction. Thus, the noises from the fans when the heater or A/C is turned on do not affect the performance of the system at all. Only the temperature in the car when the heater is on causes minor problems for the drivers.

Secondly, we sprayed water from a hose onto the windshield of the car and turned the wipers on to create a scene similar to rain outside. We placed the microphone on the dash near the windshield to see whether noise coming from the windshield wipers affects the performance of the Fonix product. It turned out that the Array Microphone was really the best choice to purchase for use with the Fonix product. My testers did not have to raise their voices over the noise of the wipers while testing the software product. This showed that the noise of water pouring heavily on the windshield and the noise of the wipers did not affect the performance of the product at all. Below are the graphs containing the error percentages I obtained from both sets of testing: the testing set with the heater and A/C turned on, and the testing set with water pouring heavily on the windshield and the wipers turned on.

[Graph: Heater and A/C on (in turn). Errors (0 to 4) by Program (1 to 5) for the American, Indian, Chinese, French and Vietnamese testers.]

[Graph: Water pouring on windshield and wipers on. Errors (0 to 2) by Program for the American, Indian, Chinese, French and Vietnamese testers.]

Figure 6. Heater and A/C turned on in a car (in turn) vs. water pouring on windshield and the wipers on. The error numbers on the graph on the left were an average of how many times my testers made mistakes while testing the product with the heater and the A/C turned on in turn. In contrast, the error numbers on the graph on the right were an approximation for the tests with the water being poured heavily onto the windshield.

6.1. Critical Controls (Cruise Control Speed Setting, Emergency lights, Headlights, variety of different speeds and location of speakers)
I had my testers operate the car normally while testing the product at the same time. That is, they used signal lights to change lanes, they used the cruise control speed settings, they adjusted the mirrors, they drove at different speeds and they even turned on the lights in the car to pretend that they wanted to look for something on the passenger seat. The noise from the emergency lights when they were turned on really did not have any effect on the performance of the product, nor did the cruise control speed settings or the variety of different speeds. We were very pleased with the results that we got from testing the product in this category. This was the most enjoyable set of tests that we had from the beginning.

Additionally, the product was tested under very cold temperatures (30 degrees Fahrenheit outside) and very warm temperatures (70 degrees Fahrenheit inside the car). In both conditions, I did not see any big problems caused by the product. The only problems that I recorded were caused by the testers, especially when it was too warm inside the car.

The speakers were mostly located on the passenger seat, because of the limited length of the wire connecting the microphone to the notebook. The testers did not have any trouble hearing the programs. Given this result, it would be a lot better if they could hear the programs from the built-in speakers in the car.

7. Errors
Table 2 presents the number of errors committed by each tester during each testing process, as observed by me.

Table 2 - Errors Committed by Users During Each Exercise

Columns: Description of Exercise, Tester 1, Tester 2, Tester 3, Tester 4, Tester 5
Rows: Complete New User Setup; Complete Enrollment/Training; Dictate Three Samples, Three Times; Total Errors
Values, in the order they appear in the source (cell positions not preserved): 3 2 3 6 2 6 7 2 6 2 10 3 4 6 3 2

Table 3 depicts the accuracy rate of each sample. Each tester used the sample to test the software three times, and the software product responded in the patterns that were preset. The purpose was just to see, in a short time, how many errors these testers would make. I then compared the original text with the transcribed text and calculated the number of errors. Words, sentences and digits made up each sample. Digits were not a concern. Each sample consists of a number of words, variables or digits: Sample One had an unlimited set of digits, six words and fourteen characters; Sample Two had twelve digits, thirty words, and unlimited characters; Sample Three had fifty eight words, and unlimited characters; Sample Four had one hundred five words and unlimited digits; and Sample Five had unlimited words and unlimited digits. Only the first three samples were used to test the product, and the results are collected in Table 3 below to show the accuracy of the software product. The focus of this study was to measure the ability of the software product and to use it in the automobile for communication purposes; therefore, accurate transcription of words and sentences is the most important variable.

The experiment with the same type of programs and testers was also conducted in a quiet room with no noise at all. It was quite a pleasant experiment because the system could get almost every word that the testers said, if they said it clearly and loudly enough. At one point, the system could not respond when I intentionally spoke words very quietly with the microphone placed about 10 feet away. Because almost no errors were produced in this experiment, I have included no table for it here.

Table 3 - Errors in Transcription of Samples

Accuracy Rate for Samples (%)
Sample                                                                   Reading 1   Reading 2   Reading 3   Average
Sample One (unlimited set of digits, six words and fourteen characters)     93          91          93          93
Sample Two (twelve digits, thirty words, and unlimited characters)          70          70          78          73
Sample Three (fifty eight words, and unlimited characters)                  87          88          90          88

Average Accuracy Rate for User in Percent (%)
Sample          Tester 1   Tester 2   Tester 3   Tester 4   Tester 5
Sample One         96         78         93         96        100
Sample Two         83         51         62         77         93
Sample Three       91         76         89         90         95

Transcription Errors by Sample, per tester and reading, in the order they appear in the source (cell positions not preserved):
2 0 0 0 0 2 0 0 2 0 0
3 10 9 2 6 13 9 4 2 3 12 8 10 2
15 4 6 2 5 15 6 5 4 9 12 9 7 2
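The relationship between error counts, accuracy rates and the sample averages can be sketched as below. The formula in `accuracy_percent` (words minus errors, divided by words) is an assumption, since the report does not state how accuracy was derived from the error counts; the per-tester figures in `PER_TESTER_ACCURACY` are the per-user averages as read from Table 3, and all names are hypothetical.

```python
def accuracy_percent(total_words, errors):
    """Assumed formula: share of words transcribed without error, in percent."""
    return 100.0 * (total_words - errors) / total_words


def sample_average(per_tester_accuracy):
    """Mean of the per-tester accuracy rates, rounded to a whole percent."""
    return round(sum(per_tester_accuracy) / len(per_tester_accuracy))


# Per-tester average accuracy (%) for Testers 1 to 5, as read from Table 3.
PER_TESTER_ACCURACY = {
    "Sample One": [96, 78, 93, 96, 100],
    "Sample Two": [83, 51, 62, 77, 93],
    "Sample Three": [91, 76, 89, 90, 95],
}

for sample, rates in PER_TESTER_ACCURACY.items():
    print(sample, sample_average(rates))

print(accuracy_percent(30, 3))  # 90.0 for a 30-word sample with 3 errors
```

Under these assumptions the three sample averages come out at 93, 73 and 88 percent, which agrees with the averages reported for Samples One, Two and Three.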

The designers of Fonix Embedded Speech claim the average accuracy rate of the product is 97% and above. This high rate of accuracy was established when the testers completed the entire enrollment process (training). As previously mentioned, time did not allow each tester to complete the entire enrollment process. However, even with only the first level of the enrollment completed, the product scored some impressive accuracy rates. Sample One set an average accuracy rate of 93% for the five testers. Sample Two did not fare as well. The voice quality of testers Number Two and Three may have played a part in the reduced accuracy of Sample Two. Tester Number Two's voice was "boomy" and lacked "space" in between words. Tester Number Three had speech impediments and rounded off many consonants while speaking. Sample Three scored an accuracy rate of 88%, with tester Number Two still scoring well below all testers in the group. Tester Number Five continually scored very high in all samples, along with tester Number One. Not only was his voice mature and without any speech impediments, but he is also the author of the samples as well as of this document. Because of this, I had more experience than the other testers in using the samples during the testing process. Table 3 supports this fact. Sample One improved from 91% on the Second Reading to 93% on the Third Reading. Sample Two improved from 70% on the First Reading to 78% on the Third Reading. Sample Three improved from 87% on the First Reading to 90% on the Third Reading.

Think-aloud observation and posttest interviews offer a first-person view of the testers' experiences using the product. Posttest interview data for this usability study indicate that testers had fun using Fonix Embedded Speech. Some testers were expecting the software to operate in one way and were surprised when it operated in quite a different way. For example, one tester assumed the conversion to text would be instantaneous. From a technological standpoint, the conversion process may take a few seconds, and the ability to do a conversion in fractions of a second is unrealistic. Testers were not troubled by the complexity or the look of the user interface. Screen colors were easy on the eyes and the text was easy to read for all users.

Table 4 - Results of Post-Usability Evaluation Survey
Testers: 1 2 3 4 5
This is a software evaluation questionnaire for Fonix Embedded Speech DDK.

@.1. Directions


Please fill in the leftmost circle if you strongly agree with the statement. Mark the next circle if you agree with the statement. Mark the middle circle if you are undecided. Mark the fourth circle if you disagree, and mark the fifth or last circle if you strongly disagree with the statement.

Columns: Strongly Agree / Agree / Undecided / Disagree / Strongly Disagree

1. I would recommend this software to my colleagues 0
2. The instructions and prompts are helpful 0
3. Learning to use this software is difficult 0
0 3 3 0 0 3 2 0

4. I sometimes do not know what to do next 0
5. I enjoyed my session with this software 0
6. I find the help information useful 0
7. It takes too long to learn the software commands 0
8. Working with this software is satisfying 0
9. I feel in command of this software when I use it 0
10. I think this software is inconsistent 0
11. This software is awkward to use when I want to do something that is not standard 0
12. I can perform tasks in a straightforward manner using this software 0
13. Using this software is frustrating 0
14. It is obvious that the software was designed with the user's needs in mind 0
15. At times, I have felt tense using this software 0
16. Learning how to use new functions is difficult 0
17. It is easy to make the software do exactly what you want 0
18. This software is awkward 0
19. I have to look for assistance most times when I use this software 0
20. The software has not always done what I was expecting 0
2 0 3 0 0 0 5 0 2 3 0 0 0 3 2 0 3 0 2 0 2 2 0 0 4 0 3 2 0 0 2 2 0 0 4 0 2 2 0 3 0 0 4 0 3 2 0 2 3 0 0 0 2 2
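One way to summarize a single questionnaire statement is a mean rating over the five response columns. The following is a minimal sketch under stated assumptions: the function name `mean_rating` and the example counts are hypothetical and are not figures taken from Table 4, and the 1-to-5 scoring (1 = Strongly Agree, 5 = Strongly Disagree) simply follows the left-to-right circle order given in the directions.

```python
# Column order follows the questionnaire directions (leftmost circle first).
SCALE = ("Strongly Agree", "Agree", "Undecided", "Disagree", "Strongly Disagree")


def mean_rating(counts):
    """Return the mean rating for one statement.

    counts: number of testers who marked each column, in SCALE order.
    Scoring is an assumption: 1 = Strongly Agree ... 5 = Strongly Disagree.
    """
    if len(counts) != len(SCALE):
        raise ValueError("expected one count per response column")
    total = sum(counts)
    if total == 0:
        raise ValueError("no responses recorded")
    weighted = sum(score * n for score, n in enumerate(counts, start=1))
    return weighted / total


# Hypothetical tally for five testers: three agree, two undecided.
example_counts = [0, 3, 2, 0, 0]
print(f"mean rating: {mean_rating(example_counts):.2f}")
```

For negatively worded statements (for example, "Using this software is frustrating"), a lower mean would indicate a worse experience, so such items are usually reverse-scored before ratings are averaged across statements.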


Survey data presented in Table 4 also support the posttest interview data. Testers agreed that basic software operation and access to the enrollment process are straightforward and easy to perform. [...] product to others, while two remain [...]
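The reading-to-reading accuracy figures quoted earlier for Samples One to Three can be compared as percentage-point gains. The sketch below uses only the percentages stated in the text; the dictionary layout and the variable names are assumptions made for illustration. Note that the text reports Sample One from the Second Reading, not the First.

```python
# (starting accuracy %, Third Reading accuracy %) as quoted in the text.
# Sample One starts from the Second Reading; the others from the First.
readings = {
    "Sample One": (91, 93),
    "Sample Two": (70, 78),
    "Sample Three": (87, 90),
}

# Gain in percentage points between the starting and the Third Reading.
gains = {name: last - first for name, (first, last) in readings.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")

# Sample with the largest improvement across readings.
best = max(gains, key=gains.get)
print(f"largest gain: {best}")
```

On these figures the lowest-scoring sample, Sample Two, also shows the largest gain, which is consistent with the observation that practice with the samples improves recognition results.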

? Conclusion
The designers of Fonix Embedded Speech DDK make few assumptions about their users. In my opinion, the software does an admirable job of addressing the usability needs of a wide audience. It is necessary to ensure that designers develop software products with universal usability in mind. It is also my opinion that the Fonix Embedded Speech DDK designers created a generally solid software product that almost anyone can use with success. The software recorded and analyzed each test subject's voice successfully. Afterwards, each user could dictate and the PC transcribed the dictation with relative accuracy. Voice-to-text transcription is a proven feature of Fonix Embedded Speech DDK.

However, using this software as a communication device for the automobile is as yet unproven, at least in my opinion. First, I have not figured out a way to input external files into a program that needs storage for its data. For example, in Samples Four and Five, the programs need places to store the origins and destinations entered by the users so that they can be used later for the directions. Because the software lacks this feature, I had to be creative and work around it to make the Samples behave the way I wanted. It may be that I have not yet discovered the best features of the product and that the features I was looking for are still hidden somewhere within the package.

The built-in microphone on my notebook did not work well with the software in noisy environments. The product needed assistance from the Array Microphone made by Andrea Electronics. Despite its size, the Array Microphone performs extremely well. It can reduce noise to minimal levels and helps the software considerably, especially when users want to use it in a very noisy environment such as a train station, a freeway, the vicinity of a hospital, or busy commuter roads.

A suggestion for further research is to select users who actually would like to use speech recognition for their own needs. This way, researchers could obtain more accurate results and build a better device for these people. Because the testers in this document did not intend to use speech recognition for any of their purposes, at least for the time being, their opinions of the software were very general and broad. Recruiting motivated users would matter a great deal if a device were to be built directly on the results of this document.

Overall, Fonix Embedded Speech DDK is one of the most substantial and easy-to-use programs on the market today. Software developers who have very little knowledge of programming languages would be able to use this product without frustration or hesitation.

Good solutions are now available for speech recognition, but the main variable now is the user. If you are prepared to correct errors so that your speech model improves, then you will most likely benefit from increased accuracy. If you desire (or need) to use speech recognition, then your initiative will carry you through the early stages of training and lower results, where the faint-hearted will be tempted to give up. If you are prepared to discipline your speech and give the speech system a fair chance, then your results are likely to be rewarding. It all boils down to whether or not you really want to create text this way. If you do, then with some care in the selection of equipment, speech recognition is ready for you.

This usability evaluation is one small yet important step in verifying the value of computing technology, specifically speech recognition software, in meeting the current needs of the market. It supports the software manufacturer's claims that the software is easy to use and that whoever uses the product can become productive in a short period of time. If further evaluation and research support these claims, integration of speech recognition software into today's market should be seriously considered.

