S13 Using Speech

M13: Using Speech
Andy Wigley | Microsoft Technical Evangelist

Rob Tiffany | Microsoft Enterprise Mobility Strategist
Target Agenda | Day 1
Module and Topic | 10-minute breaks after each session / 60-minute meal break
Planned Duration
1a - Introducing Windows Phone 8 Application Development | Part 1 50:00
1b - Introducing Windows Phone 8 Application Development | Part 2 50:00
2 - Designing Windows Phone Apps 50:00
3 - Building Windows Phone Apps 50:00
4 - Files and Storage on Windows Phone 8 50:00
Meal Break | 60-minutes 60:00
5 - Windows Phone 8 Application Lifecycle 50:00
6 - Background Agents 25:00
7 - Tiles and Lock Screen Notifications 25:00
8 - Push Notifications 30:00
9 - Using Phone Resources on Windows Phone 8 50:00
Target Agenda | Day 2
Module and Topic | 10-minute breaks after each session / 60-minute meal break
Planned Duration
10 - App to App Communication 35:00
11 - Network Communication on Windows Phone 8 50:00
12 - Proximity Sensors and Bluetooth 35:00
13 - Speech Input on Windows Phone 8 35:00
14 - Maps and Location on Windows Phone 8 35:00
15 - Wallet Support 25:00
16 - In App Purchasing 25:00
Meal Break | 60-minutes 60:00
17 - The Windows Phone Store 50:00
18 - Enterprise Applications in Windows Phone 8: Architecture and Publishing 50:00
19 - Windows 8 and Windows Phone 8 Cross Platform Development 50:00
20 - Mobile Web 50:00
Module Agenda
Speech on Windows Phone 8
Speech synthesis
Controlling applications using speech
Voice command definition files
Building conversations
Selecting application entry points
Simple speech input
Speech input and grammars
Using Grammar Lists
Speech on Windows Phone 8
Windows Phone Speech Support
Windows Phone 7.x had voice support built into the operating system
Programs and phone features could be started by voice commands, e.g. "Start MyApp"
Incoming SMS messages could be read to the user
The user could compose and send SMS messages
Windows Phone 8 builds on this to allow applications to make use of speech
Applications can speak messages using the Speech Synthesis feature
Applications can be started and given commands
Applications can accept commands using voice input
Speech recognition requires an internet connection, but Speech Synthesis does not

Speech Synthesis
Enabling Speech Synthesis
If an application wishes to use speech output, the ID_CAP_SPEECH_RECOGNITION capability must be enabled in WMAppManifest.xml (the same capability covers both synthesis and recognition)

The application can also reference the Synthesis namespace
using Windows.Phone.Speech.Synthesis;
Simple Speech
The SpeechSynthesizer class provides a simple way to produce speech
The SpeakTextAsync method speaks the content of the string using the default voice
Note that the method is an asynchronous one, so the calling method must use the async modifier
Speech output does not require a network connection
async void CheeseLiker()
{
    SpeechSynthesizer synth = new SpeechSynthesizer();

    await synth.SpeakTextAsync("I like cheese.");
}

Selecting a language
The default speaking voice is selected automatically from the locale set for the phone
The InstalledVoices class provides a list of all the voices available on the phone
The code below selects a French voice
// Query for a voice that speaks French (requires using System.Linq).
var frenchVoices = from voice in InstalledVoices.All
                   where voice.Language == "fr-FR"
                   select voice;

// Set the voice as identified by the query.
// ElementAt(0) throws if no French voice is installed.
synth.SetVoice(frenchVoices.ElementAt(0));

Demo 1: Voice Selection
Speech Synthesis Markup Language
You can use Speech Synthesis Markup Language (SSML) to control the spoken output
Change the voice, pitch, rate, volume, pronunciation and other characteristics
Also allows the inclusion of audio files into the spoken output
You can also use the Speech synthesizer to speak the contents of a file
<?xml version="1.0" encoding="ISO-8859-1"?>
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <p> Your <say-as interpret-as="ordinal">1st</say-as> request was for
      <say-as interpret-as="cardinal">1</say-as> room on
      <say-as interpret-as="date" format="mdy">10/19/2010</say-as> ,
      arriving at <say-as interpret-as="time" format="hms12">12:35pm</say-as>.
  </p>
</speak>
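SSML is XML, so any text inserted into it has to be escaped. The following is a minimal sketch, not part of the original deck, of building a small SSML string in code. The class name SsmlBuilder and its Build method are hypothetical; only System.Xml.Linq is used, so the sketch runs off the phone as well.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper (not from the original slides).
static class SsmlBuilder
{
    // The SSML namespace used by the <speak> element.
    static readonly XNamespace Ssml = "http://www.w3.org/2001/10/synthesis";

    // Wraps plain text in a minimal SSML document with a prosody rate
    // such as "slow", "medium" or "fast". XElement escapes characters
    // like '&' and '<' in the text automatically.
    public static string Build(string text, string rate)
    {
        var speak = new XElement(Ssml + "speak",
            new XAttribute("version", "1.0"),
            new XAttribute(XNamespace.Xml + "lang", "en-US"),
            new XElement(Ssml + "prosody",
                new XAttribute("rate", rate),
                text));
        return speak.ToString(SaveOptions.DisableFormatting);
    }
}

// On the phone, the result would be passed to the synthesizer, e.g.:
//   await synth.SpeakSsmlAsync(SsmlBuilder.Build("I like cheese.", "slow"));
```

The phone-side call is shown only as a comment because it needs the Windows.Phone.Speech.Synthesis namespace and a running device or emulator.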
Controlling Applications using Voice Commands
Application Launching using Voice Commands
The Voice Command feature of Windows Phone 7 allowed users to start applications
In Windows Phone 8 the feature has been expanded to allow the user to request data from the application in the start command
The data will allow a particular application page to be selected when the program starts and can also pass request information to that page
To start using Voice Commands you must create a Voice Command Definition (VCD) file that defines all the spoken commands
The application then calls a method to register the words and phrases the first time it is run
The Fortune Teller Program
The Fortune Teller program will tell your future
You can ask it questions and it will display replies
It could also speak them
Some of the spoken commands activate different pages of the application and others are processed by the application when it starts running
<CommandPrefix> Fortune Teller </CommandPrefix>
<Example> Will I find money </Example>
<Command Name="showMoney">
    <ListenFor> [Will I find] {futureMoney} </ListenFor>
    <Feedback> Showing {futureMoney} </Feedback>
    <Navigate Target="/money.xaml"/>
</Command>
<PhraseList Label="futureMoney">
    <Item> money </Item>
    <Item> riches </Item>
    <Item> gold </Item>
</PhraseList>
The Voice Command Definition (VCD) file
This is the money question: "Fortune Teller, Will I find money"
CommandPrefix: the phrase the user says to trigger the commands. All of the Fortune Teller commands start with this phrase
Example (for the app): example text that will be displayed by the help for this app as an example of the commands the app supports
Command Name: the command name. This can be obtained from the URL by the application when it starts
Example (inside a Command): the example for this specific command
ListenFor: the trigger phrase for this command. It can be a sequence of words. The user must prefix this sequence with the words Fortune Teller
{futureMoney}: the phrase list for the command. The user can say any of the words in the phrase list to match this command. The application can determine the phrase used. The phrase list can be changed by the application dynamically
Feedback: the spoken feedback from the command. The feedback will insert the phrase item used to activate the command
Navigate Target: the URL for the page to be activated by the command. Commands can go to different pages, or all go to MainPage.xaml if required
PhraseList: the phrases that can be used at the end of the command. The application can modify the phrase list of a command dynamically. For example, it could give movie times for films by name

Installing a Voice Command Definition (VCD) file
The VCD file can be loaded from the application or from any URI
In this case it is just a file that has been added to the project and marked as Content
The VCD can also be changed by the application when it is running
The voice commands for an application are loaded into the voice command service when the application runs
The application must run at least once to configure the voice commands
async void setupVoiceCommands()
{
    await VoiceCommandService.InstallCommandSetsFromFileAsync(
        new Uri("ms-appx:///VCDCommands.xml", UriKind.RelativeOrAbsolute));
}
Launching Your App With a Voice Command
If the user now presses and holds the Windows button, and says:
"Fortune Teller, Will I find gold?"
the phone displays "Showing gold"
It then launches your app and navigates to the page associated with this command, which is /money.xaml
The query string passed to the page looks like this:
"/?voiceCommandName=showMoney&futureMoney=gold&reco=Fortune%20Teller%20Will%20I%20find%20gold"
voiceCommandName: the command name (showMoney)
futureMoney: the phrase list name, with the recognized phrase as its value (gold)
reco: the whole phrase as it was recognized
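On the phone, NavigationContext.QueryString already exposes these values as a decoded dictionary. To make the structure of the launch URI concrete, here is a minimal sketch, not from the original deck, that decodes such a string by hand. The class name VoiceCommandQuery and its Parse method are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the original slides).
static class VoiceCommandQuery
{
    // Splits the part after '?' into decoded key/value pairs.
    public static Dictionary<string, string> Parse(string uri)
    {
        var result = new Dictionary<string, string>();
        int q = uri.IndexOf('?');
        if (q < 0) return result;   // no query string at all

        string[] pairs = uri.Substring(q + 1)
            .Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string pair in pairs)
        {
            int eq = pair.IndexOf('=');
            string key = eq < 0 ? pair : pair.Substring(0, eq);
            string value = eq < 0 ? string.Empty : pair.Substring(eq + 1);
            // %20 and other escapes are turned back into plain characters.
            result[Uri.UnescapeDataString(key)] = Uri.UnescapeDataString(value);
        }
        return result;
    }
}
```

Parsing the Fortune Teller launch string this way gives three entries: voiceCommandName, futureMoney, and reco, with reco holding the full spoken phrase with its spaces restored.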
Handling Voice Commands
This code runs in the OnNavigatedTo method of a target page
Can also check for the voice command phrase that was used
// Only react to a fresh navigation, not to the user returning to the page.
if (e.NavigationMode == System.Windows.Navigation.NavigationMode.New) {
    if (NavigationContext.QueryString.ContainsKey("voiceCommandName")) {
        string command = NavigationContext.QueryString["voiceCommandName"];
        switch (command) {
            case "tellJoke":
                messageTextBlock.Text = "Insert really funny joke here";
                break;
            // Add cases for other commands.
            default:
                messageTextBlock.Text = "Sorry, what you said makes no sense.";
                break;
        }
    }
}

Identifying phrases
The navigation context can be queried to determine the phrase used to trigger the navigation
In this case the program is selecting between the phrases that can be used in the money question

string moneyPhrase = NavigationContext.QueryString["futureMoney"];
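Once the phrase has been read from the query string, the page can choose what to show. A minimal sketch of that selection, not from the original deck, follows. The class name FortuneReplies and all of the reply texts are hypothetical and are only there for illustration.

```csharp
using System;

// Hypothetical helper (not from the original slides).
static class FortuneReplies
{
    // Maps the phrase list item the user spoke to a reply.
    // The reply texts are invented for this example.
    public static string ReplyFor(string moneyPhrase)
    {
        string phrase = (moneyPhrase ?? string.Empty).Trim().ToLowerInvariant();
        switch (phrase)
        {
            case "money":  return "Money is coming your way.";
            case "riches": return "Riches await the patient.";
            case "gold":   return "You will find gold.";
            default:       return "The future is unclear.";
        }
    }
}
```

The default branch matters because the phrase list can be changed at run time, so the page may receive a phrase it was not written to handle.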
Demo 2: Fortune Teller
Modifying the phrase list
An application can modify a phrase list when it is running
It cannot add new commands however
This would allow a program to implement behaviours such as:
"Movie Planner, tell me showings for Batman"

VoiceCommandSet fortuneVcs = VoiceCommandService.InstalledCommandSets["en-US"];

await fortuneVcs.UpdatePhraseListAsync("futureMoney",
new string[] { "money", "cash", "wonga", "spondoolicks" });

Simple Speech Input
Recognizing Free Speech
A Windows Phone application can recognise words and phrases and pass them to your program
From my experiments it seems quite reliable
Note that a network connection is required for this feature
Your application can just use the speech string directly
The standard Listening interface is displayed over your application
Simple Speech Recognition
The method below checks for a successful response
By default the system uses the language settings on the Phone
SpeechRecognizerUI recoWithUI;

private async void ListenButton_Click(object sender, RoutedEventArgs e)
{
    this.recoWithUI = new SpeechRecognizerUI();

    SpeechRecognitionUIResult recoResult =
        await recoWithUI.RecognizeWithUIAsync();
    if (recoResult.ResultStatus == SpeechRecognitionUIStatus.Succeeded)
        MessageBox.Show(string.Format("You said {0}.",
            recoResult.RecognitionResult.Text));
}
Handling Errors
An application can bind to events which indicate problems with the audio input
There is also an event fired when the state of the capture changes
recoWithUI.Recognizer.AudioProblemOccurred += Recognizer_AudioProblemOccurred;
recoWithUI.Recognizer.AudioCaptureStateChanged +=
    Recognizer_AudioCaptureStateChanged;
...

void Recognizer_AudioProblemOccurred(SpeechRecognizer sender,
    SpeechAudioProblemOccurredEventArgs args)
{
    MessageBox.Show("Please speak more clearly");
}
Demo 3: In App Speech Recognizer
Profanity
Words that are recognised as profanities are not displayed in the response from a recognizer command
The speech system will also not repeat them
They are enclosed in <Profanity> </Profanity> when supplied to the program that receives the speech data
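An application that shows the recognized text itself may want to mask the tagged words before display. The following is a minimal sketch, not from the original deck, which assumes the tags appear in the result text exactly as described above. The class name ProfanityFilter and its Mask method are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original slides).
static class ProfanityFilter
{
    // Matches a <Profanity>...</Profanity> pair and captures the word(s) inside.
    static readonly Regex Tagged = new Regex(
        "<Profanity>(.*?)</Profanity>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Replaces each tagged span with asterisks of the same length
    // as the trimmed word, and drops the tags themselves.
    public static string Mask(string recognizedText)
    {
        return Tagged.Replace(recognizedText,
            m => new string('*', m.Groups[1].Value.Trim().Length));
    }
}
```

Text without any tags passes through unchanged, so the method can be applied to every recognition result.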
Review
Applications in Windows Phone 8 can use speech generation and recognition to interact with users
Applications can produce speech output from text files which can be marked up with Speech Synthesis Markup Language (SSML) to include sound files
Applications can be started and provided with initial commands by registering a Voice Command Definition File with the Windows Phone
The commands can be picked up when a page is loaded, or the commands can specify a particular page to load
An application can modify the phrase part of a command to change the activation commands
Applications can recognise speech using complex grammars or simple word lists

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
