W3C Voice Browser Activity


VoiceXML - processing

VoiceXML - example

Figure: VoiceXML example

 <?xml version="1.0" encoding="UTF-8"?>
 <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pizza-mixed">
   <grammar src="pizza.grxml"/>
   <initial name="pizzaall">
    <prompt>Welcome to FI pizzeria</prompt>
    <nomatch count="2"><assign name="pizzaall" expr="true"/></nomatch>
    <noinput count="2"><assign name="pizzaall" expr="true"/></noinput>
   </initial>
   <field name="kind">
    <prompt>What kind of pizza do you want?</prompt>
    <nomatch>We have salami, mozzarella and Apollo pizza</nomatch>
    <noinput>We have salami, mozzarella and Apollo pizza</noinput>
    <grammar src="pizza.grxml#kind"/>
   </field>
   <field name="topping">
    <prompt>What topping do you want?</prompt>
    <nomatch>We offer ketchup and chilli.</nomatch>
    <noinput>We offer ketchup and chilli.</noinput>
    <grammar src="pizza.grxml#topping"/>
   </field>
   <field name="drink">
    <prompt>What do you want to drink?</prompt>
    <nomatch>Select one of coke, sprite and water</nomatch>
    <noinput>Select one of coke, sprite and water</noinput>
    <grammar src="pizza.grxml#drink"/>
   </field>
   <field name="ack">
    <prompt>Did you order <value expr="kind"/> pizza with <value
    expr="topping"/> and <value expr="drink"/>?</prompt>
    <grammar src="yesno.grxml"/>
    <!-- executable content such as <if> must sit inside <filled> -->
    <filled>
     <if cond="ack=='yes'">
      <prompt>Order submitted</prompt>
      <clear namelist="kind topping drink ack"/>
     </if>
    </filled>
   </field>
  </form>
 </vxml>
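
A VoiceXML interpreter walks a form like this one with the form interpretation algorithm: on each pass it selects the first form item whose variable is still undefined, collects input for it, and stores the result. The Python sketch below imitates only that select-and-fill loop for the pizza form. The function name `run_form` and the scripted `answers` dictionary are hypothetical stand-ins for a real recognizer; grammars, `nomatch`/`noinput` events, and `<filled>` processing are left out.

```python
from typing import Dict, List, Optional

# Form items of the pizza form, in document order.
FORM_ITEMS: List[str] = ["pizzaall", "kind", "topping", "drink", "ack"]


def run_form(answers: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the order in which form items are visited.

    `answers` is a hypothetical stand-in for recognition results: for each
    visited item it gives the slot values that the caller's reply fills.
    """
    slots: Dict[str, Optional[str]] = {name: None for name in FORM_ITEMS}
    visited: List[str] = []
    while True:
        # Select phase: first item whose variable is still undefined.
        current = next((n for n in FORM_ITEMS if slots[n] is None), None)
        if current is None:
            break  # every item is filled, the form is done
        visited.append(current)
        # Collect phase: one reply may fill several slots at once
        # (mixed initiative), not only the selected one.
        slots.update(answers[current])
        # Mark the visited item itself as done, like
        # <assign name="pizzaall" expr="true"/> in the form above.
        if slots[current] is None:
            slots[current] = "true"
    return visited


# The caller answers the welcome prompt with a full order, so only the
# confirmation field is visited afterwards.
print(run_form({
    "pizzaall": {"kind": "salami", "topping": "ketchup", "drink": "coke"},
    "ack": {"ack": "yes"},
}))
```

If the reply to the initial prompt fills nothing, the loop falls back to asking for `kind`, `topping`, and `drink` one by one, which is the directed part of the mixed-initiative dialogue.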

SRGS (Speech Recognition Grammar Specification)

SRGS - example

Figure: SRGS grammar referenced in the previous VoiceXML example (pizza.grxml)

 <?xml version="1.0" encoding="UTF-8"?>
 <grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
          root="mixed" xml:lang="en-US">
  <rule id="mixed">
    <item><ruleref special="GARBAGE"/> <ruleref uri="#kind"/> pizza <ruleref special="GARBAGE"/> <ruleref uri="#topping"/> and <ruleref uri="#drink"/></item>
  </rule>

  <rule id="kind">



SISR (Semantic Interpretation for Speech Recognition)

SSML (Speech Synthesis Markup Language)

SSML - example of loudness and breaks

Figure: SSML Breaks and loudness control example

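 <?xml version="1.0" encoding="utf-8"?>
 <speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis"
        xml:lang="cs-CZ">
  <prosody volume="loud">
   Dobre rano. <break /> <!-- Czech: "Good morning." -->
  </prosody>
  <prosody volume="default">
   Jak se mate? <!-- Czech: "How are you?" -->
  </prosody>
 </speak>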

SSML - example of intonation modeling

Figure: SSML Intonation modeling

 <speak ...>
  <prosody contour="(0%,50Hz) (75%,+10%) (80%,+20%) (90%,+30%)">
   Mas se dobre? <!-- Czech: "Are you doing well?" -->
  </prosody>
 </speak>
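
Each pair in the `contour` attribute gives a time position, as a percentage of the duration of the contained text, and a pitch target for that position. As a rough illustration, the Python sketch below splits such a string into pairs. The helper name `parse_contour` is hypothetical, and the sketch only parses the string; it does not model how a synthesizer interpolates between the targets.

```python
import re
from typing import List, Tuple

# One "(position%,target)" pair; whitespace around the comma is tolerated.
PAIR = re.compile(r"\(\s*([0-9.]+)%\s*,\s*([^)\s]+)\s*\)")


def parse_contour(contour: str) -> List[Tuple[float, str]]:
    """Split an SSML contour string into (time position in %, pitch target)."""
    return [(float(pos), target) for pos, target in PAIR.findall(contour)]


points = parse_contour("(0%,50Hz) (75%, +10%) (80%, +20%) (90%,+30%)")
for position, target in points:
    print(f"{position:5.1f}% of the text -> pitch target {target}")
```

Read this way, the example holds an absolute 50 Hz target at the start and then raises the pitch in three steps over the last quarter of the utterance, which is the rising intonation of a yes/no question.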

PLS (Pronunciation Lexicon Specification)

PLS Structure

PLS - example

Figure: PLS pronunciation example

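 <?xml version="1.0" encoding="utf-8"?>
 <lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
       alphabet="ipa" xml:lang="cs-CZ">
  <lexeme>
   <!-- grapheme is an assumption; the source shows only the phonemes -->
   <grapheme>ČR</grapheme>
   <phoneme>tʃˈeː ˈer</phoneme>
   <phoneme>tʃˈeskaː rˈepublˌika</phoneme>
  </lexeme>
 </lexicon>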

Call Control XML

State Chart XML

State Chart XML - Relation to Dialogue

SCXML - Demo

Example 1: Process planning demo

Process state diagram

SCXML - Demo

Example 1: Corresponding SCXML

<?xml version="1.0" encoding="UTF-8"?>
<scxml version="1.0" xmlns="http://www.w3.org/2005/07/scxml" initial="Created">
 <state id="Created">
  <transition target="Waiting" event="enqueue"/>
 </state>
 <state id="Waiting">
  <transition target="Running" event="assign"/>
 </state>
 <state id="Running">
  <!-- event names contain no spaces: the event attribute is a
       space-separated list of event descriptors -->
  <transition target="Blocked" event="wait_for_resource"/>
  <transition target="Waiting" event="timeout"/>
  <transition target="Terminated" event="terminate"/>
 </state>
 <state id="Blocked">
  <transition target="Waiting" event="resource_available"/>
 </state>
 <final id="Terminated"/>
</scxml>
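
To see how the state chart reacts to a sequence of events, the transitions can be copied into a lookup table and stepped through by hand. The Python sketch below does that; it is not an SCXML interpreter. The names `TRANSITIONS` and `run` are hypothetical, and the event names are written with underscores (an assumption of this sketch, because SCXML treats a space in the `event` attribute as a separator between event descriptors).

```python
from typing import Dict, Iterable, List, Tuple

# (current state, event) -> next state, copied from the state chart.
# Event names use underscores here (an assumption of this sketch).
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("Created", "enqueue"): "Waiting",
    ("Waiting", "assign"): "Running",
    ("Running", "wait_for_resource"): "Blocked",
    ("Running", "timeout"): "Waiting",
    ("Running", "terminate"): "Terminated",
    ("Blocked", "resource_available"): "Waiting",
}
FINAL_STATES = {"Terminated"}


def run(events: Iterable[str], state: str = "Created") -> List[str]:
    """Feed events to the state chart and return the states passed through."""
    trace = [state]
    for event in events:
        if state in FINAL_STATES:
            break  # a final state accepts no further events
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            continue  # no matching transition: the event is ignored
        state = next_state
        trace.append(state)
    return trace


# A process that blocks once on a resource before it terminates.
print(run(["enqueue", "assign", "wait_for_resource",
           "resource_available", "assign", "terminate"]))
```

An event that has no transition in the current state, such as `assign` while the process is still in `Created`, leaves the state unchanged.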