Considerations for developing VoiceXML in Spanish

Language considerations for developing VoiceXML in Spanish
This section contains information that is specific to Spanish. If you are developing
Spanish voice applications, use the information in this section instead of the
equivalent US English sections.
- “Built-in field types and grammars”
- “Predefined events”
- “Built-in commands”
- “Specifying character encoding”
- “Testing built-in field types”
- “SSML elements and attributes”
Built-in field types and grammars
Table 1 shows the built-in field types and grammars for Spanish.
Table 1. Spanish built-in types. Brackets “[]” around a keyword mean that the keyword is optional. The vertical-bar
symbol “|” indicates a choice between two or more keywords.
Element
Implementation details
boolean
Users can say one of the positive responses sí, verdadero, afirmativo, cierto, correcto, exacto, seguro, claro, exactamente,
correctamente, justo, OK, seguramente, claramente, vale, es verdad, está bien, cómo no, and de acuerdo, or one of the negative
responses no, falso, incorrecto, negativo, and inexacto.
Users can also provide DTMF input: 1 is yes, and 2 is no.
Note: The IBM TTS engine is unable to synthesize the return value for this language.
currency
Users can say currency values as a combination of a main currency component (euro, peso, dólar), such as “treinta y siete euros,”
and a cent component (céntimo, centavo), such as “cincuenta centavos.” Users can also say one of these components without the
other. If both components are present, they can be either not separated (as in “veinte euros cincuenta”), or separated by the word
“y” or the word “con” (as in “dos euros con cincuenta”).
A main currency component may be in any of these formats: “cero euros,” “un euro,” “m euros” (where 2 <= m <= 999,999,999)
and “m” (where 1 <= m <= 999,999,999).
A cent component may be in any of these formats: “cero céntimos(centavos),” “un céntimo(centavo),” “n céntimos(centavos)”
(where 1 <= n <= 99).
If both components are present, the cent component can also have the format “n” (where 0 <= n <= 99).
Note that if users just say an integer value “m” (where 1 <= m <= 999,999,999), this will be interpreted as a number of euros.
Note also that users can also speak two numerical values, where the first is between 0 and 999,999,999, and the second is between
0 and 99, optionally separated by the word y or con. This will be interpreted as a number of euros followed by a number of cents.
The main currency can also be any of the following: peso argentino, peso chileno, peso colombiano, peso cubano, peso
dominicano, peso mexicano, balboa, bolívar, boliviano, colón, colón costarricense, colón salvadoreño, córdoba, guaraní, lempira,
nuevo sol, and quetzal.
Users can also provide DTMF input using the numbers 0 through 9 and optionally the * key (to indicate a decimal separator), and
must terminate DTMF entry using the # key.
The return value sent is a string in the format UUUddddddddd.cc, where UUU is a currency indicator, as shown below.
Value   Currency
PAB     balboa
VEB     bolívar
BOB     boliviano
SVC     colón, colón salvadoreño
CRC     colón costarricense
NIO     córdoba
USD     dólar
EUR     euro
PYG     guaraní
HNL     lempira
GBP     libra esterlina
PEN     nuevo sol
MXN     peso, peso mexicano
ARS     peso argentino
CLP     peso chileno
COP     peso colombiano
CUP     peso cubano
DOP     peso dominicano
GTQ     quetzal
ECS     sucre
Note: The IBM TTS engine is unable to synthesize the return value for this language.
date
Users can say a date using days (cardinal numbers 1 to 31, or the ordinal number primero), months, and years, as well as the
words ayer, hoy, and mañana. The year must be between 1900 and 2099.
A full date starts with the day number, followed by the word de preceding the month name, or by the word del preceding the
month number (a cardinal number between 1 and 12). The month is followed by the word de preceding the year (between 1900
and 2099), for example, “siete de junio de dos mil dos.”
Users can also specify a 2-digit year, such as “noventa y cinco.” A 2-digit year will be interpreted as being between 1911 and 2010.
Users can also say the day and month without specifying a year. This will be interpreted as a date in the current year.
Any of these constructs can be preceded by a day of the week, such as “viernes siete de junio de dos mil dos.” However, the day
of the week will be ignored. Thus the specified date will be accepted, whether or not it falls on the specified day.
Users can also provide DTMF input in the form yyyymmdd.
Note: The date grammar does not perform leap year calculations. February 29th is accepted as a valid date regardless of the year.
If desired, your application or servlet can perform the required calculations.
The return value sent is a string in the format yyyymmdd, with the VoiceXML browser returning a ? in any positions omitted in
spoken input.
Note: The IBM TTS engine is unable to synthesize the return value for this language.
digits
Users can say non-negative integer values as strings of individual digits (0 through 9). For example, a user could say “cero uno
dos tres cuatro cinco seis siete ocho nueve.”
Users can also provide DTMF input using the numbers 0 through 9, and must terminate DTMF entry using the # key. The return
value sent is a string of one or more digits. If the result is subsequently used in <say-as> with the interpret-as value vxml:digits, it
will be spoken as a sequence of digits appropriate to the current language. For example, the TTS engine speaks “123456” as “uno
dos tres cuatro cinco seis.”
number
Users can say numbers (that is, positive and negative integers, 0, and decimals) from -999,999,999.9999 to 999,999,999.9999.
Users can say the words punto or coma to indicate a decimal separator, menos to indicate a negative number, and más to indicate
a positive number (which is the default).
Users can also provide DTMF input using the numbers 0 through 9 and optionally the * key (to indicate a comma), and must
terminate DTMF entry using the # key. Only positive numbers can be entered using DTMF.
The return value sent is a string of one or more digits, 0 through 9, with a decimal point and a + or - sign as applicable.
Note: The IBM TTS engine is unable to synthesize the return value for this language.
phone
Users can say a telephone number, including the optional word extensión or interno, meaning extension.
In addition to digits (1 to 9), users can use 2-digit numbers and also the numbers 100, 200, 300, 400, 500, 600, 700, 800, and 900 if
they match the end of the telephone number or the end of the extension.
Users can also provide DTMF input using the numbers 0 through 9 and optionally the * key (to represent the word “extension”),
and must terminate DTMF entry using the # key. The return value sent is a string of digits which includes an x if an extension was
specified. However, the IBM TTS engine is unable to synthesize the return value for this language.
time
Users can say a time of day using hours and minutes in either 12-hour clock or 24-hour clock format, as well as the words ahora,
mediodía, and medianoche.
Times can be stated in any of the following formats:
For the 24-hour clock:
Una hora y M minutos
H horas [y] M minutos
H horas y M
La una y M [minutos]
Las H y M [minutos]
La una cero X
Las H cero X
La una
Las H
Las H horas
where 0 <= H <= 23 (but not 1), 0 <= M <= 59 (minuto instead of minutos when M=1),
and 0 <= X <= 9.
For the 12-hour clock:
las H [en punto] [de la mañana | de la tarde | de la noche]
las H y cuarto [de la mañana | de la tarde | de la noche]
las H y media [de la mañana | de la tarde | de la noche]
las H menos cuarto [de la mañana | de la tarde | de la noche]
las H y M (minutos) [de la mañana | de la tarde | de la noche]
las H menos M (minutos) [de la mañana | de la tarde | de la noche]
where 1 <= H <= 12 (la instead of las when H=1)
where 1 <= M <= 29 (minuto instead of minutos when M=1)
[el] mediodía
[la] medianoche
ahora
Users can also provide DTMF input using the numbers 0 through 9.
The return value sent is a string in the format hhmmx, where x is a for a.m., p for p.m., h for 24-hour format, or ? if unspecified or
ambiguous. For DTMF input, x will always be h or ?, since there is no mechanism for specifying a.m. or p.m.
Note: The IBM TTS engine is unable to synthesize the return value for this language.
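As a rough illustration of the currency return format described in Table 1, the following Python sketch splits a UUUddddddddd.cc string into its currency indicator and amount. It assumes the value is processed outside the VoiceXML browser, for example in a server-side application. The function name parse_currency, the sample values, and the decision to accept 1 to 9 digits (padded or not) before the decimal point are assumptions made for this example, not documented behavior.

```python
import re
from decimal import Decimal

# Currency indicators listed in Table 1.
CURRENCY_CODES = {
    "PAB", "VEB", "BOB", "SVC", "CRC", "NIO", "USD", "EUR", "PYG", "HNL",
    "GBP", "PEN", "MXN", "ARS", "CLP", "COP", "CUP", "DOP", "GTQ", "ECS",
}

# Assumption: 1 to 9 integer digits, a decimal point, and exactly 2 cent digits.
_CURRENCY_RE = re.compile(r"([A-Z]{3})([0-9]{1,9})\.([0-9]{2})")


def parse_currency(value: str) -> tuple[str, Decimal]:
    """Split a currency return value into (indicator, amount)."""
    match = _CURRENCY_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not in UUUddddddddd.cc format: {value!r}")
    code, units, cents = match.groups()
    if code not in CURRENCY_CODES:
        raise ValueError(f"unknown currency indicator: {code}")
    # Decimal avoids the rounding surprises of binary floats for money.
    return code, Decimal(f"{units}.{cents}")


# Hypothetical return value for "dos euros con cincuenta".
code, amount = parse_currency("EUR2.50")
print(code, amount)  # EUR 2.50
```

Using Decimal rather than float keeps the cent component exact, which matters if the amount is later summed or compared.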
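The date entry in Table 1 notes that the grammar accepts February 29th for any year and that omitted positions are returned as ?. A minimal sketch of the kind of check an application could perform on the yyyymmdd return value is shown here, assuming a Python back end. The helper name check_date is hypothetical, and treating any value that contains ? as incomplete is a simplifying assumption.

```python
from datetime import date
from typing import Optional


def check_date(value: str) -> Optional[date]:
    """Return a date for a complete, valid yyyymmdd string, otherwise None."""
    if len(value) != 8:
        return None
    # Positions omitted in spoken input arrive as "?"; treat them as incomplete.
    if "?" in value:
        return None
    if not (value.isascii() and value.isdigit()):
        return None
    year, month, day = int(value[:4]), int(value[4:6]), int(value[6:])
    try:
        # datetime.date rejects impossible dates, including February 29th
        # in a year that is not a leap year.
        return date(year, month, day)
    except ValueError:
        return None


print(check_date("20240229"))  # 2024-02-29 (leap year)
print(check_date("20230229"))  # None (2023 is not a leap year)
```

When check_date returns None, the application could reprompt the user rather than accept the date.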
Predefined events
Table 2 shows the Spanish default event-handler messages for the error, help, and
nomatch events.
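Table 2. Spanish predefined events and event-handler messages
Event                        Default event-handler message
error.badfetch               Perdón, hay que salir debido a un error de proceso.
error.noauthorization        Perdón, hay que salir debido a un error de proceso.
error.semantic               Perdón, hay que salir debido a un error de proceso.
error.unsupported.element    Perdón, hay que salir debido a un error de proceso.
help                         Perdón, pero no hay ayuda disponible.
nomatch                      Perdón, no se entendió bien.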
Built-in commands
Table 3 shows the built-in VoiceXML browser commands for Spanish:
Table 3. Spanish built-in VoiceXML browser commands
Grammar name    Valid user utterances                        VoiceXML browser response
Quiet/Cancel    [por favor] silencio | silencio por favor    When barge-in is enabled, stops the current
                | [por favor] cancelar | cancelar por favor  spoken output and waits for further
                | [por favor] cancele | cancele por favor    instructions from the user.
Help            [por favor] ayuda | ayuda por favor          Plays the help event-handler message.
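To make the bracket and vertical-bar notation used in Table 3 concrete, the following Python sketch expands a pattern into the individual utterances it describes. The function name expand is hypothetical, and the sketch only handles the simple form used in Table 3 (optional bracketed phrases with top-level alternatives); it does not handle alternatives nested inside brackets.

```python
import itertools
import re


def expand(pattern: str) -> list[str]:
    """Expand '[optional] words | other words' into concrete utterances."""
    utterances = []
    for alternative in pattern.split("|"):
        # Split the alternative into "[...]" groups and the text between them.
        parts = re.findall(r"\[[^\]]*\]|[^\[\]]+", alternative)
        choices = []
        for part in parts:
            if part.startswith("["):
                # An optional phrase is either omitted or spoken.
                choices.append(["", part[1:-1].strip()])
            else:
                choices.append([part.strip()])
        for combo in itertools.product(*choices):
            utterances.append(" ".join(words for words in combo if words))
    return utterances


print(expand("[por favor] ayuda | ayuda por favor"))
# ['ayuda', 'por favor ayuda', 'ayuda por favor']
```

A list like this can serve as a checklist of phrases to try when testing the built-in commands.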
Specifying character encoding
The default character encoding for XML documents is UTF-8, which has 7-bit ASCII
as a proper subset. Character codes greater than 127 in VoiceXML documents (such
as the special characters è or ä) will therefore be interpreted as the first byte
of a multi-byte UTF-8 sequence. You can force the XML parser to use another
code page by specifying the desired encoding as the very first thing in the file:
<?xml version="1.0" encoding="iso-8859-1"?>
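The effect of a mismatched encoding can be seen with a short Python sketch. It encodes a Spanish phrase (chosen only for illustration) as ISO-8859-1 and then tries to read those bytes as UTF-8, which is what an XML parser would attempt if the file were saved as ISO-8859-1 without the declaration above.

```python
text = "sí, mañana"

latin1_bytes = text.encode("iso-8859-1")  # one byte per character
utf8_bytes = text.encode("utf-8")         # "í" and "ñ" take two bytes each

print(len(latin1_bytes), len(utf8_bytes))  # 10 12

# Reading ISO-8859-1 bytes as UTF-8 fails: 0xED ("í") looks like the start
# of a multi-byte UTF-8 sequence, but the next byte is not a continuation.
try:
    latin1_bytes.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False

print(decoded_as_utf8)  # False

# Decoding with the encoding the file was actually written in works.
print(latin1_bytes.decode("iso-8859-1") == text)  # True
```

Either save the document as UTF-8 or declare the encoding it was actually saved in; the declaration and the bytes on disk need to agree.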
Testing built-in field types
Table 4 provides examples of the types of input you might specify when testing a
Spanish voice application that uses the built-in field types.
Table 4. Sample input for Spanish built-in field types
Built-in field type    Sample input
boolean                sí, cierto, exactamente, vale, no, falso
currency               diez mil pesos
                       nueve euros cincuenta
                       once dólares con diez centavos
                       noventa y ocho céntimos de euro
date                   cinco de marzo
                       trece de abril de dos mil once
                       primero de mayo
                       noviembre del noventa y nueve
                       seis del seis de dos mil dos
                       ayer
                       hoy
                       mañana
digits                 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
number                 diez millones quinientos mil
                       menos dos coma cinco
                       más dos coma cinco
phone                  siete tres cinco cuatro nueve
                       cero cero cuatro nueve sesenta y siete treinta y seis trescientos extensión
                       once once extensión setenta y dos
time                   las dos
                       las siete y media
                       las nueve y cuarto de la mañana
                       medianoche
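When checking test results for the time field type, it can help to decode the hhmmx return value described in Table 1. The following Python sketch does this under stated assumptions: the name parse_time and the ParsedTime tuple are hypothetical, the sample strings are illustrative rather than captured browser output, and the sketch assumes that a value marked a or p carries an hour from 1 to 12.

```python
import re
from typing import NamedTuple, Optional


class ParsedTime(NamedTuple):
    hour: int               # hh as returned
    minute: int             # mm as returned
    marker: str             # "a", "p", "h", or "?"
    hour24: Optional[int]   # 24-hour value, or None when the marker is "?"


_TIME_RE = re.compile(r"([0-9]{2})([0-9]{2})([aph?])")


def parse_time(value: str) -> ParsedTime:
    """Decode an hhmmx return value from the time field type."""
    match = _TIME_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not in hhmmx format: {value!r}")
    hour, minute, marker = int(match.group(1)), int(match.group(2)), match.group(3)
    if hour > 23 or minute > 59:
        raise ValueError(f"hour or minute out of range: {value!r}")
    if marker == "h":
        hour24 = hour
    elif marker in ("a", "p"):
        # Assumption: a.m./p.m. values use hours 1 to 12.
        if not 1 <= hour <= 12:
            raise ValueError(f"12-hour value out of range: {value!r}")
        hour24 = hour % 12 + (12 if marker == "p" else 0)
    else:
        hour24 = None  # "?": a.m./p.m. unspecified or ambiguous
    return ParsedTime(hour, minute, marker, hour24)


print(parse_time("0915a").hour24)  # 9
print(parse_time("0730?").hour24)  # None
```

A None value for hour24 signals that the application may need a follow-up question, such as asking whether the caller meant morning or evening.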
SSML elements and attributes
Table 5 lists notes on SSML elements and attributes for Spanish. This table
applies to Castilian and Mexican Spanish.
Table 5. Limitations for Spanish SSML elements
Element
Implementation details
<emphasis>
This element is not supported.
<phoneme>
The IPA alphabet is not supported. The IBM alphabet is supported.
The IBM alphabet used in SSML refers to the phonology used by IBM TTS. The following
example shows the US English phonetic pronunciation of “tomato” using the IBM TTS phonetic
alphabet:
<phoneme alphabet="ibm" ph=".0tx.1me.0Fo"> tomato </phoneme>
For more information on IBM symbolic phonetic representations (SPRs), see the IBM Text-To-Speech SSML Programming Guide.
<prosody>
The pitch, range and rate attributes are not supported.
<say-as>
The interpret-as attribute is only supported with the value "vxml:digits". The digits and letters
attributes are also supported.