I'm starting to become more comfortable with Speech Server and to have a better understanding of how it works. The part of Speech Server I struggled with the most was the conceptual model. Much of a Speech Server application actually runs as a client-side JavaScript object, so much of the application's logic is written in JavaScript. Then there's the "semantic item," which at this point I envision as a sort of holding area for variable data, and as a great way for the server side of an application (database access, etc.) to communicate with the client side and Speech Server.
Speech Server requires a grammar for recognizing speech. A grammar is the template that speech is matched against to be recognized, and the fact that such a template exists makes recognition more accurate by narrowing down the possibilities. For example, if the grammar specifies that a word must be donut, muffin, or apple, then Speech Server knows that the user didn't say "donate" -- more likely that he or she said "donut".
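To make that concrete, here's a minimal hand-written sketch of such a grammar in the W3C SRGS XML format that Speech Server grammars use (the rule name is mine, not from a real application):

<grammar version="1.0" xml:lang="en-US" root="snack"
         xmlns="http://www.w3.org/2001/06/grammar">
  <!-- The recognizer will only accept one of these three words. -->
  <rule id="snack">
    <one-of>
      <item>donut</item>
      <item>muffin</item>
      <item>apple</item>
    </one-of>
  </rule>
</grammar>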
Here are a few tricks that I've discovered:
1) If you want the speech engine to speak a number digit by digit (1910 as "one nine one zero" instead of "one thousand nine hundred ten"), use this JavaScript: sMyNumber.split("").join(",");
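As a sketch (the function name and wrapper are mine, just for illustration), you might package the trick like this:

function SpeakDigits(sMyNumber)
{
    // "1910" becomes "1,9,1,0"; the commas cue the TTS engine
    // to read the digits one at a time.
    return sMyNumber.split("").join(",");
}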
2) To bypass a prompt database (prerecorded speech) and force use of the speech synthesizer, wrap the text in a <peml:tts> tag. For example, if the prompt is: "You have selected <peml:tts>Cantelever Street</peml:tts>. Is that correct?" then only the street name is sent to the synthesizer; the rest of the prompt can still come from the prompt database.
3) To react differently the second (or third) time through a prompt, you can use a JavaScript function like the following:
var miRepeat = 0;   // page-level counter, shared across calls to the function

function GetTimesThrough()
{
    // Return how many times we've already been through, then bump the counter.
    var iReturnValue = miRepeat;
    miRepeat++;
    return iReturnValue;
}
The function is then referenced from a prompt function, as in the sketch below. Note that you may have to do something different if any of the prompts have AutoPostBack checked.
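For example, a prompt function built on top of it might look like this (the function name and wording are just illustrative):

function AskSnack_prompt()
{
    // Vary the wording depending on how many times we've already asked.
    switch (GetTimesThrough())
    {
        case 0:
            return "What would you like?";
        case 1:
            return "Please say donut, muffin, or apple.";
        default:
            return "I still didn't catch that. You can say donut, muffin, or apple.";
    }
}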
4) The speech ListSelector control seems like a great way to let users select something from a list that is compiled at runtime; the ListSelector can be bound to an array. I have not yet tested this in production, but in development it seems to work just fine.
5) Making outbound calls requires that a message queue be set up, and the TAS (Telephony Application Services) then has to be told to listen to that queue. Each message should contain the URL of the speech application to be executed; it's up to the speech application to get the target telephone number and any other necessary info from the query string of that URL. A sketch of queuing such a message appears below.
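Here's a rough sketch of queuing such a message from JScript via the MSMQ COM objects. The queue path and URL format are assumptions on my part, so check what your TAS installation actually expects:

// Hypothetical queue path and URL; adjust to match your TAS configuration.
var qinfo = new ActiveXObject("MSMQ.MSMQQueueInfo");
qinfo.PathName = ".\\private$\\outboundcalls";
var queue = qinfo.Open(2, 0);   // 2 = MQ_SEND_ACCESS, 0 = MQ_DENY_NONE
var msg = new ActiveXObject("MSMQ.MSMQMessage");
msg.Label = "Outbound call request";
msg.Body = "http://myserver/speechapp/dial.aspx?phone=5551234567";
msg.Send(queue);
queue.Close();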
6) I'm working with an Intel board, and in order to get outbound calling to work I had to configure the CAM to dedicate two of the four lines to outbound calling.
2 Comments:
I have an outbound calling application which works fine for direct phone numbers, but some of the customers have provided their number with an extension. Do you know how I can dial extensions using Microsoft Speech Server? I have a SmexMessage control which gets activated once the call is connected and if an extension number is available in the database. I create a CSTA message in the OnClientBeforeSend function. In my log files (the ones with the .etl extension), it looks like there is an infinite loop going on. Any ideas?
I use the Page_Load event to set the SmexMessage control's message property. The Call ID and Device ID come from RunSpeech (I store these in session variables immediately after the call is made). The only issue is that I found it necessary to use "Thread.Sleep(2000)" in the postback to give the phone system time to interpret the DTMF, and I had to experiment to find a good sleep period.
See GotSpeech.NET for more info...
http://gotspeech.net/forums/thread/375.aspx