Introduction to SALT
By: Hitesh Seth
Nov. 25, 2002 12:00 AM
Speech Application Language Tags (SALT) is a set of XML-based tags that can be added to existing Web-based applications, enhancing the user interface through interactive speech recognition. In addition, SALT can be used to extend Web-based applications to the telephony world, opening them up to a huge user community: users of ordinary touch-tone telephones.
The SALT Forum, an organization founded by Microsoft, Cisco, SpeechWorks, Philips, Comverse, and Intel, has spearheaded development of the SALT specification, now in its 1.0 release.
Multimodality: Beyond Standalone Web and Speech Applications
Multimodality means that we can use more than one mode of user interface with an application, much as we do in normal human communication. For instance, consider an application that gives driving directions. While it's typically easier to speak the start and destination addresses (or, even better, shortcuts like "my home," "my office," or "my doctor's office," based on a previously established profile), the resulting directions are typically best viewed as a map with turn-by-turn instructions, similar to what we're used to seeing on MapQuest.
In essence, a multimodal application running on a desktop device would be very similar to MapQuest but would also let the user talk and listen to the system for parts of the application's input/output - for example, the starting and destination addresses. That's multimodal. Imagine the same application using the same interface on a wirelessly connected PDA. Now we're talking about a true mobile, multimodal application. If we let our imaginations run a little wilder, we can easily extend the same application to the dashboard of our car or any other device we can imagine working with. That's really the vision, which, given the current state of technology, isn't far away. Another modality that could be added to the example application is a pointing device that zooms the map to focus on a particular location.
So how does SALT fit in with all this? SALT has been built on the technology required to deploy SALT-based applications in a telephony and/or multimodal context.
SALT Application Model
Figure 1 presents an abstract representation of the application architecture for deploying and using SALT-based applications. As expected, it's very similar to that of a Web application, with two major differences. First, the application is also capable of delivering dynamic speech interactions (provided the browser can handle SALT, e.g., through an add-in or natively). Second, a stack is present that broadly represents the integration of speech recognition/synthesis and telephony platforms. A note of caution: this diagram is a conceptual representation. Where exactly the SALT browser/interpreter and speech recognition/synthesis components fit in depends on the capabilities of the end-user device/browser, and the actual implementation of the SALT stack may vary from vendor to vendor.
The Automatic Speech Recognition (ASR) component recognizes spoken user utterances based on speech grammars, whereas the Text-to-Speech (TTS) synthesis component dynamically converts text messages into voice output. When SALT applications are used in the telephony world, the telephony integration component connects the speech platform with the world of telephones, the Public Switched Telephone Network (PSTN). This is typically achieved by integrating telephony cards with the analog/digital telephone lines of your telephony provider (your phone company).
When SALT applications are used to enhance Web-based applications by adding multimodality, a typical Web application delivery framework (based on TCP/IP, HTTP, HTML, JavaScript, etc.) delivers the Web application, and the speech/telephony platform handles the "speech/voice" aspect of the interaction, depending on the nature of the connection and the location of the speech recognition/synthesis components.
Both interactions can happen together seamlessly, as part of the same user session, at the user's choice.
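To make the role of the ASR component described above a little more concrete, here is a toy Python sketch of grammar-constrained matching. It works on already-transcribed text, whereas a real recognizer works on audio and typically returns confidence scores. The `GRAMMAR` table, the `recognize` function, and the returned values are hypothetical names invented for this illustration; they are not part of SALT or of any vendor's speech platform.

```python
# Toy illustration only: a real ASR engine matches audio, not text.
# GRAMMAR and recognize() are hypothetical names invented for this sketch.

# Each phrase the grammar allows maps to the value the application wants.
GRAMMAR = {
    "my home": "HOME",
    "my office": "OFFICE",
    "my doctor's office": "DOCTOR",
}


def recognize(utterance):
    """Return the value for an in-grammar phrase, or None for a no-match."""
    # Normalize case and whitespace so trivially different inputs still match.
    normalized = " ".join(utterance.lower().split())
    return GRAMMAR.get(normalized)


print(recognize("My  Office"))   # a phrase the grammar allows
print(recognize("the airport"))  # a phrase outside the grammar
```

The point of the sketch is only that a grammar narrows what the recognizer will accept: anything outside the listed phrases is reported as a no-match, which the application can then handle by prompting the user again.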
SALT & VoiceXML
Another difference between SALT and VoiceXML is the overall approach used to develop applications. Whereas VoiceXML is largely declarative, relying on an extensive set of tags, SALT is procedural and script oriented, with a very small set of core tags. It's also important to understand that SALT uses key components of the standardization effort carried out by the W3C Voice Browser Activity, including the XML-based Speech Recognition Grammar Specification (SRGS) and the XML-based Speech Synthesis Markup Language (SSML). Both of these specifications are used by the VoiceXML 2.0 specification as well.
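For readers who have not seen these two W3C formats, the sketch below holds a small grammar in the XML form of the Speech Recognition Grammar Specification and a short Speech Synthesis Markup Language prompt, then reads both with Python's standard library. The phrases are borrowed from the driving-directions example earlier in the article. The rule name `place`, the prompt wording, the helper function names, and the choice of Python are assumptions made for illustration, not taken from the SALT specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C grammar specification.
SRGS_NS = "http://www.w3.org/2001/06/grammar"

# A grammar that accepts exactly one of three spoken shortcuts.
# The rule id "place" is a hypothetical name chosen for this sketch.
GRAMMAR_XML = """\
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" root="place">
  <rule id="place">
    <one-of>
      <item>my home</item>
      <item>my office</item>
      <item>my doctor's office</item>
    </one-of>
  </rule>
</grammar>
"""

# A prompt for the text-to-speech engine; the wording is an assumption.
PROMPT_XML = """\
<speak xmlns="http://www.w3.org/2001/10/synthesis" version="1.0"
       xml:lang="en-US">
  Where would you like to <emphasis>start</emphasis>?
</speak>
"""


def grammar_phrases(xml_text):
    """List the phrases offered by the <item> elements of a grammar."""
    root = ET.fromstring(xml_text)
    return [item.text.strip() for item in root.iter("{%s}item" % SRGS_NS)]


def prompt_text(xml_text):
    """Return the plain text of a prompt with the markup stripped."""
    root = ET.fromstring(xml_text)
    # Join all text nodes, then collapse runs of whitespace.
    return " ".join("".join(root.itertext()).split())


print(grammar_phrases(GRAMMAR_XML))
print(prompt_text(PROMPT_XML))
```

Because both formats are plain XML, the same grammar or prompt document can in principle be shared between a SALT page and a VoiceXML application, which is the practical benefit of the two specifications building on the same W3C work.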
Hello SALT
As you can see, SALT tags (
Going further with our exploration of SALT, let's look at how SALT applications provide speech recognition capability. The code shown in Listing 1 can be used as the basic template for an interactive order information system.
To understand the various components of this multimodal application, let's look at a snapshot of the sequence of actions performed when this application is launched within a SALT-compatible browser.
Elements of SALT
Microsoft .NET Speech SDK
It's important to understand that a SALT-based application can be delivered using a non-ASP.NET Web application framework (e.g., Perl or JavaServer Pages). What the .NET Speech SDK provides is ease of development in adding speech to your existing Web applications or creating new applications.
Conclusion