Telecommunications Meets XML

Traditional telecommunications reside in the realm of time division multiplexed (TDM) circuit switches. Million-dollar pieces of equipment are king, and their thrones demand room-sized real estate. Beyond their myriad physical requirements, these switches present an entirely different set of hurdles when telecommunications providers try to deploy services on top of them. You see, stubbornness seems to run in the family for TDM circuit switches, and writing software to utilize their services is a daunting task. It should come as no surprise, then, that programmers for these (somewhat) esoteric beasts are quite rare. Now, I've never written software for a TDM telephony switch before, but I've met people who have and listened to their stories. More important, I've looked into their eyes and seen their pain - no one wants to do what they did. As always with this type of situation, the price increases, reusability decreases, and flexibility seems to disappear.

Enter Voice-over-IP (VoIP), the art of chopping voice into small segments and sending them as IP packets across standard data networks (see Figure 1).

Convergence, they call it. Why have two networks, telephone and data, when you can sneak by with just one? An entirely new subset of telecommunications has begun to emerge around this concept, now often referred to as next-generation telecommunications. Protocols like H.323, SIP, and MGCP have emerged to handle the growing set of technological problems VoIP introduces. There's a difference, however, in how the problems are being solved today versus how our predecessors approached them. VoIP proponents have the luxury of the Internet perspective. Open standards are hot. Protocols developed "the Internet way" are being adopted and put into use throughout the new telecommunications industry.

Putting voice traffic over data networks becomes extremely interesting when we reapproach the subject that circuit-based networks have so many problems with: services. There's always a lot of buzz around new technologies with the potential to reshape the world, and VoIP is certainly no exception. One common statement goes something like this: "When I pick up a telephone to make a call, I couldn't care less about which network it travels over. What I really care about are the services offered to me during that call." Indeed, it's a valid statement and often comes from the mouth of some old-school telecommunications executive in an attempt to discredit VoIP providers. Ironically, it's this reality the next-generation VoIP provider must embrace. The user's experience is impacted by the services, not the network - we accept this.

Enhanced services take advantage of the underlying data network to more effectively impact the user. You might be thinking to yourself that traditional telecommunications companies have to deploy new services, too. You're right, but remember, those circuit switches are a major pain in the butt, so deploying new services effectively, en masse, and with reduced cost is almost impossible. On top of that, in the circuit-switched world there's no data network to leverage in producing services.

VoIP providers use data networks, which give us all kinds of flexibility (see Figure 2). We can do things that traditional telecommunications companies can't. We can provide services to users over our data network from machines placed anywhere on that network. We can cluster. We can scale. We can use application servers to host any next-generation enhanced service. Part of the solution already exists in the protocols I described earlier, like SIP and MGCP. So, the landscape is painted. Much to the chagrin of old-school telecom, VoIP has the opportunity and means to do exactly what they say we must do to survive: provide enhanced services.

Of course, now that you know we can provide enhanced services, how exactly do you describe the details of one? There are all sorts of issues, ranging from call flow to voice interaction. It could be a true mess and, done improperly, suffer from the same problems that those nasty TDM switches present. Once again, however, next-generation telecommunications benefit from the Internet age - enter XML.

Everything that goes into a service for telecommunications users can be elegantly described using XML. A couple of companies have seen this and are positioning themselves to capitalize on the intersection of the need and the means of next-generation telecommunications providers. Open standards such as XTML, VoiceXML, and CallXML have been defined and are being used in VoIP networks.

The rest of this article focuses on these new schemas, how they can be used, and their drawbacks. At the end, for those of you interested in getting your feet wet in the telecommunications industry, I'll provide a series of resources for more information.

VoiceXML
As defined by the VoiceXML Forum earlier this year, VoiceXML is an attempt to solve one of the most common problems faced when telecommunications services are created: voice interaction and response. Also known as interactive voice response, or IVR, this technology lets you either speak the word "two" or push the 2 button on your phone when navigating a telephony application. VoiceXML's purpose can also be seen from another perspective: it's truly meant to be a way to voice-enable Web content, in which IVR plays a crucial role.

VoiceXML has one significant drawback: it makes no solid attempt to define call flow. Call flow is just that - the way the interaction with the user on the other end of the telephone flows as the call progresses. VoiceXML simply defines the prompts, menus, and options that a user is presented with. The order in which they're presented can be toothpicked together but, in general, is not VoiceXML's problem.

Listing 1 shows the age-old tradition in full effect. "Hello, World" in VoiceXML might not appear very exciting, but a few elements are worth noting.

The <vxml> element sits at the top level of the document. It has a version attribute to let VoiceXML parsers know what to expect. The <vxml> tag is primarily used as a container for dialogs, which can be either forms or menus. Next, you see, as expected, the <form> element. In VoiceXML, forms present information to the user and gather input. In our example you would hear "Hello, XML-Journal reader." Finally, the <block> element houses what we want to say to the user. Formally, <block> elements are containers of noninteractive executable code. In our example, the block simply holds the text we want spoken to the user.
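
Since the listing is not reproduced alongside this text, the following is only a minimal sketch of the structure just described, not the article's Listing 1. It embeds a small VoiceXML-style document in Python and walks it with the standard library's ElementTree. The names HELLO_VXML and spoken_text are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# A minimal VoiceXML-style document: <vxml> wraps one <form>,
# and the form's <block> holds the text to be spoken.
HELLO_VXML = """<?xml version="1.0"?>
<vxml version="1.0">
  <form>
    <block>Hello, XML-Journal reader.</block>
  </form>
</vxml>"""


def spoken_text(document):
    """Return the text of every <block> directly inside a <form>, in document order."""
    root = ET.fromstring(document)
    if root.tag != "vxml":
        raise ValueError("expected <vxml> as the top-level element")
    return [block.text.strip() for block in root.findall("./form/block") if block.text]


if __name__ == "__main__":
    print(spoken_text(HELLO_VXML))
```

A real VoiceXML gateway would hand the block text to a speech synthesizer; the sketch only shows how the three elements nest.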

Let's move on to a more interesting example. Suppose you have a friend who wants to let people call a phone number to hear jokes. The user will be presented with a menu and, depending on his or her selection, hear a random joke from that category. Listing 2 shows what the VoiceXML representation of this service is.

First, we start off with the familiar <vxml> tag. The next tag is something new. The <menu> element defines a dialog with categories that users can choose from. In this case we'll let users pick knock-knock, Aggie, or bad jokes. The next element is <prompt>, which holds data to be played to the user requesting input. Finally, the three possible choices are presented using the <choice> element, which is of particular interest in this example. Our joke hotline allows users to navigate and make choices by pressing keys on their telephones, producing DTMF tones. The <choice> element has two attributes in this example: dtmf and next. The dtmf attribute identifies the digit to expect for that choice. The next attribute signifies where to go when that input is received. In our example, each next value names a machine, then the jokes location on it, and then a program to execute.
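
To make the dtmf and next attributes concrete, here is a hedged sketch of such a menu, again embedded in Python and read with ElementTree. It is not the article's Listing 2: the host name, paths, and prompt wording are placeholders, and dtmf_routes and route_for are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Hypothetical joke-hotline menu. The host name and paths in the
# next attributes are placeholders, not the article's actual URLs.
JOKE_MENU_VXML = """<?xml version="1.0"?>
<vxml version="1.0">
  <menu>
    <prompt>
      Press 1 for knock-knock jokes, 2 for Aggie jokes, or 3 for bad jokes.
    </prompt>
    <choice dtmf="1" next="http://www.example.com/jokes/knockknock.vxml">Knock-knock</choice>
    <choice dtmf="2" next="http://www.example.com/jokes/aggie.vxml">Aggie</choice>
    <choice dtmf="3" next="http://www.example.com/jokes/bad.vxml">Bad</choice>
  </menu>
</vxml>"""


def dtmf_routes(document):
    """Map each DTMF digit in the menu to the URL named by its next attribute."""
    root = ET.fromstring(document)
    return {choice.get("dtmf"): choice.get("next") for choice in root.iter("choice")}


def route_for(document, digit):
    """Return the next URL for a pressed digit, or None if no choice matches."""
    return dtmf_routes(document).get(digit)


if __name__ == "__main__":
    print(route_for(JOKE_MENU_VXML, "2"))
```

The lookup table is all the menu really encodes: one pressed digit, one destination. Anything resembling call flow has to be assembled from these jumps.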

These two VoiceXML examples are simplified to demonstrate the basic structure of an IVR system. It's possible to define rudimentary flow to your IVR application using VoiceXML, but VoiceXML is not particularly suited to do so. There are many more elements that can go into a VoiceXML document, and as this number grows, the application becomes extremely complex and hard to visualize.

A more generic approach is necessary for defining call flow. CallXML from Voxeo Corporation defines a basic block/action/event interface model to represent the different paths a call might take.

CallXML
CallXML attempts to pick up where VoiceXML left off. CallXML introduces call flow and capitalizes on the advances made by VoiceXML. Voxeo has built this schema and is promoting it through its developer Web site, community.voxeo.com.

Revisiting the simple "Hello, World" example, Listing 3 shows how this might be implemented using CallXML.

The uppermost element in Listing 3 is <callxml>. Like VoiceXML, this is a container for the rest of the elements to come. If you're familiar with HTML, the <callxml> tag can be considered the same as the <html> tag. It serves only a semantic role. Next, a <block> defines a group of elements to be bound together. To facilitate call flow, <block> elements have a label attribute, which can be used from other CallXML elements to refer to particular blocks of CallXML.

Next, we encounter an element that illustrates an important difference between CallXML and VoiceXML. CallXML is event driven, meaning certain blocks of CallXML are executed based on the occurrence of different events. Here, we're telling the CallXML gateway to pick up the line and answer an incoming call. But what happens if an error occurs when the call is answered? The <answer/> element will generate an event, onError, that can be handled elsewhere in the CallXML. Finally, the <playAudio> element tells the gateway to play an audio file, found somewhere on the network, to the user.

I'd like to modify our old joke hotline so I can spread the joy to friends who are feeling down. I'd want to call the joke line, enter my friends' phone numbers, and have the joke hotline call them to tell a joke. Listing 4 illustrates how this can be accomplished using CallXML.

The CallXML begins as you might expect with <callxml> and <block> elements. Then we see an <answer> element telling the gateway to pick up the line. Because I want to play the joke for my friend, I first need to prompt the caller for my friend's phone number, which is done with a <playAudio> tag. Next, <getDigits> is used to retrieve input from the user. This element is not one that we've encountered in the past. The two attributes used within the element are var and maxDigits. The var attribute specifies the name of the variable to store the input in. This can be referenced later by using the syntax "$varname;". Next, maxDigits defines the maximum number of input digits to retrieve from the user. When the maximum number is hit, an event is generated that can be handled to continue the call flow. This is exactly what's happening with the <onMaxDigits> element. Inside this element we can make an outbound call to the phone number entered with <call>, then, when my friend picks up the line, play the joke of the day.
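
The interplay of var, maxDigits, and the "$varname;" syntax can be sketched in a few lines of Python. This is a simulation of the behavior described above, not Voxeo's implementation. The function names, the variable name friendNumber, and the sample digits are all assumptions made for illustration.

```python
import re

# Matches CallXML-style variable references of the form $varname;
VAR_REFERENCE = re.compile(r"\$(\w+);")


def expand(template, variables):
    """Replace each $name; in template with variables[name]; unknown names are left as-is."""
    return VAR_REFERENCE.sub(lambda m: variables.get(m.group(1), m.group(0)), template)


def get_digits(keypresses, var, max_digits, variables):
    """Collect key presses into variables[var] and return the event raised, if any.

    Mimics the behavior described for <getDigits>: once max_digits digits
    have been gathered, an onMaxDigits event is generated.
    """
    collected = ""
    for key in keypresses:
        if key.isdigit():
            collected += key
        if len(collected) == max_digits:
            variables[var] = collected
            return "onMaxDigits"
    variables[var] = collected
    return None


if __name__ == "__main__":
    variables = {}
    event = get_digits("5125550100", "friendNumber", 10, variables)
    if event == "onMaxDigits":
        # An outbound <call> would dial the expanded value.
        print(expand("$friendNumber;", variables))
```

The point of the event is that the document does not poll for input: the block inside <onMaxDigits> runs only once the gateway reports that enough digits have arrived.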

This example only touches on the features provided by the CallXML specification. CallXML gives us total call control capability, leverages the power of the Web, and is extremely easy to use.

The eXtensible Telephony Markup Language - XTML
XTML is perhaps the most feature-rich, end-to-end complete, and practical of the schemas presented. XTML, as defined by the folks at Pactolus Communications Software, takes a modular approach to the problem. Call flow and IVR are two of its features. Common extensions provide the flexibility to do extremely useful things, such as drop to Java or C++ and make database calls. These things must happen during the execution of a telephony application. For example, after a user enters his or her PIN for a prepaid phone service, it must be authenticated against a database. XTML allows for this interaction to be embedded directly into the call flow.

Next-generation telecommunications is a dynamic industry. New protocols are always being proposed, different hardware is installed, and new ideas are discussed. The creators of XTML recognized this dynamic and built the schema around a core set of functionality that can be extended as the requirements of the problem change. The previous schemas, VoiceXML and CallXML, lack this flexibility. That extensibility does increase the complexity of the language slightly, but the added complexity is quickly overcome with use.

Let's define a simple "Hello, World" telephony application using XTML. Before we get to the example, let me describe the basic structure of an XTML document. Like CallXML, XTML is based around an event model. Events that are interesting to a particular XTML document are identified, then bits of XTML within that document called functions are created to handle those events. Functions can accept parameters and return values as you might expect, and follow a sequence of actions. These actions are defined and results are mapped to further actions. It's really a very elegant and powerful way to describe a call flow. Now back to the example. Listing 5 shows what the XTML for our "Hello, World" looks like.

Whoa! I can hear you screaming "Slow down already!" XTML's version of "Hello, World" is much more complex and quite a bit more obscure. As I said, this is due to the extreme flexibility and extensibility of XTML. Let's take it step by step and I'm sure it will become clearer to you.

The <xtml> tag declares the XTML document's root element. Remember when I said that XTML is made up of core functionality with extensions? Well, the <xtml> tag is where we identify what schema defines this core functionality. This is accomplished through the xmlns attribute. Next, we declare the events we're interested in and identify the functions responsible for handling them. The <event> element specifies the event name, represented by the name attribute and the handler associated with it. It's important to note the naming convention for the event: Vendor.Event.Version. This is followed throughout XTML.
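
As a rough illustration of the Vendor.Event.Version convention and the event-to-handler pairing, consider the Python sketch below. The event name ExampleVendor.IncomingCall.1 and the handler name sayHello are invented for the example and do not come from the XTML specification. The sketch also assumes that no part of the name contains a dot, which the convention as stated here does not settle.

```python
from typing import NamedTuple


class EventName(NamedTuple):
    vendor: str
    event: str
    version: str


def parse_event_name(name):
    """Split a Vendor.Event.Version string into its three parts.

    Assumes none of the parts contains a dot (an assumption of this sketch).
    """
    parts = name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected Vendor.Event.Version, got %r" % name)
    return EventName(*parts)


# Hypothetical event-to-handler table, standing in for <event> declarations.
HANDLERS = {"ExampleVendor.IncomingCall.1": "sayHello"}


def handler_for(name):
    """Return the function name registered for an event, or None if undeclared."""
    parse_event_name(name)  # reject malformed names early
    return HANDLERS.get(name)


if __name__ == "__main__":
    print(parse_event_name("ExampleVendor.IncomingCall.1"))
    print(handler_for("ExampleVendor.IncomingCall.1"))
```

Carrying the vendor and version in the name is what lets third-party events sit alongside the core set without colliding.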

Let me stop here for a moment to point out the way XTML can be extended. Anyone can implement and deploy new events to the application server and then reference these new events in their XTML. This model occurs within different elements of XTML. For example, there's a standard set of actions, and developing new actions and referencing them from XTML can expand this set.

Now, back to the example. After declaring the events that are interesting to us, we define a set of reusable functions. The <functions> element declares the start of this block of XTML and then each function is defined using a <function> element. In our example there are three attributes of interest to a function. First, we give it a label using the name attribute, then we identify the first action in this function's sequence of execution using the start attribute. Later, we'll define the actions that make up this function, and one of them must have an ID that matches the function's start attribute. A return type is then identified with the returns attribute. The function definition continues by declaring parameters and local variables. This is very similar to the semantics of functions in regular programming languages.

Finally, the <actions> element signifies the beginning of the action declarations. Each action has an ID and plug-in attribute. Of particular interest here is the plug-in. As I said before, XTML is flexible, and this is another area where the functionality can be extended. Third parties can develop new plug-ins that can be referenced from XTML documents. The first action uses an <announcements> element, which is contained in a separate schema, notably one that defines call extensions to XTML, to play the audio for our "Hello, World" example. Each action has a series of results that specify what actions should follow. The final two actions will terminate the call via a hang-up, then end the session with the application server.
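
The execution model described here, a function that starts at one action and follows each action's result to the next, can be sketched as a small table-driven loop. This is a conceptual model in Python, not XTML itself and not how Pactolus's application server works internally. Every action ID, plug-in name, and result name below is hypothetical.

```python
# Each action has an id, a plug-in name, and a table mapping the plug-in's
# result to the id of the next action (a missing entry ends the function).
# All ids, plug-in names, and result names here are hypothetical.
ACTIONS = {
    "play": {"plugin": "announce", "results": {"done": "hangup", "error": "hangup"}},
    "hangup": {"plugin": "hangup", "results": {"done": "end"}},
    "end": {"plugin": "end_session", "results": {}},
}


# Stand-in plug-ins: each appends to the call log and returns a result name.
def announce(log):
    log.append("play hello-world audio")
    return "done"


def hangup(log):
    log.append("hang up")
    return "done"


def end_session(log):
    log.append("end session")
    return "done"


PLUGINS = {"announce": announce, "hangup": hangup, "end_session": end_session}


def run_function(start, actions, plugins):
    """Execute actions from the start id, following each result to the next action."""
    log = []
    current = start
    while current is not None:
        action = actions[current]
        result = plugins[action["plugin"]](log)
        current = action["results"].get(result)
    return log


if __name__ == "__main__":
    for step in run_function("play", ACTIONS, PLUGINS):
        print(step)
```

Because the flow lives in the results table rather than in the plug-ins, adding a new plug-in or rerouting an error result changes data, not control code, which is the flexibility the schema is after.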

Visual Tools
The most exciting advancement that can be derived from using XML to define enhanced services is how the definition is accomplished. The inherent nature of XML makes it suitable for visual modeling. Tools already exist for generic XML authoring, but what I'm talking about here are tools to visualize call flow, define interaction, and integrate logic. With tools such as Pactolus' Service Creation Environment (see Figure 3) or Voxeo's Designer, users can easily define new telephony applications and deploy them to the network. Hundreds of lines of code and long hours of debugging are no longer necessary. Internet technologies have enabled telephony application developers to move at Internet speeds.

Other Standards and Initiatives
Other telephony XML schemas and DTDs have been used to define certain aspects of a telephony application. The Call Processing Language, or CPL, can be used to define rules on how to handle telephone calls. WTE is a Microsoft initiative to define a series of extensions to HTML for voice-enabling Web content.

A large part of the problem is that within several months multiple ways of defining enhanced services have begun to pop up. XTML, CallXML, and VoiceXML each have their own high and low points. VoiceXML is great for defining IVR interaction but lacks real call control characteristics - you can't make an outbound call, for example. CallXML is very straightforward and easy to use, but it lacks real flexibility and extensibility. XTML is arguably the most powerful of all the approaches, but it's a bit more complex and hard to grasp. Unfortunately, no standards bodies are overseeing the development of these efforts.

I'm tempted to draw a comparison with the Java Application Server and J2EE situation. Sun found it necessary to have a common way to write enterprise applications but knew wide-scale acceptance would come from implementation options. They defined the interfaces and left it to the application server vendors to provide the service. The same approach might work well in the enhanced-services arena. Perhaps a standards body made up of a cross-section of industry leaders could define a comprehensive XML schema for enhanced-service definition. It would then be left to the Voxeos and Pactoluses of the world to implement the standard and provide an application server.

Conclusion and Acknowledgments
Communications convergence is going to happen - it's inevitable. Economics will guarantee that the change takes place and Voice-over-IP will be one catalyst in that transition. Whether VoIP providers can ever reach the status of MCI or AT&T is another question. One driving factor for VoIP carriers to do so will be the types of differentiating services they provide to the user who makes a phone call on their network. Tools like those provided by Pactolus, powered by XML, will play a major role in how well VoIP carriers can impact the user and how quickly and reliably they can do so.

If you'd like to discover what else XML brings to the telecommunications industry, I refer you to the specifications for VoiceXML (www.voicexml.org), CallXML (community.voxeo.com), and XTML (www.pactolus.com). Moreover, I urge you to e-mail me at lmarascio@pointone.com with any questions or feedback you might have. Finally, I'd like to thank Jasson Casey, Dave Horton, and Robby Slaughter for reading and providing feedback for improvements on the drafts of this article.
