XML Middleware: XML and Messaging

XML is used in applications to pass data between heterogeneous systems, to provide meta-information about content, and to maintain structure in data. Simply put, if HTML is the language for displaying information, XML is the language that can speak business terms or jargon.

In this article I'll discuss how XML enables applications to organize and process information better in the enterprise. For the examples I refer to Java technology for mail, servlets, and messaging.

Messaging
Messaging is an integral part of most Internet-related applications because it makes data handling and transaction management easier. It's used in the enterprise to exchange data between heterogeneous applications.

For example, data comes into the enterprise through the Internet and then is sent to the back-end ERP application/system. The messages come via HTTP and have to be processed into specific formats for the appropriate back-end system.

Data is either queued and sent to one application listening at the other end of the queue, or broadcast to multiple applications wanting the same data. The messages from the queue are read by applications that process the message and translate its content for the ERP application.

Enterprise-wide applications have a messaging layer that sits between the Web server and the application servers and facilitates the delivery of appropriate messages to the appropriate servers. When a message is delivered in a queue, the queue acts as a buffer and holds the data until the server services the request. In addition, the messaging layer can have transaction management built in that would allow for the retention of messages in case the application server crashes, thus preventing the loss of critical data for the enterprise. It also prevents the handling of duplicate messages.
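
The two delivery styles described above can be sketched with plain in-memory queues. This is only an illustration under stated assumptions: the class and method names (DeliveryModes, drain, broadcast) are hypothetical, and a real messaging layer would add persistence and transaction management on top.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DeliveryModes {
    /** Point-to-point: the queue buffers messages until the single consumer reads them. */
    static List<String> drain(BlockingQueue<String> queue) {
        List<String> received = new ArrayList<>();
        queue.drainTo(received); // messages come out in arrival order
        return received;
    }

    /** Broadcast: every interested application gets its own copy of the message. */
    static void broadcast(String message, List<BlockingQueue<String>> subscribers) {
        for (BlockingQueue<String> inbox : subscribers) {
            inbox.add(message);
        }
    }

    public static void main(String[] args) {
        // The queue holds the data until the server is ready to service it.
        BlockingQueue<String> orders = new LinkedBlockingQueue<>();
        orders.add("order-1");
        orders.add("order-2");
        System.out.println(drain(orders));

        // Two hypothetical applications that want the same data.
        BlockingQueue<String> billing = new LinkedBlockingQueue<>();
        BlockingQueue<String> shipping = new LinkedBlockingQueue<>();
        broadcast("price-update", Arrays.asList(billing, shipping));
        System.out.println(billing.peek() + " / " + shipping.peek());
    }
}
```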

What Are These Messages and What's Their Format?
The following are messages that come into the enterprise in various structures or formats:

Flat Messages
In this format the data coming into the enterprise isn't structured. There's no relation between the various elements that form the content of this type of message.

For example, data submitted through a customer information form contains fields such as name, address, and phone number, but there's no hierarchy in this structure. Each element stands on its own and doesn't depend on any other for its existence.

Hierarchical Messages
When it comes to data in the hierarchical format, there's a distinct relationship between the various elements that make up the data.

For example, an EDI message is made up of segment groups containing segments, which in turn contain data elements and finally data. Each segment has to be within a segment group, and each data element has to be within a segment. These could be mandatory or optional within their ancestor, but their existence necessarily depends on the existence of the ancestor.

Why XML? What Role Does It Play?
Platform Independence
XML is platform independent. It can be used as a medium to send data between heterogeneous applications without each application having to know the proprietary format of the other. If two word processing systems need to transfer content between each other, they can do so without knowing each other's format, since all they need to do is structure the data as XML and send it across. Data could be qualified as, for example, a paragraph or a quote:

<PARA> This is a para</PARA>
<QUOTE> The early bird gets the worm</QUOTE>

or, more meaningfully, based on the application (here, one for a doctor):

<NAME>Abc</NAME>
<AILMENT>def</AILMENT>
<PRESCRIPTION> List of medicines</PRESCRIPTION>

XML for Hierarchical Messages
Since XML is a structured language, it's a good fit for hierarchical messages: data maps easily to elements, and the XML document, being a tree, maintains the hierarchy. With an XML parser it's easier to extract data from complex messages such as EDI because the parser does the job of isolating the data. It's also easy to find out how many times a particular element occurs and which node in the tree is its parent. For example, an EDI structure could be represented as:

<SEGMENTGROUP id="1">
  <SEGMENT mandatory="true">
    <DATAELEMENT>abc</DATAELEMENT>
  </SEGMENT>
  <SEGMENT mandatory="false"></SEGMENT>
</SEGMENTGROUP>
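
As a minimal sketch of letting the parser isolate the data, the following uses the DOM parser that ships with the JDK to count how many SEGMENT elements occur as children of the segment group. The class and method names are hypothetical, and the XML string is a well-formed variant of the fragment above (quoted attribute value, no space in the DATAELEMENT tag name).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class SegmentCounter {
    /** Counts the direct child elements of the root that carry the given tag name. */
    static int countChildren(String xml, String childName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            // Skip text nodes (whitespace) and look only at element children.
            if (node.getNodeType() == Node.ELEMENT_NODE
                    && node.getNodeName().equals(childName)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<SEGMENTGROUP id=\"1\">"
                + "<SEGMENT mandatory=\"true\"><DATAELEMENT>abc</DATAELEMENT></SEGMENT>"
                + "<SEGMENT mandatory=\"false\"></SEGMENT>"
                + "</SEGMENTGROUP>";
        System.out.println(countChildren(xml, "SEGMENT")); // prints 2
    }
}
```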

Let's consider an EDI message (see Listing 1). When an EDI message is processed, both the EDI-specific tags and the user-supplied data have to be preserved.

EDI is divided into segment groups and segments, and each of the segments as well as the segment groups is mandatory or optional. Each segment can be repeated a number of times, indicated as 99 for 100 times and 999 for 1,000 times, which means 100 or 1,000 is the maximum.

XML is well suited to maintaining EDI-specific information and holding data. One way to transform an EDI message into an XML document is to convert each EDI segment name to an element tag. The properties of the segment, such as whether it's mandatory or optional, can be taken care of in the DTD; a loop counter can record the number of times the segment may occur and be placed as an attribute of the element. Identifiers can identify each of the segments. The EDI separators can be placed in the element's attribute list as separator attributes.
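
The mapping just described could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the sample segment string, the attribute names (mandatory, loop, separator), the DATAELEMENT child name, and the class itself are hypothetical and aren't taken from Listing 1 or Listing 2. It also assumes the segment name is a valid XML element name.

```java
import java.io.StringWriter;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class EdiSegmentToXml {
    /** Turns one separator-delimited segment into an element named after the segment. */
    static Element toElement(Document doc, String segment, char separator,
                             boolean mandatory, int maxRepeat) {
        String[] parts = segment.split(Pattern.quote(String.valueOf(separator)));
        Element element = doc.createElement(parts[0]); // segment name becomes the tag
        // Segment properties travel as attributes (hypothetical attribute names).
        element.setAttribute("mandatory", String.valueOf(mandatory));
        element.setAttribute("loop", String.valueOf(maxRepeat));
        element.setAttribute("separator", String.valueOf(separator));
        for (int i = 1; i < parts.length; i++) {
            Element data = doc.createElement("DATAELEMENT");
            data.setTextContent(parts[i]);
            element.appendChild(data);
        }
        return element;
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        // "NAD+BY+12345" is a made-up segment used only for this sketch.
        doc.appendChild(toElement(doc, "NAD+BY+12345", '+', true, 99));

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        System.out.println(out);
    }
}
```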

Although an EDI message could carry a DTD for ease of use, one could also use standalone, well-formed documents. Since the EDI structure itself doesn't have an overlapping hierarchy of segments, a standalone document is fine; it also saves the time spent validating against a DTD and keeps the EDI message flexible. Because the message is detached from a DTD, we can change the format as and when required. We just have to extract the right data and push it into the back end. With DTDs, there's the overhead of changing the DTD, then changing the document to reflect the change. Keeping the document merely well formed removes the overhead of maintaining DTDs and the need to send two files to the client side in case the EDI message is generated at the client.

Listing 2 shows a well-formed EDI message in XML format. One could use a DTD, parse the XML files, and keep them on the server as standalone XML files to be downloaded on request. The data could then be populated in the elements and sent back to the enterprise where, since the documents are standalone, they could be used directly to extract the data and pump it into the back-end systems.

XML for Unstructured/Flat Messages

In the case of a flat format, like an HTML form, the submitted data would have to be structured explicitly into XML, then processed. Sometimes this is overkill if the amount of data is small, because there's the overhead of structuring the message into a particular format, parsing the document, then extracting the message.
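
To make the structuring step concrete, here is a minimal sketch that wraps flat form fields in XML elements. The class, the CUSTOMER root name, and the sample field values are hypothetical, and it assumes the field names are valid XML element names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FormToXml {
    /** Escapes the characters that would otherwise break element content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Wraps each form field in an element named after the field. */
    static String toXml(String rootName, Map<String, String> fields) {
        StringBuilder xml = new StringBuilder();
        xml.append('<').append(rootName).append('>');
        for (Map.Entry<String, String> field : fields.entrySet()) {
            xml.append('<').append(field.getKey()).append('>')
               .append(escape(field.getValue()))
               .append("</").append(field.getKey()).append('>');
        }
        xml.append("</").append(rootName).append('>');
        return xml.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in the order the form supplied them.
        Map<String, String> form = new LinkedHashMap<>();
        form.put("NAME", "Abc");
        form.put("PHONE", "555-0100");
        System.out.println(toXml("CUSTOMER", form));
        // prints <CUSTOMER><NAME>Abc</NAME><PHONE>555-0100</PHONE></CUSTOMER>
    }
}
```

Even in this small form the cost mentioned above is visible: the receiver still has to parse the document and pull the same two values back out.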

XML is an easy way to demarcate data, but should be applied by considering the amount of data and whether it has any structure.

You can have XML documents conform to a specific DTD, which makes it easy to tell which kinds of documents are flowing into your system. The document type demarcates the data, so each document can be placed in the appropriate queue and handled by the appropriate application server.
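
A rough sketch of this kind of routing follows. It uses the root element name as the document type, on the assumption that incoming documents are either valid (in which case the DOCTYPE name equals the root element name) or simply well formed. The class name, the ORDER and INVOICE types, and the in-memory queues standing in for real message queues are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import javax.xml.parsers.DocumentBuilderFactory;

public class DocumentRouter {
    private final Map<String, BlockingQueue<String>> queuesByType = new HashMap<>();

    /** Associates a document type (root element name) with the queue that handles it. */
    void register(String documentType, BlockingQueue<String> queue) {
        queuesByType.put(documentType, queue);
    }

    /** Parses the document just far enough to learn its root element name. */
    static String rootName(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement().getNodeName();
    }

    /** Places the document on the queue for its type; returns false if no queue is registered. */
    boolean route(String xml) throws Exception {
        BlockingQueue<String> queue = queuesByType.get(rootName(xml));
        if (queue == null) {
            return false;
        }
        queue.add(xml);
        return true;
    }

    public static void main(String[] args) throws Exception {
        DocumentRouter router = new DocumentRouter();
        BlockingQueue<String> orders = new LinkedBlockingQueue<>();
        router.register("ORDER", orders);

        System.out.println(router.route("<ORDER><ITEM>abc</ITEM></ORDER>")); // prints true
        System.out.println(orders.size());                                  // prints 1
        System.out.println(router.route("<INVOICE/>"));                     // prints false
    }
}
```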

Figure 1 illustrates the architecture in which the data to the enterprise could be received either through HTTP or SMTP.

Using HTTP
Through HTTP the data could come in to the Web server, which would load a servlet to service the request. The servlet would then generate a Java Message Service (JMS) message object based on the type of message and route it to the appropriate queue.

Using SMTP
The messages could also come to the enterprise through SMTP, in which case the mail server would receive the message. A mail-retrieval service could be written with JavaMail and used to check the mailbox on the mail server for mail received. The mail could contain headers that indicate the presence of a particular type of message. The service could then extract the message that's sent as an attachment and, based on the header, push it to the appropriate queue to be handled.

When you can demarcate the messages this way, the approach has the following advantages:

1. Since each message is handled by a particular queue, if one of the application servers goes down, the rest of the messages can still be processed.
2. You can invoke other application servers for load sharing in case there's an increase in a particular type of incoming message.
3. Log tracking and maintenance for each type of message becomes easier.
4. It frees up the application server from having to poll constantly for incoming data to process.
5. It implements a "push" model to account for which servers are sent data, thus saving critical CPU time for the servers.

Importance of Transactions in Messaging
Having transactions built into the messaging is an added advantage. Should the application server that's listening to the queue go down, the second-level application server, working as a hot standby, could connect to the same queue and start picking up messages that weren't handled or acknowledged. Moreover, you could decide on the type of service, whether it's critical or noncritical, based on the type of message.

Most messaging systems have built-in transaction processing so they can cache unacknowledged messages and send them back to the server once it connects to the queue. The SpiritWAVE implementation of Java Messaging, for example, has such support.

The hot standby will constantly ping the application server that's servicing the messages. When the server breaks down, the hot standby will connect to the queue and start servicing the messages.
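
The takeover can be illustrated with a toy in-memory queue that remembers which messages each consumer has received but not yet acknowledged. This is a sketch under stated assumptions, not how SpiritWAVE or any other messaging product implements it: the AckQueue class, its method names, and the consumer IDs are hypothetical, and failure detection (the ping) is reduced to an explicit consumerFailed call.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AckQueue {
    private final Deque<String> pending = new ArrayDeque<>();
    private final Map<String, List<String>> unacknowledged = new HashMap<>();

    synchronized void send(String message) {
        pending.addLast(message);
    }

    /** Hands out the next message and remembers it until the consumer acknowledges it. */
    synchronized String receive(String consumerId) {
        String message = pending.pollFirst();
        if (message != null) {
            unacknowledged.computeIfAbsent(consumerId, id -> new ArrayList<>()).add(message);
        }
        return message;
    }

    synchronized void acknowledge(String consumerId, String message) {
        List<String> held = unacknowledged.get(consumerId);
        if (held != null) {
            held.remove(message);
        }
    }

    /** Puts a failed consumer's unacknowledged messages back at the head of the queue. */
    synchronized void consumerFailed(String consumerId) {
        List<String> held = unacknowledged.remove(consumerId);
        if (held == null) {
            return;
        }
        for (int i = held.size() - 1; i >= 0; i--) {
            pending.addFirst(held.get(i)); // reverse iteration preserves original order
        }
    }

    public static void main(String[] args) {
        AckQueue queue = new AckQueue();
        queue.send("m1");
        queue.send("m2");
        queue.send("m3");

        queue.acknowledge("primary", queue.receive("primary")); // m1 fully handled
        queue.receive("primary");                               // m2 received, never acknowledged
        queue.consumerFailed("primary");                        // standby notices the outage

        System.out.println(queue.receive("standby")); // prints m2
        System.out.println(queue.receive("standby")); // prints m3
    }
}
```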

Either the servers could handle the processing of messages, or the (Message) objects sent by the messaging server could encapsulate the logic to process the message data. In case the objects encapsulate the logic, all the message has to do is implement a known interface that the server will use to extract the data and connect to back-end systems. These back-end systems could be any of the following:

  • A database
  • A legacy application, in which case the data would have to be converted to a specific format
  • A dump to a filing system
  • Another application waiting for the data
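
The second option, where the message object carries its own processing logic behind a known interface, might be sketched as follows. The interface and class names (ProcessableMessage, BackEnd, CustomerMessage) and the pipe-delimited payload format are hypothetical; the back end here is just a list standing in for a database or a file dump.

```java
import java.util.ArrayList;
import java.util.List;

public class MessageDispatch {
    /** The known interface: the server needs nothing else from a message. */
    interface ProcessableMessage {
        String backEndPayload();
    }

    /** Stand-in for a database, legacy application, file dump, or waiting application. */
    interface BackEnd {
        void store(String payload);
    }

    /** A message that knows how to format its own data for the back end. */
    static class CustomerMessage implements ProcessableMessage {
        private final String name;
        private final String phone;

        CustomerMessage(String name, String phone) {
            this.name = name;
            this.phone = phone;
        }

        @Override
        public String backEndPayload() {
            return name + "|" + phone; // made-up format for this sketch
        }
    }

    /** The server only extracts the payload and passes it on. */
    static void handle(ProcessableMessage message, BackEnd backEnd) {
        backEnd.store(message.backEndPayload());
    }

    public static void main(String[] args) {
        List<String> dump = new ArrayList<>();
        handle(new CustomerMessage("Abc", "555-0100"), dump::add);
        System.out.println(dump); // prints [Abc|555-0100]
    }
}
```

With this arrangement a new message type needs only a new class implementing the interface; the server's handling code stays the same.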

Conclusion
XML has opened new avenues for processing data in a simple and straightforward manner. The inherent ability of XML documents to maintain structure has become the backbone of most messaging systems and has made it easier to demarcate data and process each type differently. XML is looked on as a major technology in future messaging systems.
