By Dr. Srinivas Padmanabhuni, Akash Saurav Das
September 27, 2006 07:00 PM EDT
Given the uncertainty and volatility of today's market conditions, enterprises depend on cutting-edge technology to gain an edge over fierce competitors. At the same time, they try to extract more value from their existing IT investments. Mergers and acquisitions add to this mix of disparate applications and technologies, since each inherently brings in a different set of applications.
Better integration of these myriad applications built on different technologies clearly makes them more valuable. Using Service Oriented Architecture (SOA), enterprises can not only achieve better integration but also be future-ready as an agile enterprise that can swiftly respond to change in business processes.
XML and Its Role in SOA
XML is emerging as the lingua franca of data representation and exchange across applications interacting in an SOA world. A close look at the standards stack for SOA (Figure 1) shows that XML is the foundation for all the Web Services standards like XML Schema, SOAP, WSDL, and UDDI. These standards leverage the core concept of XML-based representations to carry out information interchange between service providers and requestors in an SOA.
Beyond the core syntactic standards of SOA shown in Figure 1, semantics is another important dimension. Communication between a service provider and a service consumer in an SOA infrastructure requires that the contents of the messages be mutually understood, which leads to the requirement of semantic interoperability.
XML helps address the semantic interoperability problems associated with working with different data formats in different applications across multiple platforms. Stakeholders in different vertical business domains have come together and defined shared XML-based vocabularies to solve the semantic interoperability issue. (See http://xml.coverpages.org for a comprehensive list of such standardization efforts.) Using XML inherently brings ease of representation since it's text-based, flexible, and extensible. The platform- and language-independence of XML has made it SOA's mainstream representation format.
SOA Performance Challenge and XML Compression Solutions
While self-describing XML-based service descriptions and messages make data exchange in an SOA easier and lend reusability and extensibility, they also increase the size of the data significantly. This is because an XML message typically contains not only the data as text but also markup describing the format of the data. Document-oriented vocabularies may even carry presentation information for the end user, such as font, size, and style. The verbosity of the text-based representation itself also increases the size of SOA payloads. As a result, XML representation increases not only data storage requirements and data transfer times but also data parsing times, creating a performance challenge for an SOA.
The following are the salient points driving the need for compressing XML documents in the context of SOA:
- Redundant data in XML documents, e.g., white space, similar node names.
- Text-based XML document sizes tend to be large.
- The need for an efficient way to store files based on XML.
- Large volumes of XML data sent over the network as SOAP payloads.
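The points above can be made concrete with a rough measurement. The following is a minimal sketch, assuming a small, hypothetical order document (the element names are invented for illustration); it compares the total size of the XML text with the size of the data values it actually carries, using only Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical pretty-printed payload; element names are invented for illustration.
SAMPLE = """<orders>
    <order>
        <customerName>Alice</customerName>
        <quantity>2</quantity>
    </order>
    <order>
        <customerName>Bob</customerName>
        <quantity>5</quantity>
    </order>
</orders>"""


def size_breakdown(xml_text: str) -> dict:
    """Return total size, data size, and the share taken by markup and white space."""
    root = ET.fromstring(xml_text)
    # Character data with surrounding white space removed: the actual values.
    data = "".join(t.strip() for t in root.itertext())
    total = len(xml_text.encode("utf-8"))
    data_bytes = len(data.encode("utf-8"))
    return {
        "total_bytes": total,
        "data_bytes": data_bytes,
        "overhead_ratio": (total - data_bytes) / total,
    }


if __name__ == "__main__":
    print(size_breakdown(SAMPLE))
```

For a document like this one, most of the bytes are repeated tag names and indentation rather than data, which is the redundancy that compression techniques exploit.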
While the issues related to data storage and data transfer times can be reduced significantly by using compression techniques, the problem of processing overhead can be addressed with both software and hardware solutions. A variety of tools and methodologies are already on the market to overcome XML processing limitations. Some prominent categories of tools and technologies that help overcome the limitations associated with using XML are briefly mentioned here:
XML Hardware
Processing large XML data consumes enormous amounts of CPU, memory, and network bandwidth. Traditionally, processors did general-purpose processing, but with the advent of XML and XML-based applications a new breed of custom acceleration processors is being developed. This specialized hardware, called XML accelerators, speeds up not only time-consuming tasks like XSL transformation and schema validation but also security-related features like encryption. XML accelerators are network devices that offload overtaxed servers by processing XML at wire speed.
Compact Representation
The key premise of this approach is to use a compact representation to reduce the size of the message being carried around. The usual textual format of XML offers no way to determine the end of a data value in advance, so the application has to examine every byte received, which costs time and hurts performance. An alternative is to transfer XML in a binary encoding such as Abstract Syntax Notation One (ASN.1). This notation is associated with standardized encoding rules such as the Basic Encoding Rules (BER) and Packed Encoding Rules (PER) and is useful for applications that have bandwidth restrictions. Such encodings can significantly reduce the time consumed and improve performance.
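To illustrate why a length-prefixed binary layout avoids scanning every byte, here is a minimal sketch of a simplified tag-length-value encoding. It is not BER or PER: the 1-byte tag, the 4-byte length, and the tag numbers are assumptions chosen for illustration, and a real ASN.1 deployment would rely on a schema and a dedicated encoder.

```python
import struct

# Hypothetical tag numbers standing in for element names agreed on by both sides.
TAG_CUSTOMER_NAME = 1
TAG_QUANTITY = 2

HEADER = ">BI"  # 1-byte tag followed by a 4-byte big-endian length
HEADER_SIZE = struct.calcsize(HEADER)


def encode_field(tag: int, value: bytes) -> bytes:
    """Encode one field as tag, length, then the raw value bytes."""
    return struct.pack(HEADER, tag, len(value)) + value


def decode_fields(buffer: bytes) -> list:
    """Walk the buffer by jumping from header to header using the stored lengths."""
    fields = []
    offset = 0
    while offset < len(buffer):
        tag, length = struct.unpack_from(HEADER, buffer, offset)
        offset += HEADER_SIZE
        fields.append((tag, buffer[offset:offset + length]))
        offset += length  # skip the value without inspecting its bytes
    return fields


if __name__ == "__main__":
    message = (encode_field(TAG_CUSTOMER_NAME, b"Alice")
               + encode_field(TAG_QUANTITY, struct.pack(">H", 2)))
    text = b"<customerName>Alice</customerName><quantity>2</quantity>"
    print(len(message), "bytes binary vs", len(text), "bytes text")
    print(decode_fields(message))
```

Because each value is preceded by its length, the decoder knows where a value ends before reading it, which is the property the textual format lacks.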
XML Cache/Component Parsers
Repeatedly used XML data can be cached to reduce XML processing overhead. Similarly, specialized XML parsers can be used that cater to the specific needs of an SOA application.
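As a rough illustration of the caching idea, the sketch below memoizes the result of parsing a document so that identical payloads are parsed only once. The `parse_order` function and the element names are hypothetical, and a production cache would also need an invalidation and sizing policy suited to the application.

```python
import functools
import xml.etree.ElementTree as ET


@functools.lru_cache(maxsize=128)
def parse_order(xml_text: str) -> tuple:
    """Parse an order document once and return an immutable summary.

    Returning a tuple rather than the mutable element tree keeps cached
    results safe to share between callers.
    """
    root = ET.fromstring(xml_text)
    return (root.findtext("customerName"), int(root.findtext("quantity")))


if __name__ == "__main__":
    doc = "<order><customerName>Alice</customerName><quantity>2</quantity></order>"
    parse_order(doc)  # parsed and stored
    parse_order(doc)  # served from the cache, no parsing
    print(parse_order.cache_info())
```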
XML Software Compression
Since XML is text-based, general-purpose compressors such as gzip, which combines Lempel-Ziv and Huffman coding, and bzip2 can be applied to it. These generic text compressors are effective and achieve very good compression ratios. They work well for sequential data, but unlike normal text, XML data is tree-structured. XMILL from AT&T is a compressor designed specifically for XML. It regroups similar XML nodes and then uses conventional compressors such as gzip to compress the regrouped result. A comparison of the salient features of gzip and XMILL follows:
gzip:
- Available in both open source and commercial implementations
- Provides a good compression rate
- Free from patented algorithms
- Knowledge of the document structure isn't needed
XMILL:
- Better compression rate compared to gzip (by a factor of two)
- Separates structure from content
- Moderately faster than gzip
- Three types of compressors available:
  - Atomic compressors for the basic data types
  - Combined compressors
  - User-defined compressors
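The difference between the two approaches can be sketched in a few lines of Python. The first function compresses the document as plain text with the standard `gzip` module. The other two illustrate the idea of separating structure from content and grouping similar values before compression. This is a simplified sketch and not XMILL's actual container format: it groups values by element name only, ignores attributes and mixed content, and omits the decompressor.

```python
import gzip
import xml.etree.ElementTree as ET
from collections import defaultdict


def gzip_xml(xml_text: str) -> bytes:
    """Compress the document as plain text, with no knowledge of its structure."""
    return gzip.compress(xml_text.encode("utf-8"))


def split_structure_and_content(xml_text: str):
    """Separate the tag skeleton from the text values, grouping values by element name."""
    root = ET.fromstring(xml_text)
    structure = []                  # sequence of open and close markers
    containers = defaultdict(list)  # element name -> its text values, in document order

    def walk(node):
        structure.append("<" + node.tag)
        text = (node.text or "").strip()
        if text:
            containers[node.tag].append(text)
        for child in node:
            walk(child)
        structure.append(">")

    walk(root)
    return structure, dict(containers)


def compress_grouped(xml_text: str) -> dict:
    """Compress the structure and each value container separately with gzip."""
    structure, containers = split_structure_and_content(xml_text)
    parts = {"#structure": gzip.compress(" ".join(structure).encode("utf-8"))}
    for name, values in containers.items():
        parts[name] = gzip.compress("\n".join(values).encode("utf-8"))
    return parts


if __name__ == "__main__":
    doc = ("<orders><order><customerName>Alice</customerName><quantity>2</quantity></order>"
           "<order><customerName>Bob</customerName><quantity>5</quantity></order></orders>")
    print(len(doc), "bytes raw,", len(gzip_xml(doc)), "bytes gzipped")
    print(split_structure_and_content(doc))
```

Grouping places similar values, such as all quantities, next to each other, which tends to help a conventional compressor on large documents. On a tiny sample like this one the fixed per-stream overhead of gzip dominates, so no size benefit should be expected.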
New specifications are also being developed to solve the problem of exchanging large documents between the service provider and the consumer. They address the problem of carrying binary data efficiently within an XML message.
The SOAP Message Transmission Optimization Mechanism (MTOM) describes how XML-binary Optimized Packaging (XOP) is layered onto the SOAP HTTP transport. It uses XOP to let SOAP bindings speed up data transmission by selectively encoding portions of the XML message. However, MTOM sends a MIME package rather than plain XML, so it trades the overhead of base64 encoding for the overhead of MIME processing.
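The cost that this kind of packaging avoids is easy to quantify. Binary data embedded directly in XML text has to be base64-encoded, which turns every 3 bytes into 4 characters. The sketch below, using a randomly generated payload as a stand-in for a real attachment, measures that inflation with Python's standard `base64` module.

```python
import base64
import os


def base64_overhead(payload: bytes) -> float:
    """Return the encoded-to-raw size ratio for a binary payload."""
    encoded = base64.b64encode(payload)
    return len(encoded) / len(payload)


if __name__ == "__main__":
    # Random bytes stand in for a binary attachment such as an image.
    binary_attachment = os.urandom(3000)
    print(base64_overhead(binary_attachment))
```

The ratio comes out at roughly 4/3, so an inline binary payload grows by about one third before any XML markup is added around it.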
The Resource Representation SOAP Header Block (RRSHB) lets a sender include in the message all the data needed to process it, by carrying a representation of a Web resource as part of the SOAP message. It is intended for cases where the receiver's access to the resource is restricted or where fetching it separately would add network overhead.
Conclusion
SOA infrastructure relies heavily on XML as its lingua franca, and effective SOA performance management requires efficient ways of handling XML. XML compression techniques can go a long way toward meeting the SOA performance challenge. The needs of the specific application are decisive in choosing among the techniques mentioned in this article.
References
- XMILL http://sourceforge.net/projects/xmill
- gzip www.gzip.org
- DataPower XML hardware www.datapower.com/products/xa35.html
- Sarvega hardware www.sarvega.com/xml-security-products.html
- MTOM www.w3.org/TR/soap12-mtom/
- RRSHB www.w3.org/TR/soap12-rep/
- XOP www.w3.org/TR/soap12-mtom/#XOP
Copyright © 2006 SYS-CON Media, Inc. — All Rights Reserved.
More Stories By Dr. Srinivas Padmanabhuni
Dr. Srinivas Padmanabhuni is a principal researcher with the Web Services Centre of Excellence in SETLabs, Infosys Technologies, and specializes in Web Services, service-oriented architecture, and grid technologies alongside pursuing interests in Semantic Web, intelligent agents, and enterprise architecture. He has authored several papers in international conferences. Dr. Padmanabhuni holds a PhD degree in computing science from University of Alberta, Edmonton, Canada.
More Stories By Akash Saurav Das
The authors are interning and/or working as part of the Web Services COE (Center of Excellence) for Infosys Technologies, a global IT consulting firm, and have substantial experience in publishing papers, presenting papers at conferences, and defining standards for SOA and Web services. The Web Services COE specializes in SOA, Web services, and other related technologies.