Parasoft's Dr. Adam Kolawa: "It's Time to Prevent Poorly Written XML"
Establishing Rules and Team Policies Prevents Poorly Written XML Before It Happens

Page 1 of 2   next page »

Since its inception, XML has at times been seen as a cure-all for every problem related to Web applications and integration projects. However, poorly written XML can slow down an integration project or, worse, cause it to collapse.

When developing integration systems such as Web services or any other business-to-business function, developers may encounter the following problems when writing XML:

  • Non-verifiable code - XML is supposed to be easily validated by use of Document Type Definitions (DTDs) or schemas. Frequently, however, DTDs and schemas are themselves invalid, too complicated for XML documents to reference, or insufficient for what most businesses need to express. An XML file that does not reference a valid DTD or schema cannot be guaranteed to be valid.
  • Human-readable, yet ambiguous code - Although human readability can be seen as an advantage of XML, it can also be a problem. Human-readable code isn't always readable by humans. For example, an element that has a specific meaning to one developer may be of no use, or make no sense, to another developer. Also, human-readable does not necessarily mean machine-readable. If XML code is written strictly for machine consumption, there is no reason to have code that makes sense to humans but has no meaning to a machine.
  • Versioning problems - Maintaining multiple versions of a single document can be very difficult. Developers can either maintain a full version of the code to understand each XML format, or reference a different DTD for each format. Both options are possible, but each requires a lot of time and effort.
  • Vogue attitude toward XML - Many developers turn to XML simply because it is the popular language of the moment, without considering whether it is the right solution. In such cases XML often introduces more complexity than needed, where a simple text file would have sufficed.
  • Chaos of standards - XML standards are still in development and are constantly shifting. Without stable standards, developers are forced either to keep up with the rapid changes or to fall behind.

Preventing the use of poorly written XML is more complicated than most developers realize. The key to using XML successfully in an integration project is first to understand the inefficiencies that lead to poorly written XML, and then to apply a rule-based system that establishes policies the team can adhere to. This article outlines the drawbacks of XML and addresses how a rule-based system can prevent the use of poorly written XML in integration projects.

Understanding XML
The Extensible Markup Language (XML) is a family of technologies that describe structured data. By using XML, companies can create common information formats and share this information on the World Wide Web. For example, a company can create an XML document to exchange information about its products over the Internet. For a simple example of an XML document, see Listing 1.

XML and Its Inefficiencies
Although the example XML document in Listing 1 appears to be written correctly, how can developers be completely sure that the code is valid and well-formed, is comprehensible to other developers, and adheres to specific standards? The answer to this question lies in a rule-based system that can establish team policies and practices to prevent poorly written XML.

The following sections outline some of the inefficiencies that can lead to problematic XML and address how a rule-based system can prevent the use of poorly written XML in integration projects. After all, system performance is only as good as the data received and the instructions given. If the XML contains errors, the system that consumes it is likely to fail.

Validating XML
One of the main benefits of XML is that it provides mechanisms for verifying document validity. The two basic mechanisms are the DTD and the XML Schema. When creating an XML document, developers can reference either one from within the document itself. The referenced DTD or schema specifies how the document must be structured: which elements and attributes it may contain, and the order in which the elements must appear.

Defining DTDs
The following is an example of a simple DTD that can be referenced by an XML document:


<!-- ProductList DTD -->
<!ELEMENT ProductList (Product)*>
<!ELEMENT Product (#PCDATA)>
<!ATTLIST Product
    color   (red|green|yellow|weird) #REQUIRED
    file    CDATA                    #REQUIRED
    id      CDATA                    #REQUIRED
    isFruit (true|false)             'true'>

To reference this DTD from an XML document, the following header can be added to the beginning of the XML document:

<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE ProductList PUBLIC "-//OnlineGrocer//ProductList//EN"
    "ProductList.dtd">

A DTD is a specification based on the rules of the Standard Generalized Markup Language (SGML) and provides basic verification of XML documents. DTDs provide mechanisms for expressing which elements are allowed and what the composition of each element can be. Legal attributes can be defined per element type, and legal attribute values can be defined per attribute.
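
To make the ProductList DTD's rules concrete, the sketch below checks the same constraints by hand in Python, using only the standard library. It is not a DTD validator, and a validating parser would normally do this work. The function name check_product_list and the sample document are hypothetical, and the rule tables are simply transcribed from the DTD shown earlier.

```python
import xml.etree.ElementTree as ET

# Attribute rules transcribed by hand from the ProductList DTD (an illustration,
# not something a parser derives from the DTD file).
ALLOWED_COLORS = {"red", "green", "yellow", "weird"}
REQUIRED_ATTRIBUTES = ("color", "file", "id")
ALLOWED_IS_FRUIT = {"true", "false"}
DECLARED_ATTRIBUTES = set(REQUIRED_ATTRIBUTES) | {"isFruit"}


def check_product_list(xml_text):
    """Return a list of rule violations; an empty list means none were found."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "ProductList":
        problems.append(f"root element is {root.tag!r}, expected 'ProductList'")
    for index, child in enumerate(root, start=1):
        if child.tag != "Product":
            problems.append(f"child {index}: unexpected element {child.tag!r}")
            continue
        if len(child) > 0:
            # <!ELEMENT Product (#PCDATA)> allows text only, no child elements.
            problems.append(f"Product {index}: must contain only text")
        for name in REQUIRED_ATTRIBUTES:
            if name not in child.attrib:
                problems.append(f"Product {index}: missing required attribute {name!r}")
        for name in sorted(set(child.attrib) - DECLARED_ATTRIBUTES):
            problems.append(f"Product {index}: undeclared attribute {name!r}")
        color = child.get("color")
        if color is not None and color not in ALLOWED_COLORS:
            problems.append(f"Product {index}: color {color!r} is not allowed")
        # isFruit is optional; the DTD supplies 'true' when it is omitted.
        is_fruit = child.get("isFruit", "true")
        if is_fruit not in ALLOWED_IS_FRUIT:
            problems.append(f"Product {index}: isFruit {is_fruit!r} is not allowed")
    return problems


# Hypothetical sample: the second Product omits id and uses an undeclared color.
SAMPLE = """<ProductList>
  <Product color="red" file="apple.html" id="1">Apple</Product>
  <Product color="blue" file="plum.html">Plum</Product>
</ProductList>"""

for problem in check_product_list(SAMPLE):
    print(problem)
```

The first Product passes every rule, while the second is reported twice: once for the missing id attribute and once for a color outside the enumerated list.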

Defining Schemas
For an example of a simple schema that can be referenced by an XML document, see Listing 2. To reference this schema from an XML document, the xsi:noNamespaceSchemaLocation attribute can be specified on the root ProductList element, as follows:

<ProductList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="ProductList.xsd">

An XML schema, like a DTD, defines a set of legal elements, attributes, and attribute values. However, XML schemas provide more robust verification of XML documents. They are namespace-aware and also cover data types, data bounds, type inheritance, and context-sensitive data values, none of which are covered by DTDs.

Lack of DTD/Schema Enforcement
While validating against a referenced DTD or schema can establish the validity of an XML document, nothing requires developers to reference a DTD or schema at all. In fact, developers need only follow simple syntax rules for an XML document to be "well-formed." However, a well-formed document is not necessarily a valid document. If the document references neither a DTD nor a schema, there is no way to verify whether it is valid. Therefore, measures must be taken to ensure that XML documents do, in fact, reference a DTD or schema.
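
One way such a measure might look is sketched below in Python with the standard library's minidom parser. It distinguishes three cases: a document that is not well-formed, a well-formed document that references no DTD or schema, and one that does reference either. The function name classify_document and the returned labels are hypothetical, and detecting a reference does not validate the document against it; that still requires a validating parser.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Namespace of the xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes.
XSI_NAMESPACE = "http://www.w3.org/2001/XMLSchema-instance"
SCHEMA_HINT_ATTRIBUTES = ("schemaLocation", "noNamespaceSchemaLocation")


def classify_document(xml_text):
    """Report whether a document is well-formed and whether it names a DTD or schema."""
    try:
        document = minidom.parseString(xml_text)
    except ExpatError:
        return "not well-formed"
    # A DOCTYPE declaration shows up as the document's doctype node.
    has_dtd = document.doctype is not None
    root = document.documentElement
    has_schema = any(
        root.hasAttributeNS(XSI_NAMESPACE, name) for name in SCHEMA_HINT_ATTRIBUTES
    )
    if has_dtd or has_schema:
        return "references a DTD or schema"
    return "well-formed, but references no DTD or schema"


# Hypothetical sample documents for each case.
BROKEN = "<ProductList><Product></ProductList>"
BARE = "<ProductList><Product/></ProductList>"
WITH_DTD = (
    '<!DOCTYPE ProductList PUBLIC "-//OnlineGrocer//ProductList//EN" '
    '"ProductList.dtd">\n<ProductList/>'
)
WITH_SCHEMA = (
    '<ProductList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation="ProductList.xsd"/>'
)

for sample in (BROKEN, BARE, WITH_DTD, WITH_SCHEMA):
    print(classify_document(sample))
```

A team policy could run a check of this kind over every XML file in a project and reject any file that falls into the first two cases.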



About Dr. Adam Kolawa
Adam Kolawa is the co-founder and CEO of Parasoft, a leading provider of solutions and services that deliver quality as a continuous process throughout the SDLC. In 1983, he came to the United States from Poland to pursue his PhD. In 1987, he and a group of fellow graduate students founded Parasoft to create value-added products that could significantly improve the software development process. Adam's years of experience with various software development processes have given him unique insight into the high-tech industry and an uncanny ability to identify technology trends. As a result, he has orchestrated the development of numerous successful commercial software products to meet growing industry needs to improve software quality, often before the trends were widely accepted. Adam has been granted 10 patents for the technologies behind these innovative products. Kolawa, co-author of Bulletproofing Web Applications (Hungry Minds 2001), has contributed to and written over 100 commentary pieces and technical articles for publications including The Wall Street Journal, Java Developer's Journal, SOA World Magazine, and AJAXWorld Magazine; he has also authored numerous scientific papers on physics and parallel processing. His recent media engagements include CNN, CNBC, BBC, and NPR. Additionally, he has presented on software quality, trends, and development issues at various industry conferences. Kolawa holds a Ph.D. in theoretical physics from the California Institute of Technology. In 2001, Kolawa was awarded the Los Angeles Ernst & Young's Entrepreneur of the Year Award in the software category.
