Parasoft's Dr. Adam Kolawa: "It's Time to Prevent Poorly-Written XML"
Establishing Rules and Team Policies Prevents Poorly Written XML Before It Happens

Since its inception, XML has at times been seen as the cure-all for every problem related to Web applications and integration projects. However, poorly written XML can slow down an integration project or, worse, cause it to collapse.

When developing integration systems such as Web services or any other business-to-business function, developers may encounter the following problems when writing XML:

  • Non-verifiable code - XML is supposed to be easily validated by use of Document Type Definitions (DTDs) or schemas. Frequently, however, the DTDs and schemas are themselves invalid, too complicated for XML documents to reference, or insufficient for what most businesses need to express. An XML file that does not reference a valid DTD or schema cannot be guaranteed to be valid.
  • Human-readable, yet ambiguous code - Although human readability can be seen as an advantage of XML, it can also be a problem. Nominally human-readable code isn't always understood by every human. For example, an element that has a specific meaning to one developer may be of no use, or make no sense, to another developer. Also, human-readable does not necessarily mean machine-readable. If XML code is written strictly for machine consumption, then there is no reason for having code that makes sense to humans but has no meaning to a machine.
  • Versioning problems - Maintaining multiple versions of a single document can be very difficult. Developers can either maintain a full version of the code to understand each XML format, or they can reference a different DTD for each format. Both options are possible, but each requires a lot of time and effort.
  • Vogue attitude toward XML - Many developers turn to XML simply because it is the popular language of the moment, without considering whether it is the right solution. More often than not, XML introduces more complexity than needed where a simple text file would have sufficed.
  • Chaos of standards - XML standards are still in development and are constantly shifting. Without stable standards, developers are forced to either keep up with the rapid changes or fall behind.

Preventing the use of poorly written XML is more complicated than most developers realize. The key to successfully using XML in an integration project is first understanding the inefficiencies that lead to poorly written XML, and then applying a rule-based system that establishes policies the team can adhere to. This article will outline the many drawbacks of XML, and will address how a rule-based system can prevent the use of poorly written XML in integration projects.

Understanding XML
The Extensible Markup Language (XML) is a family of technologies that describe structured data. By using XML, companies can create common information formats and share this information on the World Wide Web. For example, a company can create an XML document to exchange information about its products over the Internet. For a simple example of an XML document, see Listing 1.

XML and Its Inefficiencies
Although the example XML document in Listing 1 appears to be written correctly, how can developers be completely sure that the code is valid and well-formed, is comprehensible to other developers, and adheres to specific standards? The answer to this question lies in a rule-based system that can establish team policies and practices to prevent poorly written XML.

The following sections will outline some of the inefficiencies that can lead to problematic XML, and will address how a rule-based system can prevent the use of poorly written XML in integration projects. After all, system performance is only as good as the data received and the instructions given. If the XML contains errors, the system is more likely than not to fail.

Validating XML
One of the main benefits of XML is that it provides mechanisms for verifying document validity. There are two basic mechanisms: DTD and XML Schema. When creating an XML document, developers can reference either of these mechanisms from within the document itself. The DTD or schema that is referenced specifies exactly how the XML document is structured: which elements and attributes the document may contain, and the order in which the elements must appear.

Defining DTDs
The following is an example of a simple DTD that can be referenced by an XML document:

<!-- ProductList DTD -->
<!ELEMENT ProductList (Product)*>
<!ELEMENT Product (#PCDATA)>
<!ATTLIST Product
    color   (red|green|yellow|weird) #REQUIRED
    file    CDATA                    #REQUIRED
    id      CDATA                    #REQUIRED
    isFruit (true|false)             'true'>

To reference this DTD from an XML document, the following header can be added to the beginning of the XML document:

<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE ProductList PUBLIC "-//OnlineGrocer//ProductList//EN" "ProductList.dtd">

A DTD is a specification based on the rules of the Standard Generalized Markup Language (SGML) and provides basic verification of XML documents. DTDs provide mechanisms for expressing which elements are allowed and what the composition of each element can be. Legal attributes can be defined per element type, and legal attribute values can be defined per attribute.
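
To make these constraints concrete, the following is a minimal Python sketch that checks a document against the rules expressed by the ProductList DTD above. It is an illustration only: Python's standard library does not validate against DTDs, so the rules are mirrored by hand here, whereas a validating parser would read them from the DTD itself. The function name check_product_list and the two sample documents are hypothetical and are not taken from the article's listings.

```python
import xml.etree.ElementTree as ET

# Hand-written mirror of the ATTLIST for Product in the ProductList DTD.
# Assumption: illustrative only; a validating parser would derive these
# rules from the DTD rather than from a table in code.
DECLARED = {"color", "file", "id", "isFruit"}
REQUIRED = ("color", "file", "id")
ALLOWED_VALUES = {
    "color": {"red", "green", "yellow", "weird"},
    "isFruit": {"true", "false"},
}


def check_product_list(xml_text):
    """Return a list of rule violations; an empty list means none were found."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    problems = []
    if root.tag != "ProductList":
        problems.append(f"root element is {root.tag!r}, expected 'ProductList'")
    for index, child in enumerate(root, start=1):
        if child.tag != "Product":
            problems.append(f"child {index}: unexpected element {child.tag!r}")
            continue
        if len(child) > 0:
            # Product is declared as (#PCDATA), so it may hold text only.
            problems.append(f"Product {index}: child elements are not allowed")
        for name in REQUIRED:
            if name not in child.attrib:
                problems.append(f"Product {index}: missing required attribute {name!r}")
        for name in child.attrib:
            if name not in DECLARED:
                problems.append(f"Product {index}: undeclared attribute {name!r}")
        for name, allowed in ALLOWED_VALUES.items():
            value = child.get(name)
            if value is not None and value not in allowed:
                problems.append(f"Product {index}: {name}={value!r} is not allowed")
    return problems


# Hypothetical sample documents, for illustration only.
GOOD = '<ProductList><Product color="red" file="apple.jpg" id="1">Apple</Product></ProductList>'
BAD = '<ProductList><Product color="blue" id="2">Plum</Product></ProductList>'

if __name__ == "__main__":
    print(check_product_list(GOOD))
    for problem in check_product_list(BAD):
        print(problem)
```

The second sample breaks two of the DTD's rules: it omits the required file attribute and uses a color value outside the enumerated list. Both documents are well-formed, which is why a check of this kind is needed in addition to plain parsing.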

Defining Schemas
For an example of a simple schema that can be referenced by an XML document, see Listing 2. To reference this schema from an XML document, the xsi:noNamespaceSchemaLocation attribute can be specified on the root element, as in the following start tag:

<ProductList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:noNamespaceSchemaLocation="ProductList.xsd">

An XML schema, like a DTD, defines a set of legal elements, attributes, and attribute values. However, XML schemas provide more robust verification of XML documents. XML schemas are namespace-aware and also cover data types, data bounds, type inheritance, and context-sensitive data values, none of which are covered by DTDs.

Lack of DTD/Schema Enforcement
While referencing a DTD or schema makes it possible to verify the validity of an XML document, nothing requires developers to add the headers that reference one. In fact, developers need only follow simple syntax rules for an XML document to be "well-formed." However, a well-formed document is not necessarily a valid document. Without a reference to either a DTD or a schema, there is no way to verify whether the XML document is valid. Therefore, measures must be taken to ensure that XML documents do, in fact, reference a DTD or schema.
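
One such measure could be an automated rule that rejects any document lacking a DTD or schema reference. The following Python sketch, using only the standard library, shows one way such a rule might look. The function name grammar_reference and its return labels are hypothetical, and the check only confirms that a reference is present; it does not fetch the DTD or schema or validate the document against it.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Namespace of the XML Schema instance attributes (xsi:...).
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def grammar_reference(xml_text):
    """Classify a document as 'not well-formed', 'dtd', 'schema', or 'none'."""
    try:
        doc = minidom.parseString(xml_text)
    except ExpatError:
        # The document breaks basic XML syntax rules.
        return "not well-formed"
    if doc.doctype is not None:
        # A DOCTYPE declaration is present.
        return "dtd"
    root = doc.documentElement
    for local_name in ("noNamespaceSchemaLocation", "schemaLocation"):
        if root.hasAttributeNS(XSI_NS, local_name):
            return "schema"
    # Well-formed, but nothing to validate against.
    return "none"


if __name__ == "__main__":
    with_dtd = ('<!DOCTYPE ProductList PUBLIC "-//OnlineGrocer//ProductList//EN" '
                '"ProductList.dtd"><ProductList/>')
    with_schema = ('<ProductList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
                   'xsi:noNamespaceSchemaLocation="ProductList.xsd"/>')
    for text in (with_dtd, with_schema, "<ProductList/>", "<ProductList><Product></ProductList>"):
        print(grammar_reference(text))
```

A team policy could then treat any result other than "dtd" or "schema" as a violation, so that documents which are merely well-formed never reach the integration system unchecked.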

About Dr. Adam Kolawa
Adam Kolawa is the co-founder and CEO of Parasoft, a leading provider of solutions and services that deliver quality as a continuous process throughout the SDLC. In 1983, he came to the United States from Poland to pursue his Ph.D. In 1987, he and a group of fellow graduate students founded Parasoft to create value-added products that could significantly improve the software development process. Adam's years of experience with various software development processes have resulted in his unique insight into the high-tech industry and the uncanny ability to successfully identify technology trends. As a result, he has orchestrated the development of numerous successful commercial software products to meet growing industry needs to improve software quality, often before the trends have been widely accepted. Adam has been granted 10 patents for the technologies behind these innovative products. Kolawa, co-author of Bulletproofing Web Applications (Hungry Minds 2001), has contributed to and written over 100 commentary pieces and technical articles for publications including The Wall Street Journal, Java Developer's Journal, SOA World Magazine, and AJAXWorld Magazine; he has also authored numerous scientific papers on physics and parallel processing. His recent media engagements include CNN, CNBC, BBC, and NPR. Additionally, he has presented on software quality, trends, and development issues at various industry conferences. Kolawa holds a Ph.D. in theoretical physics from the California Institute of Technology. In 2001, Kolawa was awarded the Los Angeles Ernst & Young's Entrepreneur of the Year Award in the software category.
