Development Tools for XML Applications

XML is object code without source code. There can be no development tools for XML until we find a way of creating source code.

"There's no such thing as an XML application." A strong statement, perhaps, but what do we mean when we talk about an XML application? Is a publishing system that relies on XML to do its work an XML application? Can we apply the term to a B2B marketplace where all the processes in a transaction can be defined by DTDs and all data flowing around the system is in XML? Is a content syndication system an XML application? Or a foreign exchange business line in an investment bank?

Typically, all of the above use XML extensively. But what do they have in common? And how do we "develop" these systems? Do we "develop" the XML parts, or do they fall out of the development of other parts? After all, XML isn't a programming language. XML applications are built in C++, Java, or Visual Basic (for example).

The Common Ingredient
The common ingredient in all XML applications is, of course, the XML. The data flowing around in the system is in an XML format. As such, I would argue that people don't currently develop XML applications. They develop software systems that use XML to achieve certain aims.

For example, XML is the glue in EAI (enterprise application integration), gluing together pieces of software that probably have nothing to do with XML. Publishing systems use XML to provide single-source to multiple-output formats, but the systems themselves aren't "developed" in XML, nor do they care very much how the XML has been defined. Publishing systems are usually built using a combination of technologies that can store, communicate, and manipulate XML, but that aren't intrinsically XML technologies.

Why Do We Need XML Application Development Tools?
Why do we care about XML application development? We care because putting an industrial-strength XML-based application together is complex and error-prone. The veterans of this industry will say that they do everything in vi or Notepad - and the implication is that development tools don't add much value, and perhaps can't even be trusted.

Explosion of Complexity
I disagree. XML allows you to develop sophisticated systems very rapidly, even if you do it all in vi (presuming you're working with a clean sheet). But once the system starts to take shape, you need help in the form of powerful software to build the XML pieces. This is because when you choose to use XML in your application, you're effectively taking single objects that are important to your organization, such as a Transaction ID, and referring to them in dozens, hundreds, or even thousands of places by the time you've put the system together (see Figure 1).

This is denormalization to a dangerous degree. The Transaction ID will find its way into schema fragments, be passed to Java classes, be described in stylesheets, be transformed by XSLT, and so on. Clearly, unless you get it right the first time (and I assure you that never happens), you need development tools to help manage the complexity.

Adding Value
Let's examine the value that XML development tools might add. Is it time-to-market? Certainly. If a software program can fill in the missing pieces as you do your work, if a tool can catch errors, or if a tool can store pieces of work for reuse, you save time. Compared with doing everything in vi or Notepad, using a development tool is like using a word processor instead of a typewriter.

Another value-add is the ability to present a visual representation of something that's obscure in a clearer context, such as a file full of XSLT declarations. If you can present a source schema on one side of the screen, a result schema on the other, and in the middle display all the links relating source elements and attributes to result elements and attributes, you have a powerful representation of what the resulting XSLT will achieve for you. (You also don't need to learn XSLT to an expert level before attempting some fairly complicated transformations.)
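The link view described above amounts to a mapping from source names to result names, and a stylesheet can be generated from such a mapping. The following is a minimal sketch under stated assumptions, not a description of any particular product: the function `build_xslt`, the `Order` and `Invoice` roots, and the element names are all hypothetical, and only flat, one-to-one links are handled.

```python
import xml.etree.ElementTree as ET
from typing import Mapping

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def build_xslt(source_root: str, result_root: str, links: Mapping[str, str]) -> str:
    """Generate a stylesheet that copies the text of each linked source
    element into the result element it is mapped to (hypothetical helper)."""
    ET.register_namespace("xsl", XSL_NS)
    sheet = ET.Element(f"{{{XSL_NS}}}stylesheet", {"version": "1.0"})
    template = ET.SubElement(
        sheet, f"{{{XSL_NS}}}template", {"match": f"/{source_root}"}
    )
    result = ET.SubElement(template, result_root)
    for source_name, result_name in links.items():
        # One link in the visual mapping becomes one xsl:value-of instruction.
        target = ET.SubElement(result, result_name)
        ET.SubElement(target, f"{{{XSL_NS}}}value-of", {"select": source_name})
    return ET.tostring(sheet, encoding="unicode")


# Hypothetical links between an order format and an invoice format.
stylesheet = build_xslt(
    "Order", "Invoice", {"TransactionID": "Reference", "Amount": "Total"}
)
print(stylesheet)
```

Printing the generated stylesheet next to the mapping is also the educational effect in miniature: the user sees which XSLT instruction each link produces.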

Subliminal Education
Such tools can be educational, which is another great way of adding value. Having mapped definitions from one side to another, a good system should be able to generate the resulting XSLT and show it to you in a window, which will rapidly and subliminally train an attentive user in the use of XSLT. Compared with doing everything in vi or Notepad, using a development tool to conceal and enable complexity is like using a CAD/CAM drawing package instead of pen and ink.

Surely these are all very good reasons for using XML development tools.

What Should a Development Environment for XML Contain?
A development environment for XML allows you to develop XML. This sounds logical, even simplistic, but ask yourself: "What is the XML that we're trying to develop?" After all, XML comprises many things, not all of them officially sanctioned by the W3C. XML relies on a large range of non-XML technologies in order to function (think of CSS, for example). Where does the true XML flavor start and the non-XML flavor stop?

Essential Ingredients
To "develop" XML, you need to develop or use any of the following:

  • XML document instances
  • Schemas, such as DTDs, Schematron, or XML Schemas (XSD)
  • XSLT for transformations
  • XSL-FO for prepress applications
  • XLink and XPath expressions
  • Various queries using any of a range of querying languages or grammars
  • CSS stylesheets
  • Scripts (Perl, Python, Omnimark, etc.)
  • And so on...

    Essential Tools
    The "tools" that will help you develop or work with the above list are all essentially graphical user interfaces around the files and fragments that come out of the end of the development effort, and include:

  • Schema editor
  • XSLT editor
  • Stylesheet editor
  • Script development tools
  • And so on...

    Essential Infrastructure
    The infrastructure that needs to exist before any of these can be worked with includes:

  • Parser
  • XSLT transformation engine
  • Browser
  • XML (document) editor
  • And so on...

    Essential Standards
    The development environment must enable applications to work with accepted standards, such as DOM3 and SAX.
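To make the two standards concrete, here is a small sketch that reads the same value through a DOM-style tree and through SAX events, using the parsers in the Python standard library. The document and element names are hypothetical, and `xml.dom.minidom` implements only a subset of the DOM, so this illustrates the two programming styles rather than DOM3 conformance.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical document used only for this illustration.
DOCUMENT = "<Order><TransactionID>42</TransactionID><Amount>9.50</Amount></Order>"

# DOM style: the whole document is loaded into a tree that can be navigated.
dom = minidom.parseString(DOCUMENT)
dom_value = dom.getElementsByTagName("TransactionID")[0].firstChild.data


class TransactionIdHandler(xml.sax.ContentHandler):
    """SAX style: the parser pushes events and we keep only the text we need."""

    def __init__(self):
        super().__init__()
        self.inside = False
        self.value = ""

    def startElement(self, name, attrs):
        self.inside = name == "TransactionID"

    def endElement(self, name):
        self.inside = False

    def characters(self, content):
        if self.inside:
            self.value += content


handler = TransactionIdHandler()
xml.sax.parseString(DOCUMENT.encode("utf-8"), handler)
sax_value = handler.value

print(dom_value, sax_value)
```

The DOM version is shorter but holds the entire document in memory; the SAX version needs more code but processes the document as a stream.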

    'Nice-to-Haves'
    The "nice-to-haves" in this picture might include:

  • Content management system
  • System for data-binding your XML into Java classes
  • Web application server
  • Web services development tools
  • And so on...

    We're rapidly arriving at an extremely complex picture of the requirements for an XML development tool, yet the picture is murky. Is it actually possible to combine all of the above into one tool? And if so, would anybody buy it or use it? What are we going to develop with these tools?

    IDE
    Let's look at the tools currently on the market. The term IDE (integrated development environment) is used by many manufacturers in the XML community these days. Products such as XML Spy from Altova and TurboXML from TIBCO/Extensibility spring to mind. The term environment represents a collection of tools in this context.

    We've been conditioned by such products to interpret XML development as a means of creating DTDs, documents, stylesheets, and transformations. Yet when I create a DTD, then write a document that validates against that DTD, then store that document in a content management system, have I developed XML? Arguably, yes. But to what end? The XML does nothing without an infrastructure to use, publish, process, and communicate it.

    Furthermore, as soon as I want to use an element in various places and for various purposes, I have a problem. Powerful software development environments provide the functionality to support true object reuse, inheritance, versioning, backups, and a permission system. The XML IDEs on the market at the moment don't.

    If I create an XML-driven system that relies on a Transaction ID element (as described earlier), an XML IDE will help me put it into dozens of schema fragments, XSLT files, stylesheets, and so on. But the IDE won't manage the fact that the Transaction ID is actually one object that has been used in all these different places.
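The missing capability can be pictured as a "where-used" index. The sketch below is an assumption-laden illustration: the artifacts are hypothetical and held in memory, the function `where_used` is invented for this example, and matching attribute values by substring is deliberately crude (a real tool would parse XPath expressions and resolve schema references).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical artifacts, kept in memory so the sketch is self-contained.
ARTIFACTS = {
    "order.xsd": '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
                 '<xs:element name="TransactionID" type="xs:integer"/>'
                 '</xs:schema>',
    "invoice.xsl": '<xsl:stylesheet version="1.0" '
                   'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
                   '<xsl:template match="/Order">'
                   '<Ref><xsl:value-of select="TransactionID"/></Ref>'
                   '</xsl:template></xsl:stylesheet>',
    "sample.xml": "<Order><TransactionID>42</TransactionID></Order>",
    "customer.xml": "<Customer><Name>Ann</Name></Customer>",
}


def where_used(name, artifacts):
    """Map each artifact to the places where `name` appears, either as an
    element tag or inside an attribute value (crude substring match)."""
    usage = defaultdict(list)
    for filename, text in artifacts.items():
        for element in ET.fromstring(text).iter():
            local_tag = element.tag.rsplit("}", 1)[-1]  # drop any namespace
            if local_tag == name:
                usage[filename].append(f"element <{local_tag}>")
            for attr, value in element.attrib.items():
                if name in value:
                    usage[filename].append(f"{local_tag}/@{attr}")
    return dict(usage)


index = where_used("TransactionID", ARTIFACTS)
for filename, places in index.items():
    print(filename, places)
```

Even this toy index shows the point: the schema, the stylesheet, and the instance document all depend on the same object, and nothing in the files themselves records that fact.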

    This is very disturbing from a software developer's perspective, especially if you're used to an object-oriented development environment and accustomed to the notions of, for example, single source and extension of object classes.

    XML as a 'Language' Is Weird
    The adoption of XML in IT is now universal, from finance to pharmaceuticals, aerospace to defense. XML is mainstream. Nobody can afford downtime, and systems must support the business process flawlessly. In the pharmaceutical sector the legal repercussions of a mistake due to a problem in the publishing process could mean lawsuits totaling many millions of dollars. In the stock markets two minutes of downtime could represent millions of dollars in lost opportunity for an investment bank.

    No Source Code
    If we're going to work with XML in these environments, we should treat it with as much respect as all the other parts of the system. We should store the source code in a source control system, and apply versioning, branching, permissions, automated testing functions, and so on. But (I hear you say) XML doesn't lend itself to such management. There's no source code!

    Compared with third-generation programming languages (3GL) such as C, COBOL, or Fortran, XML is very strange. Of course, comparing XML with 3GL isn't entirely fair. XML isn't a language in the traditional sense of programming languages. XML lets you use words that describe your business and apply constraints to those words. XML has no source code in the way that C++ or Java can be said to have source code, so we can't store and manage source code for XML. There are no XML compilers and development studios that provide the kind of functionality that, say, a Java developer would expect from an Enterprise version of Borland's JBuilder product.

    However, it's this very inability to treat XML as we treat traditional programming languages that gives us so many management challenges in mission-critical environments. XML needs source, it needs powerful development environments and compiler-type technology, and it needs infrastructure to manage the development environment.

    XML Is Impossible to Model
    In complex IT environments, models are important. Models allow us to conceptualize the components in a system and generate the pieces that we need to develop software systems. For example, from a CASE (Computer Aided Software Engineering) tool or UML (Unified Modeling Language) modeling tool, I can generate Create Table scripts and stored procedures that build my database environment for me. Models simplify the work of application integrators and allow people who care about business processes to relate the functioning of the organization to the IT infrastructure supporting the organization.

    XML Is a Runtime Expression of Something Else
    As seen in Figure 2, when two processes exchange information in XML:

    1. The XML represents a small part of a process model.
    2. It's probably a subset of a database schema.
    3. And it probably contains temporary values generated for the purpose of the current process but not likely to be stored.

    The XML is structured according to a format that has been agreed on by the programmers or the designer or the supplier of some of the software involved (for example, CommerceOne). The model describing this exchange of data might be found in UML or a CASE tool, or be derived from a group of functions acting on data that can be described by a database schema. The model for the XML document passing between the two processes doesn't exist; it doesn't need to because well-formed XML obeys a set of rules that render the model redundant.

    Serious Problems
    At least, that's the idea. That is also the reason so many organizations are beginning to run into serious problems with XML development. The following problems arise for the developer:

  • Implicit validation leaves no way of enforcing the contract between P1 and P2 other than to return an error and fail when things go wrong.
  • System documentation should exist to describe the contract between the two processes (and a formal syntax is preferred when describing XML).
  • There's no "model" of the communication, therefore no way of understanding in a syntax-free way the XML part of the process between P1 and P2.
  • Without a formal description of the communication, when one of the elements in the XML document is affected by a change elsewhere in the system, you have no way of knowing that the process between P1 and P2 is probably broken.
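As an illustration of the first point, the contract between P1 and P2 can be written down and checked explicitly before P2 processes a message. This is a minimal hand-rolled sketch, not schema validation: `ORDER_CONTRACT`, `check_contract`, and the element names are hypothetical, and a production system would more likely validate against a DTD or XML Schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical contract: child element name -> converter that must accept its text.
ORDER_CONTRACT = {"TransactionID": int, "Amount": float}


def check_contract(document, root_name, contract):
    """Return a list of contract violations; an empty list means P2 may proceed."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError as error:
        return [f"not well-formed: {error}"]
    if root.tag != root_name:
        return [f"expected root <{root_name}>, got <{root.tag}>"]
    problems = []
    for name, convert in contract.items():
        child = root.find(name)
        if child is None:
            problems.append(f"missing <{name}>")
            continue
        try:
            convert(child.text or "")
        except ValueError:
            problems.append(f"<{name}> has invalid value {child.text!r}")
    return problems


good = "<Order><TransactionID>42</TransactionID><Amount>9.50</Amount></Order>"
bad = "<Order><TransactionID>42-2002</TransactionID></Order>"

print(check_contract(good, "Order", ORDER_CONTRACT))
print(check_contract(bad, "Order", ORDER_CONTRACT))
```

The check reports every violation at once instead of failing on the first, which gives P1 a usable error message rather than a bare failure.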

    A Model of the Document Is Not a Solution
    To solve the foregoing problems, we need to model the document in a schema or DTD. The "model" of an XML document is its schema (see Figure 3).

    However, having a schema simply isn't sufficient. Think back to the Transaction ID object mentioned earlier. I care about transaction IDs because I'm building a software system for a commercial application - an insurance claims system, say, or a foreign exchange business line application for an investment bank. The Transaction ID isn't something that starts life in XML - it's probably there because of gigabytes of legacy data, existing software applications, and a UML model on somebody's desktop. It turns into XML because I'm using XML to glue all the different pieces of my system together. And my transaction ID travels around and touches every system, every program, every database....

    My company acquires another company and we start to integrate our IT infrastructures. My transaction ID is an integer. The other company's transaction ID is an integer plus a timestamp. I've built my system using XML; my transaction ID occurs in thousands of different places and I have no way of knowing where. I don't have a model of my XML. The upside of XML when I built the system was the phenomenally short time-to-market for a highly sophisticated system. The downside is that the result is difficult to maintain and there are inherent risks.

    A Document-centric Example
    Imagine the publishing system at a pharmaceutical company. My project scope is the packet of tablets that a pharmacist sells to the public. I have four documents: a label on the drugs, a label on the box, a sheet of information inside the box for the customer, and a sheet of information for the doctor who prescribes the drugs. I have 300 products. I sell in 130 countries. I need to publish in at least 30 languages. I have to deal with the local legislation and cultural requirements of each country. XML offers me the chance to build the perfect solution (and nothing else can).

    One of my attributes is a piece of text describing a certain warning in a certain set of circumstances for pregnant women at a certain stage in their pregnancy. By the time I've rolled out my product in 130 countries and 30 languages, a complex authoring and publishing environment will have been built to process the attribute, and the attribute itself will have been used or referred to in potentially millions of places. There's no way of knowing where, or what the impact will be if a related property needs to change. Not only is this dangerous (if an error in my publishing system causes a life to be lost, is this XML-induced death?), but you can't forecast budgets for unpredictable outgoings.

    How can I manage the complexity of such a system? This is where a 3GL-like XML development environment could significantly reduce the risks and costs involved.

    Creating Source Code for XML Environments
    Let's examine what it would take to create a 3GL-like development environment for XML. First, we need source code. If a schema is neither the source code, nor the model, for an XML document, what else can play this role? The answer is that we need a way of modeling the model. In other words, we should create an object model that can store enough information about a structure to be able to generate the pieces needed in the XML-based application. It should be possible to generate schemas, stylesheets, and the like from an object model.
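A small sketch may help make "generating the pieces from an object model" tangible. Everything here is an assumption made for illustration: the `BusinessObject` class, its fields, and the two generators are hypothetical, and a real object model would carry far more information (cardinality, constraints, documentation).

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

XS = "http://www.w3.org/2001/XMLSchema"


@dataclass(frozen=True)
class BusinessObject:
    """One entry in the object model: defined once, generated many times."""
    name: str      # element name used in XML
    xsd_type: str  # datatype used when generating a schema
    label: str     # human-readable label used when generating a stylesheet


def to_schema(objects):
    """Generate an XML Schema that declares one global element per object."""
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    for obj in objects:
        ET.SubElement(
            schema, f"{{{XS}}}element", {"name": obj.name, "type": obj.xsd_type}
        )
    return ET.tostring(schema, encoding="unicode")


def to_css(objects):
    """Generate one CSS rule per object that prints its label before the value."""
    return "\n".join(
        f'{obj.name}::before {{ content: "{obj.label}: "; }}' for obj in objects
    )


MODEL = [
    BusinessObject("TransactionID", "xs:integer", "Transaction"),
    BusinessObject("Amount", "xs:decimal", "Amount"),
]

schema_text = to_schema(MODEL)
css_text = to_css(MODEL)
print(schema_text)
print(css_text)
```

Because both outputs are derived from `MODEL`, changing the type or label of `TransactionID` in one place changes every generated artifact on the next run.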

    Enabling Object Reuse and Inheritance
    An extra dimension is required for true object reuse and inheritance (see Figure 4). For example, to treat a transaction ID as one object that has been used in multiple places, we need a model of the model of the model of the document. That is, from an object model we need to be able to assemble structures (models of schemas) from which we can generate schemas and other property files, which describe the validation rules and process requirements of a document.

    The structures we assemble are like contexts in which an object is used. In context one, A contains B followed by C, which contains D followed by E. In context two G contains A, which contains H followed by B. If I know that A in any particular context is actually a reused object from a model where it occurs as one and only one object, I can apply true object reuse and inheritance.
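The two contexts just described can be sketched as structures that refer to shared model objects rather than copying them. This is an illustrative sketch only: `ModelObject`, `Node`, and the `dtd` generator are hypothetical names, each context is assumed to become its own DTD, and leaf elements are assumed to hold text.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class ModelObject:
    """An object that exists exactly once in the model."""
    name: str


@dataclass
class Node:
    """One use of a model object inside a context (a structure)."""
    obj: ModelObject
    children: list = field(default_factory=list)


def dtd(root):
    """Generate DTD element declarations for one context."""
    lines = []

    def visit(node):
        if node.children:
            content = ", ".join(child.obj.name for child in node.children)
        else:
            content = "#PCDATA"  # assumption: leaves hold text
        lines.append(f"<!ELEMENT {node.obj.name} ({content})>")
        for child in node.children:
            visit(child)

    visit(root)
    return "\n".join(lines)


A, B, C, D, E, G, H = (ModelObject(n) for n in "ABCDEGH")

# Context one: A contains B followed by C, which contains D followed by E.
context_one = Node(A, [Node(B), Node(C, [Node(D), Node(E)])])
# Context two: G contains A, which contains H followed by B.
context_two = Node(G, [Node(A, [Node(H), Node(B)])])

# Renaming the single model object is picked up by every context that uses it.
A.name = "Transaction"
print(dtd(context_one))
print(dtd(context_two))
```

Both contexts hold a reference to the same `A`, so the rename appears in both generated DTDs without any search-and-replace across files.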

    The next hurdle is versioning of objects. This is where the underlying technology becomes very important. In any sophisticated source control environment, developers can check out an object, work on it, and check it back in again. This activity is controlled by a versioning mechanism that allows updates to happen on a branch of the parent code line. Full permissions and transaction support need to be applied. Multiple variants of a code line are possible. Published versions are identified by a time line that essentially picks up the relevant pieces and freezes them into a state that cannot be violated.
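The check-out, check-in, and publish cycle can be sketched with a toy in-memory repository. This is a deliberately simplified assumption, not a description of any real product: the `Repository` class and its methods are hypothetical, an exclusive lock stands in for the permission system, and branching and transactions are left out entirely.

```python
import copy


class Repository:
    """A toy, in-memory stand-in for repository technology (hypothetical API)."""

    def __init__(self):
        self._history = {}      # object name -> list of checked-in versions
        self._checked_out = {}  # object name -> user holding the check-out
        self._releases = {}     # release label -> {object name: version number}

    def add(self, name, definition):
        self._history[name] = [copy.deepcopy(definition)]

    def check_out(self, name, user):
        if name in self._checked_out:
            holder = self._checked_out[name]
            raise PermissionError(f"{name} is checked out by {holder}")
        self._checked_out[name] = user
        return copy.deepcopy(self._history[name][-1])

    def check_in(self, name, user, definition):
        if self._checked_out.get(name) != user:
            raise PermissionError(f"{user} has not checked out {name}")
        self._history[name].append(copy.deepcopy(definition))
        del self._checked_out[name]

    def publish(self, label):
        """Freeze the current version number of every object under a label."""
        self._releases[label] = {n: len(v) - 1 for n, v in self._history.items()}

    def get(self, name, release=None):
        """Return the latest version, or the version frozen in a release."""
        version = -1 if release is None else self._releases[release][name]
        return copy.deepcopy(self._history[name][version])


repo = Repository()
repo.add("TransactionID", {"type": "integer"})
repo.publish("1.0")

draft = repo.check_out("TransactionID", "alice")
draft["type"] = "integer plus timestamp"
repo.check_in("TransactionID", "alice", draft)

print(repo.get("TransactionID"))
print(repo.get("TransactionID", release="1.0"))
```

The published release keeps returning the integer definition after the object has moved on, which is the "frozen state" the text describes.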

    The only technology that can support this level of control over source code is repository technology. In other words, everything about the model of the model of the model of the document needs to be captured and managed in a database, which requires a very sophisticated approach to metadata. Get this piece of the puzzle right and XML development will take a quantum leap forward! To my (admittedly biased) mind, only one product anywhere in the world is capable of providing such support - CorteXML from Barbadosoft, which was designed to enable high-speed change in complex XML-based environments.

    Conclusion
    Currently, XML development environments do not provide the level of developer support customary in a 3GL programming environment. The question we need to answer is, Do we need 3GL-like XML development environments? For simple systems, probably not. But for complex systems I would argue that we cannot do without them.

    About Jim Gabriel
    Jim Gabriel has authored tens of thousands of pages of technical documentation, ranging from entry-level tutorial material to programmers' reference manuals. He is literate in XML, SGML, and XSL, among others.

  • YOUR FEEDBACK
    Jim Gabriel (author of the article) wrote: I am prompted to contribute to this thread by the words "uninformed" and "badly researched". This article doesn't compare existing XML IDE products, nor does it set out to undermine the achievements of the existing tools vendors. Rather, it focuses on the issues that arise when managing the development and maintenance of complex systems that use XML. Having researched this subject and the XML tools market exhaustively since early 1999 (I was an early victim of unmanageable complexity), my conclusion is that the most serious issues (i.e. the ones that cost us time and money) are due to XML being only a part of the larger picture in an IT system -- and unfortunately, it's the part for which we don't have 'source' code or powerful configuration management systems. I deliberately avoided doing a feature-by-feature comparison of the (largely excellent) XML IDE products for that reason. Th...
    John Williams wrote: It's nice to read an article which recognizes that doing stuff with XML involves more than just XML. Sure, you can do all kinds of cool things with XML, but you need parsers, processors and data extraction code before you can even start and I've never seen any of those written in XML. The quality of tools for working with XML has improved considerably over the last year or so but they still have a way to go. To answer Mr Falk's response, I will certainly try out XML Spy 4.3, but in all fairness to the author it was only announced today, so it seems a bit harsh to label the article "badly researched".
    Alexander Falk wrote: I don't mind the shameless plug for CorteXML, but the article is badly researched. XML Spy 4.3 provides ALL the features mentioned in the author's wishlist and more!
    John Garvey wrote: No shameless plug intended, however, my company has developed and is developing what we believe are almost "pure" XML applications. Our core technology is an integrated browser-accessible, XML server that maps to static database tables of relational databases, DTD editor, document editor, XSL editor, with a rich set of APIs that use XPath notation for storing and accessing data using dynamic document construction based on URL calls for content. We have built an end-to-end clinical data management platform for the pharmaceutical industry, and using our API toolkit are rapidly developing completely XML-centric applications with a factor of 10 efficiency over traditional programming methods. ALL data is stored in a relational database, and well-formed, type-validated XML is delivered to the browser, using sophisticated stylesheets. We can work with any and all XML "dialects" at the same time, and...
    Norm Samuelson wrote: Your article says there are no XML applications. I think I have an application that might qualify. At Lawrence Livermore National Lab, we have a GUI (named Cyclops) that is written in Java, and uses XML not just for its primary input and output, but also in associated "Parameter Description" files. Cyclops is a GUI for a growing bunch of large physics applications. Each has MANY parameters, and building a GUI for any one is a big job. Cyclops takes them all on through its flexible structure, which is based on XML. The interface is built from the parameter description file(s) chosen by the user when he runs Cyclops. It is even possible to have one XML file that describes a physics problem to be worked on by different applications, for comparison.
    Kevin Brown wrote: There are more tools than just those from BarbadoSoft. Lightspeed Interactive, which purchased the code for the Astoria Content Management System from Xerox, has such a product. This software has been managing complex change in document-oriented environments for many years with over 150 Fortune 1000 company installs. It was designed and implemented in the days of SGML and has a complete web services interaction layer for complex interaction and development in today's more transaction-oriented environment. Lightspeed offers this software in our iENGINE applications. Barbadosoft is in fact a partner of ours in development for certain applications.