Scripting in XML: A New Standard

As XML becomes accepted as the format for document markup, the demand for XML tools is also increasing. In particular, there's a need for a tool that can be easily programmed to perform any number of general reporting and editing operations on XML files. In the world of ASCII text, this role is fulfilled by scripting languages such as Perl and Python; such scripts sit behind many Web servers and are used in almost any situation where business requirements change so quickly that easy modification is of prime importance. It's entirely possible to use add-ons or modules that enable the use of Perl or Python with XML, but the fact remains that the native syntax of these languages has been very carefully designed for use with plain text. Also, if XML really is the ideal format for markup, the advantages of using XML syntax ought to apply just as much to a programming language as to documents.

One programming language that's written in XML, for XML, is XSLT (Extensible Stylesheet Language Transformations). It was originally designed as part of the XSL project, specifically for rearranging XML documents prior to their conversion into presentation formats such as PDF, HTML, and DVI. However, because it's written in XML and there's such a demand for a general-purpose XML scripting language, XSLT has been proposed for this role. There are, though, many aspects of XSLT that make it awkward for those who don't have a background in publishing and stylesheets. In particular, XSLT is based on a model in which a series of templates describes rules for converting a read-only input document into an output document. XSLT does have variables, but once bound they cannot be reassigned, so they are not variables in the traditional programming sense of the word. This is difficult for programmers accustomed to reading a document into a set of variables, modifying the variables as necessary, and then serializing them to a new document.

However, there's no reason why alternative languages should not be developed and used. One such language is XML Script, designed by DecisionSoft Ltd. XML Script was designed as an XML-based alternative to Web scripting languages such as Perl, PHP, and ColdFusion, and as such its "feel" is much more familiar to programmers used to a Web programming environment. However, the usefulness of the original version of XML Script (embodied in products such as X-Tract v1) was limited by the fact that it was designed before the advent of Namespaces, XPath, and many of the other technologies associated with XML.

This article seeks to show how a language built on XML Script - we call it XML Script 2 - could be developed that combines ease of use with XML technology.

In many respects XML Script 2 strives to be like a "normal" programming language, using the convention that elements in a script are analogous to function calls, and the attributes and content of that element provide parameters.

Listing 1 shows the ubiquitous "Hello, World!" program written in XML Script 2. Exactly where the string Hello, World! goes will depend on the processor. It's important to note that an XML Script processor could be a stand-alone command-line processor or a module as part of some distributed network-based system using DCOM, HTTP/CGI, or whatever. For the purposes of this article it's probably easiest to think of output being sent to stdout.

XML is also used for constructs that control program flow, such as if tests and for/foreach/while loops. The code shown in Listing 2 will print "Hello, World!" four times (the loop counter is inclusive).

While these examples demonstrate how a script is processed, what's particularly important is how an XML Script 2 processor stores its data. The processor maintains an internal representation of an XML document as a DOM-like tree or something similar, and this initially consists of just a single empty element. Other XML documents or even, potentially, fragments of documents can then be pasted into this dummy data document. The advantage of copying the content of all XML input files into the same internal document is that all the content can then be referred to using a tree navigation syntax such as XPath (XML Script 1 actually used its own syntax, but the precise details are not important here). So, given some files called foo.xml and bar.xml and a script similar to that shown in Listing 3, "Hello, World!" will be printed 20 times. The script reads the files foo.xml and bar.xml and pastes them into the internal data tree. The for command then runs a loop, first setting the counter equal to the value of the first attribute of the foo element and running until the counter is equal to the value of the last attribute of the bar element.
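The single-tree model can be approximated in Python with the standard xml.etree.ElementTree module. This is only a sketch of the idea, not XML Script 2 itself. The root name data, the inline stand-ins for foo.xml and bar.xml, the reading of "first" and "last" as attribute names, and the values 1 and 20 are all assumptions made for illustration, since the contents of the two files are not given here.

```python
import xml.etree.ElementTree as ET

# Assumed stand-ins for foo.xml and bar.xml; the real files are not shown.
FOO_XML = '<foo first="1"/>'
BAR_XML = '<bar last="20"/>'


def build_data_tree(*documents):
    """Paste each document under a single, initially empty root element."""
    root = ET.Element("data")  # hypothetical name for the dummy root
    for text in documents:
        root.append(ET.fromstring(text))
    return root


def collect_greetings(root):
    """Loop from foo/@first to bar/@last inclusive, one greeting per pass."""
    start = int(root.find("foo").get("first"))
    end = int(root.find("bar").get("last"))
    return ["Hello, World!" for _ in range(start, end + 1)]


tree = build_data_tree(FOO_XML, BAR_XML)
greetings = collect_greetings(tree)
print(len(greetings))
```

Because both documents hang off the same root, a single path syntax reaches either of them, which is the point of pasting everything into one internal document.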

This example illustrates not only how XML Script 2 uses a tree to store programming variables, but also how certain commands (such as "for") can refer to those variables. Note: These are not read-only variables as they are in XSLT. Thus the script in Listing 4 reads the contents of foo.xml into the internal data tree and then alters the value of all attributes named attr1 that are part of the foo element or its children, setting them all equal to the text newvalue. Commands such as set are said to act in creative mode.
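As a rough Python analogue of a set command applied to foo and everything beneath it, the sketch below walks the subtree and assigns the attribute. The sample document is an assumption. Note that Element.set also adds the attribute to elements that lack it.

```python
import xml.etree.ElementTree as ET

# Assumed sample content for foo.xml; the real file is not shown.
FOO_XML = '<foo attr1="old"><child attr1="old"/><child><leaf/></child></foo>'

root = ET.Element("data")  # hypothetical name for the dummy root
root.append(ET.fromstring(FOO_XML))

foo = root.find("foo")
for element in foo.iter():  # foo itself, then every descendant
    element.set("attr1", "newvalue")

print(ET.tostring(foo, encoding="unicode"))
```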

What's interesting is that creative mode commands will not only change XML content that's already present, but will also create new content where none previously existed. Thus the previous example will actually set an attr1 attribute on foo and all its descendant elements. This is analogous to the way Perl will automatically create variables whenever assignments are made. There obviously need to be some restrictions on what can be created, to answer questions like "What happens if you use the XPath '//' as the target of a creative mode command?". Some simple rules that can be applied include:

  • Creative mode commands will only create elements that are descendants of elements that already exist.
  • Creative mode commands cannot create named nodes (elements or attributes) unless a name is supplied.
Other simple restrictions can be devised to deal with XPaths that include predicates and the like.
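The two rules above can be sketched as a small Python helper. It is a deliberately limited illustration: the function name creative_set is hypothetical, and it accepts only slash-separated element names with an optional final @name step, which is far less than XPath.

```python
import re
import xml.etree.ElementTree as ET

NAME = re.compile(r"[A-Za-z_][\w.-]*")  # a plain, concrete node name


def creative_set(root, path, value):
    """Set text or an attribute at a simple path, creating missing nodes."""
    steps = path.split("/")
    attribute = None
    if steps[-1].startswith("@"):
        attribute = steps.pop()[1:]
    # Rule 2: nothing can be created unless every step supplies a name.
    names = steps if attribute is None else steps + [attribute]
    for name in names:
        if not NAME.fullmatch(name):
            raise ValueError(f"cannot create an unnamed node in {path!r}")
    node = root
    for name in steps:
        child = node.find(name)
        if child is None:
            # Rule 1: new elements only appear beneath existing ones.
            child = ET.SubElement(node, name)
        node = child
    if attribute is None:
        node.text = value
    else:
        node.set(attribute, value)
    return node


root = ET.Element("data")  # hypothetical name for the dummy root
ET.SubElement(root, "foo")
creative_set(root, "foo/bar/@attr1", "newvalue")  # creates bar, then attr1
```

Under these rules a target such as '//' is rejected outright, because its steps carry no names from which nodes could be created.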

The example in Listing 4 doesn't actually do anything with the newly created data on the data tree, though; some general mechanism is needed to refer to the data tree from within a script. One way would be to use a command such as expr, which indicates that its contents are to be interpreted as an expression. This technique is used in Listing 5. It reads in data from foo.xml, alters all attr1 attributes, then outputs the foo element.

While the expr command will be the solution to most of these problems, expr elements cannot be included everywhere - in particular they can't be included in attributes. This may not seem like a problem; for example, in Listings 2 and 3 the from and to attributes of the for command are automatically treated as expressions. However, it would be a mistake if this were always the case. The data command shown in Listings 3-5 shouldn't treat the value of its src attribute as an expression - instead the value should be interpreted as a literal URI. Sometimes, though, an XML Script 2 programmer will want to decide the value of this string at runtime. This is possible using interpolations, as shown in Listing 6. The script first reads the contents of list.xml onto the data tree, uses foreach to iterate over each of the file elements that have been placed there, then reads each of the listed files onto the data tree. It can do this because the src attribute of the data element contains the special hash characters. In XML Script, hashes are the start and end delimiters of interpolations - expressions that are evaluated at runtime to give text strings.
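Hash-delimited interpolation can be sketched in Python as a regular-expression substitution over the attribute value. Everything specific here is an assumption: the content of list.xml, the in-memory FILES table that stands in for real files, the expression "." meaning the current node, and the helper name interpolate.

```python
import re
import xml.etree.ElementTree as ET

# Assumed content of list.xml; the real file is not shown in the article.
LIST_XML = "<list><file>foo.xml</file><file>bar.xml</file></list>"
# In-memory stand-ins for the listed files, so the sketch needs no disk access.
FILES = {"foo.xml": "<foo/>", "bar.xml": "<bar/>"}

HASH_SPAN = re.compile(r"#([^#]+)#")


def interpolate(text, context):
    """Replace each #expr# span with the text of the node that expr selects."""
    def evaluate(match):
        node = context.find(match.group(1))
        return "" if node is None else (node.text or "")
    return HASH_SPAN.sub(evaluate, text)


root = ET.Element("data")  # hypothetical name for the dummy root
root.append(ET.fromstring(LIST_XML))

loaded = []
for file_element in root.findall("list/file"):
    src = interpolate("#.#", file_element)  # hypothetical src attribute value
    root.append(ET.fromstring(FILES[src]))
    loaded.append(src)

print(loaded)
```

Text outside the hashes is left alone, so a value can mix a literal prefix with an interpolated part.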

One important aspect of XML Script 2, included in both previous examples, is what happens when commands are nested within one another. It can be seen that elements that are children of output elements are themselves executed as commands before any output is produced. It's the result of processing all the children of output elements that's actually sent to the output of the processor. All commands have a defined result - in the case of the expr command the result is a list of nodes, whereas in the case of the data command the result is empty. On the other hand, the content of for or foreach elements isn't processed in quite this way, as a new result is calculated for each iteration of the loop. Exactly when or if the contents of a command are executed depends entirely on the command - this may seem inexact, but in fact it leads to a much more flexible and intuitive language.
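The idea that every command has a result, and that output emits the combined result of its children, can be sketched as a tiny recursive evaluator. The script markup, the FILES table, and the function name run are illustrative assumptions; only the command names data, expr, and output and the src attribute come from the description above.

```python
import xml.etree.ElementTree as ET

# In-memory stand-in for foo.xml (assumed content).
FILES = {"foo.xml": "<foo>Hello, World!</foo>"}


def run(command, tree):
    """Execute one command element and return its result as a list of strings."""
    if command.tag == "data":
        # Pastes content onto the data tree; the command's own result is empty.
        tree.append(ET.fromstring(FILES[command.get("src")]))
        return []
    if command.tag == "expr":
        # The result is the list of selected nodes, shown here by their text.
        return [node.text or "" for node in tree.findall(command.text.strip())]
    if command.tag == "output":
        # Children run first; only their combined result is emitted.
        parts = [command.text or ""]
        for child in command:
            parts.extend(run(child, tree))
            parts.append(child.tail or "")
        return ["".join(parts)]
    raise ValueError(f"unknown command: {command.tag}")


# Hypothetical script markup, not the article's own listing.
SCRIPT = '<output>Greeting: <data src="foo.xml"/><expr>foo</expr></output>'

data_tree = ET.Element("data")  # hypothetical name for the dummy root
result = run(ET.fromstring(SCRIPT), data_tree)
print(result[0])
```

A loop command would differ only in that it calls run on its children once per iteration instead of once overall.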

While all of this is enough to make XML Script 2 useful as a tool for small scripts, for even medium-size projects it's essential to allow code to be modularized. In traditional programming languages this is accomplished by allowing programmers to write their own functions. XML Script 2 essentially allows the same thing, albeit in a slightly different way. All elements that are part of the script are treated as potential commands - the name (and namespace) of the element is looked up in an internal table called the method table to determine what to do when encountering that command. It's entirely possible to create new entries in the method table using the method command. This associates a particular element name with a particular block of XML Script code, which must have previously been created with a template command. The example shown in Listing 7 should print "Hello, World!" twice - once for each greeting element in the www.fred.com/ namespace. Note: It's also possible to use the process command to apply methods to XML content that's on the data tree (see Listing 8).
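A method table of this kind can be sketched in Python as a dictionary keyed by namespace-qualified element name. The two namespaces are the ones named above. The script wrapper element, the attribute names name, ns, and template, and the Processor class are hypothetical, chosen only to make the sketch self-contained.

```python
import xml.etree.ElementTree as ET

CORE_NS = "www.xmlscript.org/2.0/"  # core namespace named in the article


def qname(namespace, local):
    """Build the '{namespace}local' form that ElementTree uses for tags."""
    return "{%s}%s" % (namespace, local)


class Processor:
    def __init__(self):
        self.output = []     # stands in for stdout
        self.templates = {}  # template name -> list of command elements
        self.method_table = {
            qname(CORE_NS, "script"): self.run_children,  # hypothetical wrapper
            qname(CORE_NS, "output"): self.do_output,
            qname(CORE_NS, "template"): self.do_template,
            qname(CORE_NS, "method"): self.do_method,
        }

    def execute(self, element):
        """Look the element up in the method table and run its handler."""
        handler = self.method_table.get(element.tag)
        if handler is None:
            raise LookupError("no method for " + element.tag)
        handler(element)

    def run_children(self, element):
        for child in element:
            self.execute(child)

    def do_output(self, element):
        self.output.append(element.text or "")

    def do_template(self, element):
        # Store the block of commands without running it.
        self.templates[element.get("name")] = list(element)

    def do_method(self, element):
        body = self.templates[element.get("template")]

        def call(_element):
            for command in body:
                self.execute(command)

        key = qname(element.get("ns"), element.get("name"))
        self.method_table[key] = call


# Hypothetical script markup, not the article's own listing.
SCRIPT = """
<script xmlns="www.xmlscript.org/2.0/" xmlns:f="www.fred.com/">
  <template name="greet"><output>Hello, World!</output></template>
  <method ns="www.fred.com/" name="greeting" template="greet"/>
  <f:greeting/>
  <f:greeting/>
</script>
"""

processor = Processor()
processor.execute(ET.fromstring(SCRIPT))
print("\n".join(processor.output))
```

Keying the table on the qualified name is what keeps user-defined commands in their own namespace from colliding with the core ones.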

An alternative to adding user-defined methods from within a script is to add customized built-in methods to a particular XML Script processor. This is possible as long as the custom methods aren't in the www.xmlscript.org/2.0/ namespace, allowing people to use the commands they want from the XML Script 2 standard in combination with their own purpose-written code. As the basic framework of XML Script 2 is simple and straightforward, adding further methods should be easy.

The overall intention is that XML Script 2 should be both easy to use and very flexible, which is of vital importance in a marketplace that's becoming swamped with competing standards. The syntax presented in this article is still under development and probably not perfect, but we believe that ease of use is an important consideration in an XML world dominated by complex specifications.

Note: A family of XML Script 2 processors is under development by DecisionSoft Ltd. While DecisionSoft primarily creates server software for commercial use, the XML Script 2 specification isn't intended to be proprietary, and namespacing techniques can be used to separate extensions from core commands. At the time of writing, the specification is still under development, and any suggestions or comments should be addressed to xmlscript-editor@decisionsoft.com.
