Document XSLT Automatically
Business users spend a great deal of money on new software systems. For this they demand faithful implementation of their project objectives. And they expect enough visibility into an application to verify that their goals have been implemented. This visibility also ensures that changes can be identified to satisfy new business goals.
One approach to meeting these objectives is the use of a formal specification language. The intent is that the increased formality in the specification will lead to an implementation closer to the goals of the business. Formal specification approaches include algebraic languages such as Z and diagrammatic languages such as UML. While many such languages have been developed, few can serve as a bridge between business users and software developers. This is due primarily to the large gap between domain concepts and software design tools. In addition, unless implementations can be created automatically from the specification languages, maintenance of the implementation often diverges quickly from its initial design.
Consequently, most business users rely on documentation to explain the inner workings of a system. This addresses the critical need to gain visibility into the system to ensure that their goals have been met and the system can be changed easily to adjust to new business concerns. The value of documentation depends on:
This article discusses a method of automatically documenting, in domain-specific terms, the behavior of conditional text processing applications. The use of such terms, as well as actual text in its domain-specific format, yields a small gap that can be readily bridged by business users. The documentation presents a form that can be marked up by business users with minimal ambiguity. Automatic generation of the documentation ensures that it remains faithful to each build of the application. Conditional text processing is a very large horizontal application area with potential impact on much of literate society. It affects areas as diverse as traditional print document production, Web page generation, document personalization, targeted advertising, and access-controlled documents. Customized text processing is bound to increase rapidly with the trend toward information delivery that is increasingly personalized, access controlled, and market-segment specific.
Representing Text
In the domain of technical documentation, DocBook is a well-accepted XML application that can be used as an intermediate form for generating text in a variety of formats including plaintext, XHTML, RTF, TeX, PDF, and PostScript. DocBook is directed at the production of articles or books. Common tags include <book>, <chapter>, and <para>. Listing 1 presents the skeleton of a document in DocBook format. The document is intended to represent a fragment from a financial planning document.
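As a rough illustration of the nesting described above, the following Python sketch parses a small DocBook-style skeleton and prints its element outline. It is not the article's Listing 1: only the element names <book>, <chapter>, and <para> come from the text, while the titles, the paragraph wording, and the helper name outline are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook skeleton. The element names <book>, <chapter>, and
# <para> come from the article; the text content is invented for illustration.
DOCBOOK_SKELETON = """\
<book>
  <title>Financial Plan</title>
  <chapter>
    <title>Summary</title>
    <para>This plan was prepared for the customer.</para>
  </chapter>
</book>
"""


def outline(xml_text):
    """Return (depth, tag) pairs for every element, in document order."""
    root = ET.fromstring(xml_text)
    result = []

    def walk(node, depth):
        result.append((depth, node.tag))
        for child in node:
            walk(child, depth + 1)

    walk(root, 0)
    return result


if __name__ == "__main__":
    # Print one tag per line, indented by nesting depth.
    for depth, tag in outline(DOCBOOK_SKELETON):
        print("  " * depth + tag)
```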
XSLT for Conditional Text
Listing 2 presents an XML data document that includes data about an individual customer. While we won't present the details of the XML DTD (or XML Schema) defining the document structure, it should be evident that it represents properties of a single individual such as age and estimated financial net worth. It's assumed that some other calculation process has determined this information based on data about the individual's financial status.
Listing 3 presents XSLT markup that's been added to Listing 1. The markup includes both a simple data substitution and a simple conditional statement. It uses XPath, another W3C standard and part of XSL, to refer to the data in the XML data document. The XPath reference "customer/name" refers to the customer's name in the XML document given in Listing 2. Listing 4 illustrates the resulting DocBook document produced by applying the XSLT transformation of Listing 3 to the XML data in Listing 2. We've used Michael Kay's Saxon processor to generate this example and have rendered this DocBook document into XHTML (see Figure 1) to show how it can be presented to the business user.
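The effect of a data substitution and a conditional can be sketched in a few lines of Python. This is a toy interpreter for two XSLT elements (xsl:value-of and xsl:if), not a replacement for a real processor such as Saxon and not the article's Listings 2 to 4. The customer values, the net worth threshold, the paragraph wording, and the function names are all assumptions; only the XPath references customer/name and customer/netWorth come from the text. Conditions are limited to the form "path operator number".

```python
import operator
import re
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

# Hypothetical customer data; the values are invented.
CUSTOMER_XML = """\
<customer>
  <name>Pat Smith</name>
  <age>52</age>
  <netWorth>1250000</netWorth>
</customer>
"""

# Hypothetical DocBook paragraph with XSLT markup; the threshold is invented.
TEMPLATE_XML = """\
<para xmlns:xsl="http://www.w3.org/1999/XSL/Transform">Prepared for <xsl:value-of select="customer/name"/>. <xsl:if test="customer/netWorth &gt; 1000000">Estate planning is recommended.</xsl:if></para>
"""

OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def append_text(parent, text):
    """Append text at the end of parent's mixed content."""
    if not text:
        return
    if len(parent):
        parent[-1].tail = (parent[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def condition_holds(doc, expr):
    """Evaluate a test of the form 'path OP number' (a deliberate simplification)."""
    m = re.fullmatch(r"\s*([\w/]+)\s*(>|<|=)\s*(\d+(?:\.\d+)?)\s*", expr)
    if m is None:
        raise ValueError(f"unsupported test expression: {expr!r}")
    path, op, number = m.groups()
    return OPS[op](float(doc.findtext(path, default="nan")), float(number))


def copy_children(src, dst, doc):
    """Copy src's content into dst, expanding xsl:value-of and xsl:if."""
    append_text(dst, src.text)
    for child in src:
        if child.tag == XSL + "value-of":
            append_text(dst, doc.findtext(child.get("select"), default=""))
        elif child.tag == XSL + "if":
            if condition_holds(doc, child.get("test")):
                copy_children(child, dst, doc)
        else:
            copy_children(child, ET.SubElement(dst, child.tag, child.attrib), doc)
        append_text(dst, child.tail)


def apply_template(template_xml, customer_xml):
    """Return the DocBook text produced for one customer."""
    doc = ET.Element("document")  # stands in for the XPath root node
    doc.append(ET.fromstring(customer_xml))
    template = ET.fromstring(template_xml)
    out = ET.Element(template.tag, template.attrib)
    copy_children(template, out, doc)
    return ET.tostring(out, encoding="unicode")


if __name__ == "__main__":
    print(apply_template(TEMPLATE_XML, CUSTOMER_XML))
```

With the sample data the conditional sentence is included; lowering the net worth below the threshold drops it and leaves only the substituted name.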
Documenting XSLT Processing
Listing 5 presents the XSLT program that maps the XSLT program of Listing 3 into a target DocBook document. As this is the most critical step in the process, we elaborate on this example; line numbers have been added to the left-hand side of the listing for reference. Line 1 simply identifies the file as an XML document in a Latin-1 character encoding. Line 2 declares an XSL transformation using the standard XSL namespace. The next two lines specify the public and system identifiers that the output file should include to identify the document as a DocBook file. Line 6 indicates that excess white space is to be stripped out of all elements.
The remainder of Listing 5 consists of five templates (i.e., specific transformations). The template at Line 7 matches those XSL elements whose content should be processed further but without any special consideration due to the containing XSL element. The Line 10 template ignores all markup inside the xsl:output elements. The template at Line 11 performs the first actual metamarkup of the output document by outputting (source-data-name) for each fragment of text to be pulled from <customer> input data. Similarly, at Line 18, metamarkup of the form (IF condition text) is generated to express the condition for which text should be included in the output document. The final template at Line 26 is a common XSLT default processing rule that simply copies unmatched markup to the output file.
Listing 6 presents a mapping from XSLT variable names to domain-specific names used by business users. This step is syntactic sugar for increasing readability beyond that provided by clearly named XML elements in the input data file. Listing 5 Line 32 includes routine XSLT code that performs this mapping. In production, this mapping was performed in a postprocessing phase via a Perl script. Listing 7 presents the DocBook markup produced by applying Listing 5 to Listings 3 and 6.
In practice, we've termed this a specification because it precisely specifies the operation of the XSLT program in producing the target documents. Figure 2 illustrates the XHTML presentation of the DocBook document presented in Listing 7. Two extra files (autodocxslt.xsl and mapnames.xsl) that support the build process are too insignificant to warrant including in the listings. However, they can be downloaded from www.sys-con.com/xml/sourcec.cfm so interested readers can build the examples. (A README file that explains all the files is included.)
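The core idea of the documenting step, replacing each data substitution with (source-data-name) and each conditional with (IF condition text), can be sketched in Python. This is a stand-in for illustration, not the article's Listing 5, which is an XSLT program. The template paragraph, the NAMES table (playing the role of Listing 6), and the function names are assumptions, and the name mapping is applied inline here rather than in a separate postprocessing phase.

```python
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

# Hypothetical mapping from XPath references to business-facing names.
NAMES = {
    "customer/name": "Customer Name",
    "customer/netWorth": "Net Worth",
}

# Hypothetical DocBook paragraph with XSLT markup; the threshold is invented.
TEMPLATE_XML = """\
<para xmlns:xsl="http://www.w3.org/1999/XSL/Transform">Prepared for <xsl:value-of select="customer/name"/>. <xsl:if test="customer/netWorth &gt; 1000000">Estate planning is recommended.</xsl:if></para>
"""


def describe(expr):
    """Replace known XPath references with descriptive names, longest first."""
    for path in sorted(NAMES, key=len, reverse=True):
        expr = expr.replace(path, NAMES[path])
    return expr


def append_text(parent, text):
    """Append text at the end of parent's mixed content."""
    if not text:
        return
    if len(parent):
        parent[-1].tail = (parent[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def document_children(src, dst):
    """Copy src's content into dst, turning XSLT elements into metamarkup."""
    append_text(dst, src.text)
    for child in src:
        if child.tag == XSL + "value-of":
            # Data substitution becomes (source-data-name).
            append_text(dst, "(" + describe(child.get("select")) + ")")
        elif child.tag == XSL + "if":
            # Conditional becomes (IF condition text).
            append_text(dst, "(IF " + describe(child.get("test")) + " ")
            document_children(child, dst)
            append_text(dst, ")")
        else:
            # Default rule: copy unmatched markup through unchanged.
            document_children(child, ET.SubElement(dst, child.tag, child.attrib))
        append_text(dst, child.tail)


def document_template(template_xml):
    """Return a DocBook element describing what the template would produce."""
    template = ET.fromstring(template_xml)
    out = ET.Element(template.tag, template.attrib)
    document_children(template, out)
    return out


if __name__ == "__main__":
    print("".join(document_template(TEMPLATE_XML).itertext()))
```

Because the sketch handles only two XSLT elements, any other construct would be copied through as ordinary markup, which mirrors the need to account for every construct actually used.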
Discussion
However, a more general treatment of the subject is difficult because the method would need to account for the appearance of any XSLT element in any textual context. In practice, the XSLT documentation program has been developed by accounting for every XSLT construct used along with every DocBook context it appears in. To check that we've accounted for all possibilities, each specification is validated with James Clark's nsgmls validator to confirm that the markup conforms to the DocBook DTD. Because of this constraint, it isn't sufficient to have a properly working XSLT program; it must be translatable into valid DocBook as well. If DocBook directly supported metamarkup constructs, or if we targeted a different output format that provided such direct support, the challenge of choosing output representations for metalevel markup while simultaneously maintaining validity could have been avoided.
Another aspect of this method that requires attention is the translation of expressions used in conditionals, loops, and other XSL statements. The method assumes that all expressions can be transformed by a simple replacement of XPath references into short, descriptive English names. If the expressions are more complex, additional processing may be needed on the expression to render it into readable form. For example, if a call is made to the XSL "format-number" function to depict a number as a dollar amount, then a U.S. business user would prefer to see the expression rendered with a leading dollar sign:
format-number(customer/netWorth, '#,###,###.00')
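A minimal Python sketch of this kind of expression rendering follows. The NAMES table, the rule that a format picture ending in ".00" denotes a dollar amount, and the output form $(Net Worth) are all assumptions chosen for illustration; the only details taken from the text are the format-number call and the idea of replacing XPath references with descriptive names.

```python
import re

# Hypothetical mapping from XPath references to business-facing names.
NAMES = {
    "customer/netWorth": "Net Worth",
    "customer/age": "Age",
}

# Matches format-number(path, 'picture') with a single-quoted picture string.
FORMAT_NUMBER = re.compile(r"format-number\(\s*([\w/]+)\s*,\s*'([^']*)'\s*\)")


def render_call(match):
    """Render a format-number call; treat a '.00' picture as a dollar amount."""
    path, picture = match.groups()
    if picture.endswith(".00"):  # assumed heuristic for currency pictures
        return "$(" + NAMES.get(path, path) + ")"
    return match.group(0)  # leave other calls for plain name replacement


def describe(expr):
    """Turn an XPath expression into a form a business user can read."""
    expr = FORMAT_NUMBER.sub(render_call, expr)
    # Simple replacement of remaining XPath references, longest first.
    for path in sorted(NAMES, key=len, reverse=True):
        expr = expr.replace(path, NAMES[path])
    return expr


if __name__ == "__main__":
    print(describe("format-number(customer/netWorth, '#,###,###.00')"))
    print(describe("customer/age > 65"))
```

Handling the function call before the plain name replacement keeps the two cases separate: simple expressions need only the lookup table, while each more complex construct gets its own rendering rule.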
Conclusions
For More Information
This work was supported by ExpLore Reasoning Systems, Inc., a firm specializing in intelligent systems for financial services applications.