By Deepak Vohra, Ajay Vohra
March 18, 2005
The Extensible Stylesheet Language Transformations (XSLT) specification defines how to transform an XML document into another XML document or into a non-XML format such as HTML or plain text. A transformation is described by a stylesheet, whose elements are in the XSLT namespace http://www.w3.org/1999/XSL/Transform, and is carried out by an XSLT processor. Commonly used XSLT processors include Xalan-Java, the Oracle XSLT Processor for Java and the JAXP XSLT transformer.
In the result document, the elements and attributes of the source document can be modified, reordered or filtered.
This tutorial discusses some commonly required transformations. A Java application that applies an XSLT stylesheet to an XML document is a prerequisite. All the transformations use the example XML document catalog.xml, shown in Listing 1.
Listing 1 catalog.xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
  <journal title="Java Technology" publisher="IBM developerWorks">
    <article level="Intermediate" date="January-2004" section="Java Technology">
      <title>Service Oriented Architecture Frameworks</title>
      <author>Naveen Balani</author>
    </article>
    <article level="Advanced" date="October-2003" section="Java Technology">
      <title>Advance DAO Programming</title>
      <author>Sean Sullivan</author>
    </article>
    <article level="Advanced" date="May-2002" section="Java Technology">
      <title>Best Practices in EJB Exception Handling</title>
      <author>Srikanth Shenoy</author>
    </article>
    <article level="Advanced" date="May-2002" section="Java Technology">
      <title>Best Practices in EJB Exception Handling</title>
      <author>Srikanth Shenoy</author>
    </article>
  </journal>
</catalog>
The tutorial is structured into the following sections:
- Identity Transformation
- Removing Duplicates
- Sorting Elements
- Conversion to HTML
- Merging Documents
- Obtaining Element/Attribute Values with XPath
- Filtering Elements
- Copying Nodes
- Creating Elements and Attributes
- Outputting XSLT
Begin by setting up the Java application that runs the transformations. Import the javax.xml.transform package.
import javax.xml.transform.*;
Import the javax.xml.parsers package.
import javax.xml.parsers.*;
Import the java.io, org.xml.sax and org.w3c.dom classes.
import java.io.File;
import java.io.IOException;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.w3c.dom.Document;
import org.w3c.dom.DOMException;
Import the DOMSource, StreamSource and StreamResult classes.
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.transform.stream.StreamResult;
Create a DocumentBuilderFactory object.
DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
Copy the example XML document catalog.xml to the c:/input directory. Create a DocumentBuilder object and parse the XML document to be transformed.
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(
    new File("c:/input/catalog.xml"));
Create a TransformerFactory object.
TransformerFactory tFactory =
TransformerFactory.newInstance();
Copy the XSLT stylesheet to be used for transformation to the c:/input directory. Create a Transformer object from the XSLT to be used for transformation.
StreamSource stylesource = new StreamSource(
    new File("c:/input/stylesheet.xslt"));
Transformer transformer =
tFactory.newTransformer(stylesource);
Create a DOMSource object for the example XML Document object.
DOMSource source = new DOMSource(document);
Create a StreamResult object for the XSLT transformation output.
StreamResult result = new StreamResult(System.out);
Transform the example XML document with an XSLT.
transformer.transform(source, result);
The output from the XSLT transformation is written to System.out.
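Putting the steps above together, a minimal sketch of the complete application might look like the following. To stay self-contained it reads the XML and the stylesheet from in-memory strings rather than from the c:/input files. The class name XsltRunner, the transform helper, the shortened sample data and the call to setNamespaceAware(true) are assumptions made for illustration and are not part of the original steps.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XsltRunner {

    // Shortened stand-in for catalog.xml (assumption: one article is enough here).
    public static final String SAMPLE_XML =
        "<catalog><journal title='Java Technology'>"
        + "<article level='Advanced'><title>Advance DAO Programming</title></article>"
        + "</journal></catalog>";

    // Hypothetical stylesheet that prints the first article title as plain text.
    public static final String SAMPLE_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:value-of select='catalog/journal/article/title'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Parses the XML into a DOM, applies the stylesheet and returns the result. */
    public static String transform(String xml, String xslt) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Assumption: a namespace-aware DOM is the safer input for an XSLT processor.
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));

            TransformerFactory tFactory = TransformerFactory.newInstance();
            Transformer transformer =
                tFactory.newTransformer(new StreamSource(new StringReader(xslt)));

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

To work with files as in the steps above, the two StringReader-based sources could be replaced with new File(...) arguments.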
Identity Transformation
The Identity transformation in XSLT copies the input XML document to the output document.
The structure and values of the elements and attributes in the XML document aren't modified.
An example XSLT for identity transformation is shown in Listing 2.
Listing 2 Identity Transformation Stylesheet
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" indent="yes"/>
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
The XPath expression '@* | node()' selects all the attribute nodes and all the child nodes (elements, text, comments and processing instructions) of the current node.
The identity transformation could be applied to modify the encoding, DOCTYPE or indentation.
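As a rough check of the identity stylesheet, the sketch below applies it from Java and returns the copied document. The class name IdentityDemo, the copy helper and the one-article sample document are assumptions for illustration; the stylesheet is the one from Listing 2 with omit-xml-declaration="yes" added to keep the printed result short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class IdentityDemo {

    // Identity stylesheet: copy every attribute and node, then recurse.
    public static final String IDENTITY_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='@* | node()'>"
        + "<xsl:copy><xsl:apply-templates select='@* | node()'/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Shortened stand-in for catalog.xml.
    public static final String SAMPLE_XML =
        "<catalog><journal title='Java Technology'>"
        + "<article level='Advanced'><title>Advance DAO Programming</title></article>"
        + "</journal></catalog>";

    /** Runs the identity stylesheet over the given XML text. */
    public static String copy(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(IDENTITY_XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Identity transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Elements, attributes and text should all survive the copy.
        System.out.println(copy(SAMPLE_XML));
    }
}
```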
Removing Duplicates
An XML document may have duplicate elements. The example XML document has duplicate "article" elements. The following XSLT outputs non-duplicate "article" titles.
Listing 3 Removing Duplicates
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" omit-xml-declaration="yes"/>
  <xsl:template match="/">
    <xsl:variable name="unique-list"
        select="//title[not(.=following::title)]"/>
    <xsl:for-each select="$unique-list">
      <xsl:copy>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
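The XPath expression '//title[not(.=following::title)]' selects each 'title' element that is not followed by another 'title' element with the same value, so only the last occurrence of a duplicated title is kept.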
The output from the XSLT is the non-duplicate article titles as in Listing 4.
Listing 4 Title Elements with Duplicates Removed
<title>Service Oriented Architecture Frameworks</title> <title>Advance DAO Programming</title> <title>Best Practices in EJB Exception Handling</title>
In subsequent sections, the XML document with the duplicate elements removed is used.
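The duplicate-removal expression can also be tried outside a stylesheet. The sketch below evaluates it with the standard javax.xml.xpath API against a small document that repeats one title. The class name UniqueTitles, the helper method and the trimmed sample data are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class UniqueTitles {

    // Stand-in for catalog.xml: the last title appears twice.
    public static final String SAMPLE_XML =
        "<catalog><journal>"
        + "<article><title>Service Oriented Architecture Frameworks</title></article>"
        + "<article><title>Advance DAO Programming</title></article>"
        + "<article><title>Best Practices in EJB Exception Handling</title></article>"
        + "<article><title>Best Practices in EJB Exception Handling</title></article>"
        + "</journal></catalog>";

    /** Returns the text of every title that has no identical title after it. */
    public static List<String> uniqueTitles(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "//title[not(.=following::title)]",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent());
            }
            return titles;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Four title elements go in, three distinct values come out.
        uniqueTitles(SAMPLE_XML).forEach(System.out::println);
    }
}
```

Because the predicate looks forward along the following axis, it compares every title with all later titles, which may become slow on large documents.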
Sorting Elements
The xsl:sort element sorts a group of elements. Its order attribute specifies the sorting order, ascending or descending, and its data-type attribute (text or number) specifies how the sort keys are compared. For instance, the XSLT in Listing 5 sorts the "article" elements of the example XML document by their "title" elements.
Listing 5 Sorting
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <xsl:template match="/catalog/journal">
    <xsl:apply-templates>
      <xsl:sort select="title" order="ascending"/>
    </xsl:apply-templates>
  </xsl:template>
  <xsl:template match="article">
    Title: <xsl:apply-templates select="title"/>
  </xsl:template>
</xsl:stylesheet>
The output from the XSLT is a list of article titles sorted in ascending order. This stylesheet is also an example of an XML-to-text transformation.
Listing 6 Sorted List
Title: Advance DAO Programming
Title: Best Practices in EJB Exception Handling
Title: Service Oriented Architecture Frameworks
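To see the effect of the order attribute, the following sketch runs a small sorting stylesheet twice, once ascending and once descending. It uses xsl:for-each with xsl:sort and text output instead of the templates in Listing 5, purely to keep the example short. The class name SortDemo, the helper methods and the "; " separator are assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SortDemo {

    // Stand-in for catalog.xml with the duplicate article removed.
    public static final String SAMPLE_XML =
        "<catalog><journal>"
        + "<article><title>Service Oriented Architecture Frameworks</title></article>"
        + "<article><title>Advance DAO Programming</title></article>"
        + "<article><title>Best Practices in EJB Exception Handling</title></article>"
        + "</journal></catalog>";

    // Builds a stylesheet that lists titles sorted in the requested order.
    private static String stylesheet(String order) {
        return "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/catalog/journal'>"
            + "<xsl:for-each select='article'>"
            + "<xsl:sort select='title' data-type='text' order='" + order + "'/>"
            + "<xsl:value-of select='title'/><xsl:text>; </xsl:text>"
            + "</xsl:for-each>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
    }

    /** Returns the titles joined by "; ", sorted "ascending" or "descending". */
    public static String titles(String order) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet(order))));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(SAMPLE_XML)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Sorting transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titles("ascending"));
        System.out.println(titles("descending"));
    }
}
```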
Conversion to HTML
The data in an XML document may have to be presented as an HTML document.
The following XSLT converts the example XML document to an HTML document.
Listing 7 Conversion to HTML
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/catalog/journal">
    <html>
      <head>
        <title>Catalog</title>
      </head>
      <body>
        <table border="1" cellspacing="0">
          <tr>
            <th>Level</th>
            <th>Date</th>
            <th>Section</th>
            <th>Title</th>
            <th>Author</th>
          </tr>
          <xsl:for-each select="article">
            <tr>
              <td><xsl:value-of select="@level"/></td>
              <td><xsl:value-of select="@date"/></td>
              <td><xsl:value-of select="@section"/></td>
              <td><xsl:value-of select="title"/></td>
              <td><xsl:value-of select="author"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
The output of an XSLT is set to HTML with the method="html" attribute of the xsl:output element. The output from the XSLT is an HTML document as illustrated in Figure 1.
With an XSLT transformer that supports XHTML output, the output from an XSLT transformation can be set to XHTML instead of HTML by setting the xsl:output element attribute method="xml". For XHTML output, set the default namespace declaration for the XHTML namespace in the xsl:stylesheet element. The default namespace declaration is set as:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">
For output to an XHTML document set the doctype-public and doctype-system attributes of the xsl:output element as illustrated:
<xsl:output method="xml"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
    doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"/>
Merging Documents
The document() function is used in a stylesheet to refer to an XML document other than the one being transformed.
With the document() function XML documents can be combined as illustrated in Listing 8.
Listing 8 Merging XML Documents
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <xsl:template match="/">
    <catalogs>
      <xsl:copy-of select="*"/>
      <xsl:copy-of select="document('catalog2.xml')"/>
    </catalogs>
  </xsl:template>
</xsl:stylesheet>
The XSLT combines the example XML document catalog.xml and another XML document catalog2.xml listed in Listing 9.
Listing 9 catalog2.xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog
    xmlns="http://www.w3.org/2001/XMLSchema-Instance">
  <journal title="Java Technology"
      publisher="IBM developerWorks">
    <article level="Intermediate" date="February-2003">
      <title>Design XML Schemas Using UML</title>
      <author>Ayesha Malik</author>
    </article>
  </journal>
</catalog>
The output from the XSLT is a combined XML document as shown in Listing 10.
Listing 10 Combined XML Document
<?xml version="1.0" encoding="UTF-8"?>
<catalogs>
  <catalog>
    <journal title="Java Technology" publisher="IBM developerWorks">
      <article level="Intermediate" date="January-2004" section="Java Technology">
        <title>Service Oriented Architecture Frameworks</title>
        <author>Naveen Balani</author>
      </article>
      <article level="Advanced" date="October-2003" section="Java Technology">
        <title>Advance DAO Programming</title>
        <author>Sean Sullivan</author>
      </article>
      <article level="Advanced" date="May-2002" section="Java Technology">
        <title>Best Practices in EJB Exception Handling</title>
        <author>Srikanth Shenoy</author>
      </article>
    </journal>
  </catalog>
  <catalog xmlns="http://www.w3.org/2001/XMLSchema-Instance">
    <journal title="Java Technology" publisher="IBM developerWorks">
      <article level="Intermediate" date="February-2003">
        <title>Design XML Schemas Using UML</title>
        <author>Ayesha Malik</author>
      </article>
    </journal>
  </catalog>
</catalogs>
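When the stylesheet is run from Java, the document referenced by document() does not have to be a file next to the stylesheet. The sketch below registers a javax.xml.transform.URIResolver on the Transformer so that the name catalog2.xml is served from an in-memory string. The class name MergeDemo, the trimmed sample documents and the choice to match the name with endsWith are assumptions for illustration, and how the href is passed to the resolver may vary between XSLT processors.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class MergeDemo {

    // Trimmed stand-ins for catalog.xml and catalog2.xml.
    public static final String CATALOG1 =
        "<catalog><journal><article>"
        + "<title>Advance DAO Programming</title>"
        + "</article></journal></catalog>";
    public static final String CATALOG2 =
        "<catalog><journal><article>"
        + "<title>Design XML Schemas Using UML</title>"
        + "</article></journal></catalog>";

    // Same structure as the merging stylesheet in Listing 8.
    public static final String MERGE_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'>"
        + "<catalogs>"
        + "<xsl:copy-of select='*'/>"
        + "<xsl:copy-of select=\"document('catalog2.xml')\"/>"
        + "</catalogs>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Merges CATALOG1 with CATALOG2, resolving document('catalog2.xml') in memory. */
    public static String merge() {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(MERGE_XSLT)));
            // Serve catalog2.xml from memory; returning null leaves other
            // references to the processor's default resolution.
            transformer.setURIResolver((href, base) ->
                href.endsWith("catalog2.xml")
                    ? new StreamSource(new StringReader(CATALOG2))
                    : null);
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(CATALOG1)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Merge failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(merge());
    }
}
```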
Obtaining Element/Attribute Values with XPath
XSLT supports XPath, which is used to select elements and attributes. For example, select the value of the "date" attribute for the article with the title Advance DAO Programming, and select the value of the "title" element for the article by the author Srikanth Shenoy. The XSLT illustrated in Listing 11 outputs the value of the "date" attribute and the "title" element.
Listing 11 XPath Expressions
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <xsl:template match="/catalog/journal">
    Date: <xsl:value-of select="article[title='Advance DAO Programming']/@date"/>
    Title: <xsl:value-of select="article[author='Srikanth Shenoy']/title"/>
  </xsl:template>
</xsl:stylesheet>
The XPath expression 'article[title='Advance DAO Programming']/@date' selects the "date" attribute. The XPath expression 'article[author='Srikanth Shenoy']/title' selects the "title" element.
The output from the XSLT is shown in Listing 12.
Listing 12 XPath Expressions Output
Date: October-2003
Title: Best Practices in EJB Exception Handling
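XPath expressions like these can be tested on their own with the javax.xml.xpath API before they go into a stylesheet. The sketch below evaluates both expressions as strings. It also shows one pitfall: an exact comparison such as author='Srikanth Shenoy' does not match when the element text carries stray whitespace, and normalize-space() is one way around that. The class name XPathValues, the helper method and the sample data (which deliberately puts a trailing space in one author element) are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathValues {

    // Stand-in for catalog.xml; note the trailing space after "Shenoy".
    public static final String SAMPLE_XML =
        "<catalog><journal>"
        + "<article date='October-2003'>"
        + "<title>Advance DAO Programming</title><author>Sean Sullivan</author>"
        + "</article>"
        + "<article date='May-2002'>"
        + "<title>Best Practices in EJB Exception Handling</title>"
        + "<author>Srikanth Shenoy </author>"
        + "</article>"
        + "</journal></catalog>";

    /** Evaluates the expression against SAMPLE_XML and returns its string value. */
    public static String value(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(SAMPLE_XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Attribute value selected through a predicate on a child element.
        System.out.println(value(
            "/catalog/journal/article[title='Advance DAO Programming']/@date"));
        // Exact match fails because of the trailing space: prints an empty line.
        System.out.println(value(
            "/catalog/journal/article[author='Srikanth Shenoy']/title"));
        // normalize-space() trims the text before comparing.
        System.out.println(value(
            "/catalog/journal/article[normalize-space(author)='Srikanth Shenoy']/title"));
    }
}
```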
Filtering Elements
The elements in an XML document can be filtered with the select attribute of the xsl:apply-templates element. For example, select the elements with the "level" attribute specified as "Intermediate." The XSLT in Listing 13 selects the "article" elements that have the "level" attribute value "Intermediate."
Listing 13 Selecting Elements
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <xsl:template match="/catalog/journal">
    <xsl:apply-templates select="article[@level='Intermediate']"/>
  </xsl:template>
  <xsl:template match="article">
    Title: <xsl:value-of select="title"/>
    Author: <xsl:value-of select="author"/>
  </xsl:template>
</xsl:stylesheet>
The XPath expression 'article[@level='Intermediate']' selects the "article" elements with "level" attributes set to "Intermediate." The XSLT output contains only the "article" element with "level" value as "Intermediate."
Listing 14 Selected Element
Title: Service Oriented Architecture Frameworks
Author: Naveen Balani
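The filter value does not have to be hard-coded in the stylesheet. As a variation that is not part of Listing 13, the sketch below declares a top-level xsl:param and supplies its value from Java with Transformer.setParameter, so the same stylesheet can filter on any level. The class name FilterDemo, the parameter name level, the helper method and the trimmed sample data are assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class FilterDemo {

    // Stand-in for catalog.xml with one article per level.
    public static final String SAMPLE_XML =
        "<catalog><journal>"
        + "<article level='Intermediate'>"
        + "<title>Service Oriented Architecture Frameworks</title></article>"
        + "<article level='Advanced'>"
        + "<title>Advance DAO Programming</title></article>"
        + "</journal></catalog>";

    // Like Listing 13, but the level comes from the stylesheet parameter $level.
    public static final String FILTER_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:param name='level'/>"
        + "<xsl:template match='/catalog/journal'>"
        + "<xsl:apply-templates select='article[@level=$level]'/>"
        + "</xsl:template>"
        + "<xsl:template match='article'>"
        + "<xsl:value-of select='title'/><xsl:text>; </xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Returns the titles of the articles whose level attribute equals the argument. */
    public static String titlesAt(String level) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(FILTER_XSLT)));
            transformer.setParameter("level", level); // value of $level
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(SAMPLE_XML)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Filtering transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titlesAt("Intermediate"));
        System.out.println(titlesAt("Advanced"));
    }
}
```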
Copying Nodes
The xsl:copy-of element copies the elements and attributes of the selected node. The XSLT in Listing 15 copies the "journal" node in the catalog.xml document to the output document.
Listing 15 Copying Nodes
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <xsl:template match="/catalog">
    <xsl:copy-of select="journal"/>
  </xsl:template>
</xsl:stylesheet>
The output from the XSLT consists of the journal node from the input XML document.
Listing 16 Copied Node
<?xml version="1.0" encoding="UTF-8"?>
<journal title="Java Technology" publisher="IBM developerWorks">
  <article level="Intermediate" date="January-2004" section="Java Technology">
    <title>Service Oriented Architecture Frameworks</title>
    <author>Naveen Balani</author>
  </article>
  <article level="Advanced" date="October-2003" section="Java Technology">
    <title>Advance DAO Programming</title>
    <author>Sean Sullivan</author>
  </article>
  <article level="Advanced" date="May-2002" section="Java Technology">
    <title>Best Practices in EJB Exception Handling</title>
    <author>Srikanth Shenoy</author>
  </article>
</journal>
xsl:copy, by contrast, makes a shallow copy: it copies only the current node, without its attributes and sub-elements.
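The difference between the two instructions is easiest to see side by side. The sketch below applies xsl:copy-of and then xsl:copy to the same "journal" element. The class name CopyDemo, the run helper that wraps a template body in a stylesheet, and the trimmed sample data are assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CopyDemo {

    // Stand-in for catalog.xml.
    public static final String SAMPLE_XML =
        "<catalog><journal title='Java Technology'>"
        + "<article level='Advanced'><title>Advance DAO Programming</title></article>"
        + "</journal></catalog>";

    /** Wraps the given template body in a stylesheet that matches the journal element. */
    public static String run(String templateBody) {
        String xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
            + "<xsl:template match='/catalog/journal'>" + templateBody + "</xsl:template>"
            + "</xsl:stylesheet>";
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(SAMPLE_XML)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Copy transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Deep copy: the journal element with its attribute and descendants.
        System.out.println(run("<xsl:copy-of select='.'/>"));
        // Shallow copy: an empty journal element only.
        System.out.println(run("<xsl:copy/>"));
    }
}
```

This is also why the identity stylesheet pairs xsl:copy with xsl:apply-templates: the shallow copy creates the node and the recursion fills in its attributes and children.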
Creating Elements and Attributes
The xsl:element element is used to create an element in an XML document. The xsl:attribute element is used to create an attribute in an XML document. The XSLT in Listing 17 creates the element "journal" and adds the attribute "publisher" to the "journal" element.
Listing 17 Creating an Element and an Attribute
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <xsl:template match="/">
    <xsl:element name="journal">
      <xsl:attribute name="publisher"><xsl:text>IBM developerWorks</xsl:text></xsl:attribute>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
The output from the XSLT consists of the "journal" element with the "publisher" attribute.
Listing 18 Element/Attribute Created
<journal publisher="IBM developerWorks"/>
Outputting XSLT
XSLT output is formatted with the xsl:output element, which can set the encoding, the DOCTYPE declaration and the indentation. For example, the output from the preceding section can be formatted by setting the "encoding," "doctype-public" and "doctype-system" attributes of the xsl:output element. The "encoding" attribute sets the encoding of the output XML document, the "doctype-public" and "doctype-system" attributes set the public and system identifiers of the DOCTYPE declaration, and the "omit-xml-declaration" attribute specifies whether the XML declaration is omitted.
Listing 19 xsl:output
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" omit-xml-declaration="no"
      doctype-public="-//Sun Microsystems, Inc.//DTD Enterprise JavaBeans 2.0//EN"
      doctype-system="http://java.sun.com/dtd/ejb-jar_2_0.dtd"
      indent="yes"/>
  <xsl:template match="/">
    <xsl:element name="journal">
      <xsl:attribute name="publisher"><xsl:text>IBM developerWorks</xsl:text></xsl:attribute>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
The XML document output by the example XSLT contains the XML declaration, whose encoding is set to UTF-8, followed by a DOCTYPE declaration.
Listing 20 Output with Doctype Declaration
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE journal PUBLIC "-//Sun Microsystems, Inc.//DTD Enterprise JavaBeans 2.0//EN" "http://java.sun.com/dtd/ejb-jar_2_0.dtd">
<journal publisher="IBM developerWorks"/>
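The same serialization settings can also be supplied from Java instead of through xsl:output. The sketch below sets them with Transformer.setOutputProperty and the constants in javax.xml.transform.OutputKeys, using the identity transformer returned by the no-argument newTransformer() call. The class name OutputPropertiesDemo, the helper method and the one-element sample document are assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class OutputPropertiesDemo {

    // The element produced in the preceding section, used here as input.
    public static final String SAMPLE_XML = "<journal publisher='IBM developerWorks'/>";

    /** Re-serializes the XML with an XML declaration and a DOCTYPE declaration. */
    public static String serialize(String xml) {
        try {
            // newTransformer() without a stylesheet yields an identity transformer.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC,
                "-//Sun Microsystems, Inc.//DTD Enterprise JavaBeans 2.0//EN");
            transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,
                "http://java.sun.com/dtd/ejb-jar_2_0.dtd");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");

            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(SAMPLE_XML));
    }
}
```

Properties set this way take precedence over the corresponding xsl:output attributes when a stylesheet is used, which can be convenient when the stylesheet itself cannot be changed.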
Conclusion
An XML document can be transformed into another XML, text or HTML document by applying a suitable XSLT stylesheet. This tutorial discussed stylesheets for commonly required XML-to-XML, XML-to-text and XML-to-HTML transformations.
Copyright © 2005 SYS-CON Media, Inc. — All Rights Reserved.
More Stories By Deepak Vohra
Deepak Vohra is a Sun Certified Java 1.4 Programmer and a Web developer.
More Stories By Ajay Vohra
Ajay Vohra is a senior solutions architect with DataSynapse Inc.