Transforming XML-to-XML or -Text or -HTML

XSLT morphs XML documents into other XML documents

The Extensible Stylesheet Language Transformations (XSLT) specification provides for morphing XML documents into other XML documents. An XML document can also be transformed into a format other than XML such as HTML or text. An XSLT processor is required for an XSLT transformation. Some of the commonly used XSLT processors include Xalan-Java, Oracle XSLT Processor for Java and the JAXP XSLT transformer. A stylesheet is used to transform an XML document. The elements of a stylesheet are in the XSLT namespace http://www.w3.org/1999/XSL/Transform.

With XSLT an XML document may be converted to another XML/HTML/text document.

The elements and attributes in an XML document get modified with an XSLT transformation.

In this tutorial some of the commonly required transformations are discussed. A Java application that transforms an XML document with XSLT is a prerequisite. The example XML document catalog.xml, shown in Listing 1, is the basis for the XSLT transformations.

Listing 1 catalog.xml


<?xml version="1.0" encoding="UTF-8"?>
<catalog>
  <journal title="Java Technology" publisher="IBM developerWorks">
    <article level="Intermediate" date="January-2004" section="Java Technology">
      <title>Service Oriented Architecture Frameworks</title>
      <author>Naveen Balani</author>
    </article>

    <article level="Advanced" date="October-2003" section="Java Technology">
      <title>Advance DAO Programming</title>
      <author>Sean Sullivan</author>
    </article>

    <article level="Advanced" date="May-2002" section="Java Technology">
      <title>Best Practices in EJB Exception Handling</title>
      <author>Srikanth Shenoy</author>
    </article>

    <article level="Advanced" date="May-2002" section="Java Technology">
      <title>Best Practices in EJB Exception Handling</title>
      <author>Srikanth Shenoy</author>
    </article>
  </journal>
</catalog>

The tutorial is structured into the following sections:

  • Identity Transformation
  • Removing Duplicates
  • Sorting Elements
  • Converting to HTML
  • Merging Documents
  • Obtaining Element/Attribute Values with XPath
  • Filtering Elements
  • Copying Nodes
  • Creating Elements and Attributes
  • Outputting an XML Document

The sections below explain only the stylesheets required to transform the example XML document into XML, text, or HTML documents. The Java procedure for applying a stylesheet is as follows:


Import the javax.xml.transform package.
import javax.xml.transform.*;
Import the javax.xml.parsers package.
import javax.xml.parsers.*;
Import the org.xml.sax and org.w3c.dom package classes.
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.w3c.dom.Document;
import org.w3c.dom.DOMException;

Import the DOMSource, StreamSource and StreamResult classes, and the java.io.File class used to locate the input files.

import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.transform.stream.StreamResult;
import java.io.File;

Create a DocumentBuilderFactory object.

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true); // XSLT processing expects a namespace-aware DOM

Copy the example XML document catalog.xml to the c:/input directory. Create a DocumentBuilder object and parse the XML document to be transformed.


DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new File("c:/input/catalog.xml"));

Create a TransformerFactory object.

TransformerFactory tFactory =
         TransformerFactory.newInstance();

Copy the XSLT stylesheet to be used for transformation to the c:/input directory. Create a Transformer object from the XSLT to be used for transformation.


StreamSource stylesource = new StreamSource(new File("c:/input/stylesheet.xslt"));
Transformer transformer = tFactory.newTransformer(stylesource);

Create a DOMSource object for the example XML Document object.

DOMSource source = new DOMSource(document);

Create a StreamResult object for the XSLT transformation output.

StreamResult result = new StreamResult(System.out);

Transform the example XML document with an XSLT.

transformer.transform(source, result);

The output from the XSLT transformation is written to System.out.
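The steps above can be combined into one small program. The following is a minimal sketch, not the authors' application: the class name XsltRunner and the methods transform and demo are hypothetical, and the XML and stylesheet are held in strings rather than read from c:/input so that the example runs without any files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name is not from the article.
final class XsltRunner {

    // Parses the XML into a DOM, applies the stylesheet, and returns the output as a string.
    static String transform(String xml, String xslt) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // XSLT expects a namespace-aware DOM
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));

            TransformerFactory tFactory = TransformerFactory.newInstance();
            Transformer transformer =
                tFactory.newTransformer(new StreamSource(new StringReader(xslt)));

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    // In-memory sample: extracts the journal title as plain text.
    static String demo() {
        String xml = "<catalog><journal title=\"Java Technology\"/></catalog>";
        String xslt =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\">"
            + "<xsl:value-of select=\"/catalog/journal/@title\"/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        return transform(xml, xslt);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

To work with files as in the steps above, the StringReader-based sources could be swapped for new File(...) arguments.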

Identity Transformation
The Identity transformation in XSLT copies the input XML document to the output document.

The structure and values of the elements and attributes in the XML document aren't modified.

An example XSLT for identity transformation is shown in Listing 2.

Listing 2 Identity Transformation Stylesheet


<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" indent="yes"/>
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

The XPath expression '@* | node()' selects all attribute nodes (@*) and all child nodes (node()), which covers elements, text, comments and processing instructions.

The identity transformation could be applied to modify the encoding, DOCTYPE or indentation.
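The identity template is also a common starting point for small edits: adding one more specific template overrides the copy for the nodes it matches. The sketch below is an illustration under assumptions rather than part of the original tutorial; the class name ModifiedIdentity and the choice to drop the "section" attribute are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: identity transformation plus one overriding template.
final class ModifiedIdentity {

    private static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        // Identity template: copies every attribute and node unchanged.
        + "<xsl:template match=\"@* | node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@* | node()\"/></xsl:copy>"
        + "</xsl:template>"
        // More specific template with an empty body: 'section' attributes are dropped.
        + "<xsl:template match=\"@section\"/>"
        + "</xsl:stylesheet>";

    static String dropSection(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(dropSection(
            "<journal><article level=\"Advanced\" section=\"Java Technology\">"
            + "<title>Advance DAO Programming</title></article></journal>"));
    }
}
```

The empty template wins for "section" attributes because a pattern that names an attribute has a higher default priority than '@*'.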

Removing Duplicates
An XML document may have duplicate elements. The example XML document has duplicate "article" elements. The following XSLT outputs non-duplicate "article" titles.

Listing 3 Removing Duplicates


<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" omit-xml-declaration="yes"/>
  <xsl:template match="/">
    <xsl:variable name="unique-list" select="//title[not(.=following::title)]"/>
    <xsl:for-each select="$unique-list">
      <xsl:copy>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

The XPath expression '//title[not(.=following::title)]' selects non-duplicate 'title' elements: a "title" is kept only if no later "title" in the document has the same value.

The output from the XSLT is the non-duplicate article titles as in Listing 4.

Listing 4 Title Elements with Duplicates Removed


<title>Service Oriented Architecture Frameworks</title>
<title>Advance DAO Programming</title>
 <title>Best Practices in EJB Exception Handling</title>

In subsequent sections the XML document whose duplicate elements have been removed will be used.
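The de-duplicating XPath expression can also be tried on its own, outside a stylesheet, with the standard javax.xml.xpath API. This is a small sketch under assumptions: the class name UniqueTitles and the sample document are hypothetical. Note that for each set of duplicates the expression keeps the last occurrence, since only that one has no equal "title" following it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class: evaluates the de-duplication expression directly.
final class UniqueTitles {

    // Returns the text of every title that has no equal title later in the document.
    static List<String> uniqueTitles(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "//title[not(. = following::title)]",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent());
            }
            return titles;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<journal>"
            + "<article><title>Advance DAO Programming</title></article>"
            + "<article><title>Best Practices in EJB Exception Handling</title></article>"
            + "<article><title>Best Practices in EJB Exception Handling</title></article>"
            + "</journal>";
        System.out.println(uniqueTitles(xml));
    }
}
```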

Sorting Elements
The xsl:sort element is used to sort a group of elements. The order attribute of xsl:sort specifies the sorting order: ascending or descending. The data-type attribute (number or text) specifies the data type of the sort key. For instance, the XSLT in Listing 5 sorts the "article" elements of the example XML document by their "title" elements.

Listing 5 Sorting


<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <xsl:template match="/catalog/journal">
    <xsl:apply-templates>
      <xsl:sort select="title" order="ascending"/>
    </xsl:apply-templates>
  </xsl:template>
  <xsl:template match="article">
  Title:   <xsl:apply-templates select="title"/>
  </xsl:template>
</xsl:stylesheet>

The output from the XSLT is a list of article titles sorted in ascending order. Because the stylesheet emits only text, it is also an example of an XML-to-text transformation.

Listing 6 Sorted List


  Title:   Advance DAO Programming
  Title:   Best Practices in EJB Exception Handling
  Title:   Service Oriented Architecture Frameworks
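To see the effect of the order attribute, the same kind of sort can be run in both directions from Java. The sketch below makes some assumptions: the class name SortDemo is hypothetical, xsl:for-each is used in place of xsl:apply-templates, and the output method is "text" so that only the titles are emitted, one per line.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: sorts article titles with xsl:sort.
final class SortDemo {

    private static final String XML =
        "<catalog><journal>"
        + "<article><title>Service Oriented Architecture Frameworks</title></article>"
        + "<article><title>Advance DAO Programming</title></article>"
        + "<article><title>Best Practices in EJB Exception Handling</title></article>"
        + "</journal></catalog>";

    // %s is replaced with "ascending" or "descending".
    private static final String XSLT_TEMPLATE =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/catalog/journal\">"
        + "<xsl:for-each select=\"article\">"
        + "<xsl:sort select=\"title\" data-type=\"text\" order=\"%s\"/>"
        + "<xsl:value-of select=\"title\"/><xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String sortedTitles(String order) {
        try {
            String xslt = String.format(XSLT_TEMPLATE, order);
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sortedTitles("ascending"));
        System.out.println(sortedTitles("descending"));
    }
}
```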

Converting to HTML
The data in an XML document may have to be presented as an HTML document.

The following XSLT converts the example XML document to an HTML document.

Listing 7 Conversion to HTML


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="/catalog/journal">
<html>
  <head>
    <title>Catalog</title>
  </head>
  <body>
    <table border="1" cellspacing="0">
      <tr>
        <th>Level</th>
        <th>Date</th>
        <th>Section</th>
        <th>Title</th>
        <th>Author</th>
      </tr>
      <xsl:for-each select="article">
        <tr>
          <td><xsl:value-of select="@level"/></td>
          <td><xsl:value-of select="@date"/></td>
          <td><xsl:value-of select="@section"/></td>
          <td><xsl:value-of select="title"/></td>
          <td><xsl:value-of select="author"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </body>
</html>
</xsl:template>
</xsl:stylesheet>

The output of an XSLT is set to HTML with the method="html" attribute of the xsl:output element. The output from the XSLT is an HTML document as illustrated in Figure 1.

With an XSLT transformer that supports XHTML output, the output from an XSLT transformation can be set to XHTML instead of HTML by setting the xsl:output element attribute method="xml". For XHTML output, set the default namespace declaration for the XHTML namespace in the xsl:stylesheet element. The default namespace declaration is set as:


  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns="http://www.w3.org/1999/xhtml">

For output to an XHTML document set the doctype-public and doctype-system attributes of the xsl:output element as illustrated:


<xsl:output method="xml"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
    doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"/>
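The practical difference between method="html" and method="xml" shows up in how empty elements are serialized. The following sketch is an illustration under assumptions (the class name OutputMethodDemo and the tiny input document are hypothetical): it runs the same template twice and changes only the method attribute of xsl:output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: compares the html and xml output methods.
final class OutputMethodDemo {

    // %s is replaced with the output method, "html" or "xml".
    private static final String XSLT_TEMPLATE =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"%s\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/\">"
        + "<p><xsl:value-of select=\"/note\"/><br/></p>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String render(String method) {
        try {
            String xslt = String.format(XSLT_TEMPLATE, method);
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader("<note>Catalog</note>")),
                new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("html")); // empty element written HTML-style, as <br>
        System.out.println(render("xml"));  // empty element written XML-style, as <br/>
    }
}
```

XHTML needs the XML-style form, which is why the xml output method is used for XHTML output.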

Merging Documents
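The document() function is used in a stylesheet to load another XML document in addition to the one being transformed. A relative URI such as 'catalog2.xml' is resolved against the location of the stylesheet.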

With the document() function XML documents can be combined as illustrated in Listing 8.

Listing 8 Merging XML Documents


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <xsl:template match="/">
    <catalogs>
      <xsl:copy-of select="*"/>
      <xsl:copy-of select="document('catalog2.xml')"/>
    </catalogs>
  </xsl:template>
</xsl:stylesheet>

The XSLT combines the example XML document catalog.xml and another XML document catalog2.xml listed in Listing 9.

Listing 9 catalog2.xml


<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="http://www.w3.org/2001/XMLSchema-Instance">
  <journal title="Java Technology" publisher="IBM developerWorks">
    <article level="Intermediate" date="February-2003">
      <title>Design XML Schemas Using UML</title>
      <author>Ayesha Malik</author>
    </article>
  </journal>
</catalog>

The output from the XSLT is a combined XML document as shown in Listing 10.

Listing 10 Combined XML Document


<?xml version="1.0" encoding="UTF-8"?>
<catalogs><catalog>
  <journal title="Java Technology" publisher="IBM developerWorks">
    <article level="Intermediate" date="January-2004" section="Java Technology">
      <title>Service Oriented Architecture Frameworks</title>
      <author>Naveen Balani</author>
    </article>

    <article level="Advanced" date="October-2003" section="Java Technology">
      <title>Advance DAO Programming</title>
      <author>Sean Sullivan</author>
    </article>

    <article level="Advanced" date="May-2002" section="Java Technology">
      <title>Best Practices in EJB Exception Handling</title>
      <author>Srikanth Shenoy</author>
    </article>
  </journal>
</catalog>
<catalog xmlns="http://www.w3.org/2001/XMLSchema-Instance">
  <journal title="Java Technology" publisher="IBM developerWorks">
    <article level="Intermediate" date="February-2003">
      <title>Design XML Schemas Using UML</title>
      <author>Ayesha Malik</author>
    </article>
  </journal>
</catalog></catalogs>
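When the location of the second document is only known at run time, one option is to pass it to the stylesheet as a parameter and hand that parameter to document(). The sketch below is an assumption-laden illustration, not the authors' code: the class name MergeDemo and the parameter name "other" are hypothetical, and the second document is written to a temporary file so that the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: merges two documents with document().
final class MergeDemo {

    private static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        // 'other' is a hypothetical parameter holding the URI of the second document.
        + "<xsl:param name=\"other\"/>"
        + "<xsl:template match=\"/\">"
        + "<catalogs>"
        + "<xsl:copy-of select=\"*\"/>"
        + "<xsl:copy-of select=\"document($other)\"/>"
        + "</catalogs>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String merge(String firstXml, String secondXml) {
        Path second = null;
        try {
            // Store the second document in a temporary file so document() can load it.
            second = Files.createTempFile("catalog2", ".xml");
            Files.write(second, secondXml.getBytes(StandardCharsets.UTF_8));

            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setParameter("other", second.toUri().toString());

            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(firstXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Merge failed", e);
        } finally {
            if (second != null) {
                try {
                    Files.deleteIfExists(second);
                } catch (IOException ignored) {
                    // Best-effort cleanup of the temporary file.
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(merge(
            "<catalog><title>Advance DAO Programming</title></catalog>",
            "<catalog><title>Design XML Schemas Using UML</title></catalog>"));
    }
}
```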

Obtaining Element/Attribute Values with XPath
XSLT supports XPath, which is used to select elements and attributes. For example, select the value of the "date" attribute for the article with the title Advance DAO Programming, and select the value of the "title" element for the article by the author Srikanth Shenoy. The XSLT illustrated in Listing 11 outputs the value of the "date" attribute and the "title" element.

Listing 11 XPath Expressions


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes"/>
<xsl:template match="/catalog/journal">
Date: <xsl:value-of select="article[title='Advance DAO Programming']/@date"/>
Title: <xsl:value-of select="article[author='Srikanth Shenoy']/title"/>
</xsl:template>
</xsl:stylesheet>

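The XPath expression 'article[title='Advance DAO Programming']/@date' selects the "date" attribute. The XPath expression 'article[author='Srikanth Shenoy']/title' selects the "title" element. Both predicates compare exact string values, so the text in the XML document must match character for character, including whitespace.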

The output from the XSLT is shown in Listing 12.

Listing 12 XPath Expressions Output


Date: October-2003
Title: Best Practices in EJB Exception Handling
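XPath expressions like these can be checked quickly from Java before they are placed in a stylesheet. A minimal sketch follows, assuming a hypothetical class name XPathValues and an in-memory copy of part of the catalog; it uses the standard javax.xml.xpath API, and the expressions are written as absolute paths because there is no template context node.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical example class: evaluates XPath expressions to string values.
final class XPathValues {

    static final String XML =
        "<catalog><journal>"
        + "<article level=\"Advanced\" date=\"October-2003\">"
        + "<title>Advance DAO Programming</title><author>Sean Sullivan</author></article>"
        + "<article level=\"Advanced\" date=\"May-2002\">"
        + "<title>Best Practices in EJB Exception Handling</title>"
        + "<author>Srikanth Shenoy</author></article>"
        + "</journal></catalog>";

    // Returns the string value of the first node selected by the expression.
    static String evaluate(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Date: " + evaluate(XML,
            "/catalog/journal/article[title='Advance DAO Programming']/@date"));
        System.out.println("Title: " + evaluate(XML,
            "/catalog/journal/article[author='Srikanth Shenoy']/title"));
    }
}
```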

Filtering Elements
The elements in an XML document can be filtered with a predicate in the select attribute of the xsl:apply-templates element. For example, the XSLT in Listing 13 selects the "article" elements that have the "level" attribute value "Intermediate."

Listing 13 Selecting Elements


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes"/>
<xsl:template match="/catalog/journal">
<xsl:apply-templates select="article[@level='Intermediate']"/>
</xsl:template>
<xsl:template match="article">
Title: <xsl:value-of select="title"/>
Author: <xsl:value-of select="author"/>
</xsl:template>
</xsl:stylesheet>

The XPath expression 'article[@level='Intermediate']' selects the "article" elements with "level" attributes set to "Intermediate." The XSLT output contains only the "article" element with "level" value as "Intermediate."

Listing 14 Selected Element


Title: Service Oriented Architecture Frameworks
Author: Naveen Balani
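Rather than hard-coding "Intermediate" in the stylesheet, the filter value can be supplied from Java as a stylesheet parameter. The sketch below is one possible approach, not the tutorial's own: the class name FilterByLevel and the parameter name "level" are hypothetical. It relies on Transformer.setParameter and a top-level xsl:param.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: filters articles by a level passed in from Java.
final class FilterByLevel {

    private static final String XML =
        "<catalog><journal>"
        + "<article level=\"Intermediate\"><author>Naveen Balani</author></article>"
        + "<article level=\"Advanced\"><author>Sean Sullivan</author></article>"
        + "</journal></catalog>";

    private static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        // Default value used when the caller does not set the parameter.
        + "<xsl:param name=\"level\" select=\"'Intermediate'\"/>"
        + "<xsl:template match=\"/catalog/journal\">"
        + "<xsl:apply-templates select=\"article[@level = $level]\"/>"
        + "</xsl:template>"
        + "<xsl:template match=\"article\">"
        + "<xsl:value-of select=\"author\"/><xsl:text>&#10;</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String authorsAt(String level) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setParameter("level", level);
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(authorsAt("Intermediate"));
        System.out.print(authorsAt("Advanced"));
    }
}
```

The variable reference appears in a select expression; XSLT 1.0 does not allow variable references in match patterns.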

Copying Nodes
The xsl:copy-of element copies the selected nodes together with their attributes and descendants. The XSLT in Listing 15 copies the "journal" node in the catalog.xml document to the output document.

Listing 15 Copying Nodes


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"/>
<xsl:template match="/catalog">
<xsl:copy-of select="journal"/>
</xsl:template>
</xsl:stylesheet>

The output from the XSLT consists of the journal node from the input XML document.

Listing 16 Copied Node


<?xml version="1.0" encoding="UTF-8"?>
<journal title="Java Technology" publisher="IBM developerWorks">
    <article level="Intermediate" date="January-2004" section="Java Technology">
      <title>Service Oriented Architecture Frameworks</title>
      <author>Naveen Balani</author>
    </article>

    <article level="Advanced" date="October-2003" section="Java Technology">
      <title>Advance DAO Programming</title>
      <author>Sean Sullivan</author>
    </article>

    <article level="Advanced" date="May-2002" section="Java Technology">
      <title>Best Practices in EJB Exception Handling</title>
      <author>Srikanth Shenoy</author>
    </article>
</journal>

xsl:copy, by contrast, makes a shallow copy of the current node: the sub-elements and attributes of the node aren't copied.
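The difference between the two instructions is easiest to see side by side. The following sketch rests on assumptions (the class name CopyDemo and the cut-down catalog are hypothetical): it applies xsl:copy and xsl:copy-of to the same "journal" element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: contrasts xsl:copy (shallow) with xsl:copy-of (deep).
final class CopyDemo {

    private static final String XML =
        "<catalog><journal publisher=\"IBM developerWorks\">"
        + "<article level=\"Advanced\"><title>Advance DAO Programming</title></article>"
        + "</journal></catalog>";

    // %s is replaced with the body of the template that matches /catalog.
    private static final String XSLT_TEMPLATE =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/catalog\">%s</xsl:template>"
        + "</xsl:stylesheet>";

    private static String run(String templateBody) {
        try {
            String xslt = String.format(XSLT_TEMPLATE, templateBody);
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    // xsl:copy copies only the journal element itself.
    static String shallowCopy() {
        return run("<xsl:for-each select=\"journal\"><xsl:copy/></xsl:for-each>");
    }

    // xsl:copy-of copies the journal element with its attributes and descendants.
    static String deepCopy() {
        return run("<xsl:copy-of select=\"journal\"/>");
    }

    public static void main(String[] args) {
        System.out.println(shallowCopy());
        System.out.println(deepCopy());
    }
}
```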

Creating Elements and Attributes
The xsl:element element is used to create an element in an XML document. The xsl:attribute element is used to create an attribute in an XML document. The XSLT in Listing 17 creates the element "journal" and adds the attribute "publisher" to the "journal" element.

Listing 17 Creating an Element and an Attribute


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes"/>
<xsl:template match="/">
<xsl:element name="journal">
<xsl:attribute name="publisher"><xsl:text>IBM developerWorks</xsl:text></xsl:attribute>
</xsl:element>
</xsl:template>
</xsl:stylesheet>

The output from the XSLT consists of the "journal" element with the "publisher" attribute.

Listing 18 Element/Attribute Created


<journal publisher="IBM developerWorks"/>
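xsl:element is most useful when the element name is not known until run time, because its name attribute accepts an attribute value template. The sketch below is an illustration under assumptions: the class name CreateElementDemo and the parameter name "elementName" are hypothetical, and the attribute value is read from the input document instead of being fixed text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: creates an element whose name is chosen at run time.
final class CreateElementDemo {

    private static final String XML =
        "<catalog><journal publisher=\"IBM developerWorks\"/></catalog>";

    private static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:param name=\"elementName\" select=\"'journal'\"/>"
        + "<xsl:template match=\"/\">"
        // {$elementName} is an attribute value template evaluated at run time.
        + "<xsl:element name=\"{$elementName}\">"
        + "<xsl:attribute name=\"publisher\">"
        + "<xsl:value-of select=\"/catalog/journal/@publisher\"/>"
        + "</xsl:attribute>"
        + "</xsl:element>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String create(String elementName) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setParameter("elementName", elementName);
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create("journal"));
        System.out.println(create("magazine"));
    }
}
```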

Outputting an XML Document
XSLT output can be formatted with the xsl:output element. The encoding, the DOCTYPE declaration, and indentation can be set in the xsl:output element. The XSLT output in the section above can be formatted by setting the "encoding," "doctype-public" and "doctype-system" attributes of the xsl:output element. The "encoding" attribute sets the encoding of the output XML document, the "doctype-public" and "doctype-system" attributes set the DOCTYPE of the document, and the "omit-xml-declaration" attribute controls whether the XML declaration is included.

Listing 19 xsl:output


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" omit-xml-declaration="no"
    doctype-public="-//Sun Microsystems, Inc.//DTD Enterprise JavaBeans 2.0//EN"
    doctype-system="http://java.sun.com/dtd/ejb-jar_2_0.dtd" indent="yes"/>
<xsl:template match="/">
<xsl:element name="journal">
<xsl:attribute name="publisher"><xsl:text>IBM developerWorks</xsl:text></xsl:attribute>
</xsl:element>
</xsl:template>
</xsl:stylesheet>

The XML document output by the example XSLT consists of the XML declaration, whose encoding attribute is set to 'UTF-8', followed by a DOCTYPE declaration.

Listing 20 Output with Doctype Declaration


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE journal PUBLIC 
"-//Sun Microsystems, Inc.//DTD Enterprise JavaBeans 2.0//EN" 
 "http://java.sun.com/dtd/ejb-jar_2_0.dtd">
<journal publisher="IBM developerWorks"/>
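The same output settings can also be applied from Java through Transformer.setOutputProperty and the constants in javax.xml.transform.OutputKeys. The following is a sketch under assumptions (the class name DoctypeDemo is hypothetical): it uses the identity transformer returned by the no-argument newTransformer() method, so no stylesheet is involved.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: sets output properties from Java code.
final class DoctypeDemo {

    static String withDoctype(String xml, String publicId, String systemId) {
        try {
            // newTransformer() without a stylesheet copies the source to the result.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            identity.setOutputProperty(OutputKeys.INDENT, "yes");
            identity.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, publicId);
            identity.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, systemId);

            StringWriter out = new StringWriter();
            identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(withDoctype(
            "<journal publisher=\"IBM developerWorks\"/>",
            "-//Sun Microsystems, Inc.//DTD Enterprise JavaBeans 2.0//EN",
            "http://java.sun.com/dtd/ejb-jar_2_0.dtd"));
    }
}
```

Properties set this way take precedence over the corresponding xsl:output attributes when a stylesheet is used.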

Conclusion
An XML document can be transformed into another XML, text, or HTML document by applying suitable XSLT transformations. This tutorial discussed the XSLTs required for XML-to-XML, XML-to-text, and XML-to-HTML transformations.

More Stories By Deepak Vohra

Deepak Vohra is a Sun Certified Java 1.4 Programmer and a Web developer.

More Stories By Ajay Vohra

Ajay Vohra is a senior solutions architect with DataSynapse Inc.
