Managing XML Data
The Benefits to Office-Based Applications


Last week I had lunch with the application manager of a local customer that had just completed its enterprise rollout of Office 2003. We had decided to meet and discuss ways his team could begin to use this deployment. As we sat down he explained that he had been talking to his team about a project he wanted to discuss. They had a variety of independent business processes that all ran within various Microsoft Office applications, and he wanted to know whether it was possible to connect these using XML and the features of Office 2003. He had discovered that Office natively supports XML, which got him thinking about ways his developers could take advantage of this feature. He hoped that doing so would let him connect these independent processes and begin to share the data collected throughout the enterprise. In this article I will explain, as I did that day, how you can use not only XML but also associated standards such as Extensible Stylesheet Language Transformations (XSLT) and XML Schema Definition (XSD) to build and integrate Office-based applications.

It is important to understand that one of the major benefits of an XML document is that it separates application data from presentation. An XML document contains a set of self-describing structures that define a vocabulary of data. The text-based nature of XML makes it easy to transport these data documents across process boundaries such as the ones the application manager had described. Always remember that XML is about data storage. By definition an XML document is only guaranteed to be well formed; nothing inherently requires its data to be consistent, so its content can be unpredictable. This is the reason the XSD standard was developed. Additionally, as the need emerged to convert these documents easily from one form to another, the XSLT standard was developed.

What Is XSD?
The simple answer for providing a guaranteed data structure is to create schemas. A schema describes an object and any interrelationships that exist within a data structure. There are many different kinds of schema definitions. For example, relational databases such as SQL Server use schemas to hold their table names and column keys and to provide a repository for triggers and stored procedures. Likewise, when a developer creates a class definition, he or she is defining a schema that provides the object-oriented interface to properties, methods, and events. Within an XML data structure, schemas describe both the object definition and the relationships among data elements and attributes. Regardless of their actual context, schemas provide the data representation and serve as an abstracted layer or framework.

Just as XML is really a metalanguage used to create and describe other languages, XSD is an XML-based modeling language defined by the W3C for creating XML schemas. Itself written in XML, an XSD schema defines the legal building blocks of an XML file so that the file's structure and content can be validated. For example, let's examine a schema that defines an employee structure as shown in Listing 1.

This schema is by definition a well-formed XML document. At the top of an XSD file is a set of namespace declarations. These optional declarations provide unique identifiers that associate a set of XML elements and attributes with one another. Namespaces were originally introduced by the W3C as a URI-based way to differentiate XML vocabularies. The XML Schema specification then extended them to cover schema components and not just single elements and attributes. The identifying URI doesn't have to point to a physical location; it only needs to be unique, which is usually achieved by basing it on a name that the schema author owns. A schema typically involves two namespaces: the XML Schema namespace and the target namespace. Each is declared with an xmlns attribute, which is divided into three sections.

  • xmlns keyword: Appears first and is separated from the namespace prefix by a colon.
  • Prefix: Defines the abbreviated name of a namespace and is used when declaring all elements and attributes that belong to it. Both xmlns and xml are reserved and can't be declared as prefixes for your own namespaces.
  • Definition: The unique URI that identifies the namespace, normally based on a name owned by the schema author.
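
The role of the prefix versus the URI can be checked with a short program. The following is a minimal sketch using Java's standard JAXP parser as a stand-in (the article itself targets Office and .NET); the class name, the `root` helper, and the `urn:example:employee` URI are hypothetical and chosen only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class NamespaceDemo {
    // Hypothetical namespace URI: it names a vocabulary and is never fetched.
    static final String EMP_NS = "urn:example:employee";

    /** Parses XML text with namespace support and returns the root element. */
    static Element root(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP parsers ignore namespaces by default
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        // Same URI, different prefixes: xmlns keyword, colon, prefix, then the URI.
        Element a = root("<emp:employee xmlns:emp='" + EMP_NS + "'/>");
        Element b = root("<e:employee xmlns:e='" + EMP_NS + "'/>");

        System.out.println(a.getPrefix() + " -> " + a.getNamespaceURI());
        System.out.println(b.getPrefix() + " -> " + b.getNamespaceURI());
        // The prefix is only an abbreviation; the URI is what identifies the vocabulary.
        System.out.println(a.getNamespaceURI().equals(b.getNamespaceURI()));
    }
}
```

Both root elements report the same namespace URI and local name even though their prefixes differ, which is why a prefix can be chosen freely within a document.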

By definition all XSD schemas contain a single top-level element, the schema element. Underneath it are the element declarations, each of which has either a simple or a complex type. Simple elements contain text-only information. Complex elements are grouping elements that act as a container for other elements and attributes. There are four types of complex elements: empty elements, elements that contain other elements, elements that contain only text, and elements that contain both other elements and text.

The simple types contain the individual elements or fields that describe the employee object. These are then grouped into a complex type (employeeinfo) that provides the entire object representation. Together, the elements in this schema describe the data that can be captured about an employee. By using the XML adapter of InfoPath as shown in Figure 1, we can import the schema into a data source.
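
To make the grouping of simple elements into a complex type concrete, here is a minimal sketch of a schema of this shape together with a validation check. It is not the article's Listing 1: the element names (`firstname`, `lastname`, `hiredate`), the target namespace, and the class name are assumptions made for illustration, and Java's standard `javax.xml.validation` API stands in for the InfoPath tooling.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class EmployeeSchemaDemo {
    // Hypothetical schema: three simple elements grouped by the employeeinfo complex type.
    static final String XSD = """
        <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                   targetNamespace="urn:example:employee"
                   xmlns="urn:example:employee"
                   elementFormDefault="qualified">
          <xs:element name="employee" type="employeeinfo"/>
          <xs:complexType name="employeeinfo">
            <xs:sequence>
              <xs:element name="firstname" type="xs:string"/>
              <xs:element name="lastname" type="xs:string"/>
              <xs:element name="hiredate" type="xs:date"/>
            </xs:sequence>
          </xs:complexType>
        </xs:schema>
        """;

    /** Returns true when the XML text satisfies the schema above. */
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document does not satisfy the schema
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<employee xmlns='urn:example:employee'>"
                + "<firstname>Ada</firstname><lastname>Lovelace</lastname>"
                + "<hiredate>2004-03-01</hiredate></employee>";
        // Still well formed, but hiredate is no longer a valid xs:date.
        String bad = good.replace("2004-03-01", "March 1");

        System.out.println(isValid(good)); // true
        System.out.println(isValid(bad));  // false
    }
}
```

The second document shows the distinction drawn earlier: it is well formed, so any XML parser accepts it, but only the schema catches the inconsistent date.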

By using InfoPath we can then build a data entry form as shown in Figure 2 that allows end users to update employee information and also guarantees that their data conforms to the XSD structure defined above. The benefit of InfoPath in this example is that it shields users from having to understand the complexities and mechanics of the underlying XSD. Instead, they simply open and complete the data entry form, which results in a conforming XML document.

Additionally, depending on the specific business process, we could use the features of InfoPath to define additional business rules that enforce form-specific requirements. It is important to remember that these are additional rules and can't alter the baseline defined within the XSD.

As users create and save their forms, this data is then stored in an XML document (as shown in Listing 2) that is guaranteed to match the XSD schema defined above.

Note: Processing instructions (PIs) are optional instructions to the consuming application that can appear at the top of an XML document. InfoPath uses them to provide a path to the solution file and version information. Within the construct of XML they always begin with "<?" and end with "?>".
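
As an illustration of how a consuming program sees these instructions, the sketch below reads the processing instructions from a small document with Java's standard DOM parser. The document is hypothetical: the instruction names follow the pattern InfoPath uses, but the version number and the `href` path are made-up values, not taken from Listing 2.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

class ProcessingInstructionDemo {
    // Hypothetical InfoPath-style document; attribute values are illustrative only.
    static final String SAMPLE = """
        <?xml version="1.0" encoding="UTF-8"?>
        <?mso-infoPathSolution solutionVersion="1.0.0.1" href="file:///C:/forms/employee.xsn"?>
        <?mso-application progid="InfoPath.Document"?>
        <employee><firstname>Ada</firstname></employee>
        """;

    /** Returns the top-level processing instructions as target -> data pairs. */
    static Map<String, String> instructions(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Map<String, String> result = new LinkedHashMap<>();
            // PIs placed before the root element are children of the document node.
            for (Node n = doc.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
                    ProcessingInstruction pi = (ProcessingInstruction) n;
                    result.put(pi.getTarget(), pi.getData());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        instructions(SAMPLE).forEach(
                (target, data) -> System.out.println(target + " : " + data));
    }
}
```

The XML declaration on the first line uses the same "<?" syntax but is not reported as a processing instruction, so only the two application-specific entries are returned.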

What Is XSLT?
Of course once this XML document is created, it contains the necessary information and associations to be opened within the InfoPath solution that we created above. Although it can be opened using Word as shown in Figure 3, we will only see the data that it contains and not the format or any additional rules we defined within InfoPath. However, by using XSLT we can change that and transform this document into a solution that can leverage the presentation capabilities provided by Microsoft Word.

XSLT is also defined using XML: it is an XML-based vocabulary of elements for transforming XML-based content. The wider XSL family adds a specialized set of formatting objects that define presentation and document-based positional elements. Built into XSLT is a search specification called the XML Path Language (XPath), which locates the nodes of the source document that a transformation works on. The combination of XSLT and XPath forms a specialized vocabulary that enables the transformation of any XML-based document into virtually any other document format. XSLT is designed as a transformation language. Starting with an XML-based document, the application of templates generates a new output document. The XSLT processor accepts as input the XML tree represented in a well-formed document and produces as output a new transformed document.

The transformation process involves three documents: the source, the XSLT style sheet, and the result. The source document is simply a well-formed XML document that serves as the input of the transformation. The style sheet is an XML document that uses the XSLT vocabulary to express transformation rules. Finally, the result document is a text document produced by applying the transformation defined in the XSLT style sheet to the source document.

A transformation expressed in XSLT describes rules for transforming a source tree into a result tree. The transformation is achieved by associating a set of patterns with templates. A pattern is matched against elements in a source tree. A template is instantiated to create part of the result tree. It is important to remember that the result tree is separate from the source tree. In constructing the result tree, elements from the source tree can be filtered and reordered, and any type of arbitrary structure can be added.
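
The pattern and template mechanism can be sketched in a few lines. The example below is an illustration under stated assumptions, not one of the article's listings: the source document and style sheet are hypothetical, and Java's standard `javax.xml.transform` API is used as the XSLT 1.0 processor. The template filters out the `salary` element and reorders the two name elements.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TemplateDemo {
    // Hypothetical source tree.
    static final String SOURCE = """
        <employee>
          <firstname>Ada</firstname>
          <lastname>Lovelace</lastname>
          <salary>1000</salary>
        </employee>
        """;

    // The match attribute is the pattern; the template body builds part of the result tree.
    static final String STYLESHEET = """
        <xsl:stylesheet version="1.0"
                        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:output method="text"/>
          <xsl:template match="/employee">
            <xsl:text>Name: </xsl:text>
            <xsl:value-of select="lastname"/>
            <xsl:text>, </xsl:text>
            <xsl:value-of select="firstname"/>
          </xsl:template>
        </xsl:stylesheet>
        """;

    /** Applies the style sheet to the source and returns the result document as text. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SOURCE, STYLESHEET)); // prints "Name: Lovelace, Ada"
    }
}
```

The source string is left untouched by the call, which reflects the point above that the result tree is separate from the source tree.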

Making the Transformation
Within the .NET Framework the System.Xml.Xsl namespace provides support for XSLT transformations. It supports the W3C XSL Transformations (XSLT) Version 1.0 Recommendation (www.w3.org/TR/xslt). This namespace provides classes that enable developers to transform the document created by InfoPath into other formats. For example, using the code shown in Listing 3 we can apply an XSLT transformation to the InfoPath XML document and convert it to a Word document.

The XML format defined within Word is based on an additional namespace called WordML. The WordML schema was designed to mirror the information found in a traditional .doc file. The root element of a WordML document is always w:wordDocument. This element contains several other elements that represent the complete Word document structure, including properties, fonts, lists, styles, and the actual document body that contains the sections and paragraphs.

The addition of this namespace within an XML document preserves Word's styles and formatting. XML by itself doesn't define presentation, but formatting is an important part of any Word document, and this namespace allows that formatting to be included. For example, let's start with an InfoPath form that contains a repeating table structure as shown in Figure 4.

This could be transformed using XSLT into a Word document that includes formatting as shown in Listing 4.
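
A rough sketch of such a transformation follows. It is not the article's Listing 4: the repeating `employee` rows and the class name are hypothetical, only a tiny subset of WordML is emitted (one paragraph per row with a bold run for the last name), and Java's standard `javax.xml.transform` API stands in for the .NET classes. The WordML namespace URI and the `mso-application` processing instruction are the ones Word 2003 is understood to use; treat the details as assumptions to verify against the WordML schema.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class WordMlDemo {
    // Hypothetical repeating structure, standing in for the InfoPath repeating table.
    static final String SOURCE = """
        <employees>
          <employee><firstname>Ada</firstname><lastname>Lovelace</lastname></employee>
          <employee><firstname>Alan</firstname><lastname>Turing</lastname></employee>
        </employees>
        """;

    // Emits a w:wordDocument root with one paragraph (w:p) per employee row.
    static final String STYLESHEET = """
        <xsl:stylesheet version="1.0"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
          <xsl:output method="xml" indent="no"/>
          <xsl:template match="/employees">
            <xsl:processing-instruction name="mso-application">progid="Word.Document"</xsl:processing-instruction>
            <w:wordDocument>
              <w:body>
                <xsl:for-each select="employee">
                  <w:p>
                    <w:r>
                      <w:rPr><w:b/></w:rPr>
                      <w:t><xsl:value-of select="lastname"/></w:t>
                    </w:r>
                    <w:r>
                      <w:t>, <xsl:value-of select="firstname"/></w:t>
                    </w:r>
                  </w:p>
                </xsl:for-each>
              </w:body>
            </w:wordDocument>
          </xsl:template>
        </xsl:stylesheet>
        """;

    /** Applies the style sheet to the source and returns the serialized result. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SOURCE, STYLESHEET));
    }
}
```

The processing instruction written at the top of the result is what associates the saved XML file with Word rather than with a generic XML editor.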

Once the transformation is complete, the document appears as a Word document within the File Explorer as shown in Figure 5.

.  .  .

As we stood to leave, he started smiling as he thought about the possibilities. As we shook hands and parted in the parking lot, each getting into his own car, he confided in me that he had all sorts of things he wanted to do. He couldn't wait to talk to his team and start planning. Of course, this is just a small introduction to the many things that XML, XSD, and XSLT can provide when working with Office; the rest is up to you.

About Thom Robbins
Thom Robbins is a senior technology specialist with Microsoft. He is a frequent contributor to various magazines, including .NET Developer's Journal and SOA Web Services Journal. Thom is also a frequent speaker at a variety of events that include VS Live and others. When he's not writing code and helping customers, he spends his time with his wife at their home in New Hampshire.
