An Easy Introduction to XML Publishing - Part 3 of a Five-Part Series
Developing a new publishing system


Page 1 of 3   next page »

In Part 1 of this series we discussed some of the key problems of capturing and sharing information and in Part 2 we looked at the critical components of a solution: modularization, automation, and XML.

In Part 3, we start getting technical - but in a nontechnical way. We examine the essential parts of building a solution: developing data models (which are either DTDs or Schemas), designing stylesheets, and integrating the various components of the solution.

Data Models? Why Would I Care About Data Models?
We're fond of history lessons in this column, so let's go back to the year 1900. At that time, the average worker in the United States earned about $500 a year and a bicycle cost $600. No wonder the men on old-fashioned bicycles wore top hats and tails - you had to be rich to own one!

The reason for the high cost of bicycles is that they were handmade. They were built one at a time and required a tremendous amount of labor to handcraft the parts and painstakingly adjust them until they fit together. Repairs were expensive and time-consuming as well, since the parts had to be made and fitted by hand.

Twenty-five years later, a Model T from Ford cost $260 and median annual incomes had risen to over $1500. Within a single generation, manufactured transportation had gone from an impossible luxury to widely affordable.

Several manufacturing innovations were behind this remarkable change, including the moving assembly line, specialization of workers, and interchangeable parts. Later, automated machinery would replace the tedious, dangerous work that humans performed, further reducing costs, increasing quality, and expanding variety.

Applying these same principles - automation, specialization, and interchangeable parts - to the creation and sharing of information delivers the same kind of benefits. Whether you're planning to automate manufacturing or publishing, one of the keys is to design interchangeable parts that you can be confident will fit together easily when the time comes to assemble them.

That's where XML and data models come in. A data model may be based on a DTD (Document Type Definition) or on a Schema, which was invented more recently (see the article at www.arbortext.com/resources/xpn_june_03.html#tech for a comparison between DTDs and Schemas). Either way, the data model does for documents what design drawings do for interchangeable parts.

The data model describes all of the parts of a document along with the rules for how those parts may be combined. When you follow these rules as you create documents, software programs can automatically manipulate the documents later. Data models serve as the foundation of XML-based applications: all of the functionality in an XML publishing system rests on the data model. In most cases, if the data model changes, something else in the system has to change as well.

If you look at a data model in its raw form, whether it's a DTD or a Schema, it looks scary. So we won't look. Instead, let's consider an abstract and highly simplified view in Figure 1.

You can see that a data model is like an organizational chart. The data model describes not only the parts (which we call "elements") of a document, such as chapter and section, but also the hierarchy (for example, a section always comes within a chapter).

You can also see that the data model prescribes the order of the elements. In the example above, the main parts of a Document are Foreword, Body, and Appendix, and they must come in that order.

Data models also prescribe how many times an element can appear, such as "exactly one," which you would want for the title of a chapter; "at least two," which you would want for the items in a list; or "any number," which you would want for paragraphs.
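
For readers who would like a small peek under the hood anyway, here is a minimal sketch in Python of the three kinds of rules just described: hierarchy, order, and number of occurrences. It is not a real DTD or Schema. The element names, the DATA_MODEL table, and the check function are all invented for this illustration, and a real data model can express far more than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical, highly simplified data model (not a real DTD or Schema).
# For each parent element: its children in the required order, each with
# a minimum and a maximum count (None means "any number").
DATA_MODEL = {
    "document": [("foreword", 1, 1), ("body", 1, 1), ("appendix", 1, 1)],
    "body": [("chapter", 1, None)],
    "chapter": [("title", 1, 1), ("section", 0, None)],
    "section": [("title", 1, 1), ("para", 0, None), ("list", 0, None)],
    "list": [("item", 2, None)],
}


def check(element, errors=None):
    """Collect rule violations for an element and all of its descendants."""
    if errors is None:
        errors = []
    rules = DATA_MODEL.get(element.tag)
    if rules is None:
        # Elements without rules (title, para, item, ...) are treated as
        # plain text containers in this sketch and are not checked further.
        return errors
    children = list(element)
    pos = 0
    for name, low, high in rules:
        # Count how many children with this name appear at this position.
        count = 0
        while pos < len(children) and children[pos].tag == name:
            count += 1
            pos += 1
        if count < low or (high is not None and count > high):
            expected = f"at least {low}" if high is None else f"{low} to {high}"
            errors.append(
                f"<{element.tag}>: found {count} <{name}>, expected {expected}"
            )
    if pos < len(children):
        # Anything left over is either unknown or in the wrong place.
        errors.append(
            f"<{element.tag}>: unexpected or out-of-order <{children[pos].tag}>"
        )
    for child in children:
        check(child, errors)
    return errors


VALID = """
<document>
  <foreword>Welcome.</foreword>
  <body>
    <chapter>
      <title>Getting Started</title>
      <section>
        <title>Overview</title>
        <para>First paragraph.</para>
        <list><item>One</item><item>Two</item></list>
      </section>
    </chapter>
  </body>
  <appendix>Glossary.</appendix>
</document>
"""

# Same parts, but the appendix comes before the body.
OUT_OF_ORDER = """
<document>
  <foreword>Welcome.</foreword>
  <appendix>Glossary.</appendix>
  <body><chapter><title>Getting Started</title></chapter></body>
</document>
"""

if __name__ == "__main__":
    print(check(ET.fromstring(VALID)))  # prints []
    for problem in check(ET.fromstring(OUT_OF_ORDER)):
        print(problem)
```

The point of the sketch is that the rules live in one table, separate from the documents. Any program that knows the table can tell whether a document will "fit," which is what makes the parts interchangeable.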

Gee, this seems pretty easy, doesn't it? That's only because we left out a lot of detail. So far, we have shown examples only of the organizational elements of a document, such as chapter and section. There are many reasons for capturing information in separate elements, such as:

  • Organization (chapter and section) - We have already seen elements of this type, which prescribe the basic structure of the document. Documents may be books, articles, catalogs, datasheets, and so on, and each of these typically has some unique characteristics in its structure. For example, a book may have chapter elements while a catalog may have price elements.
  • Formatting (emphasis) - Many elements exist only so that their content can be formatted differently from the text around it. For example, because emphasized words usually appear in italics, we designate an element for them such as emphasis. To provide a contrasting example, in virtually all cases we do not capture nouns or verbs as separate elements because we do not do anything different with them - they look the same as any other word in a sentence.

    Even though XML separates content from formatting so that your information exists independent of the way it's presented, you must consider your formatting goals while you design your data model. Too many times, we have seen organizations finalize their data models only to find that their stylesheets become very complex and expensive to develop and maintain, or they have to spend more time and money to revise their data models later, or they fail to meet all of their formatting design objectives.

  • Reuse (topic) - One of the most important benefits to be gained from an XML publishing system is the capability to reuse information in multiple documents. Approaches to reuse have varied widely, but one emerging best practice is to reuse information at a "topic" level rather than at a chapter or section level. Whether this works for you depends heavily on the specifics of your application, so this is a prime area for expert assistance.
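
To make the formatting and reuse ideas a little more concrete, here is a minimal sketch in Python. It assumes a tiny library of hypothetical topics (the topic ids, their text, and the names TOPICS, STYLE, assemble, and to_html are all invented for this illustration). Each topic is written once and then assembled into two different documents, and a small table stands in for a stylesheet by mapping each element to an HTML tag. A real publishing system would use real stylesheets and a far richer data model.

```python
import copy
import html
import xml.etree.ElementTree as ET

# Hypothetical topic library: each topic is written once and stored by id.
TOPICS = {
    "install": ET.fromstring(
        "<topic id='install'><title>Installation</title>"
        "<para>Place the unit on a level surface.</para></topic>"
    ),
    "safety": ET.fromstring(
        "<topic id='safety'><title>Safety</title>"
        "<para>Unplug the unit <emphasis>before</emphasis> servicing it.</para>"
        "</topic>"
    ),
    "repair": ET.fromstring(
        "<topic id='repair'><title>Repair</title>"
        "<para>Replace the fuse with one of the same rating.</para></topic>"
    ),
}

# Stand-in for a stylesheet: formatting is decided here, not in the content.
STYLE = {
    "document": "div",
    "topic": "section",
    "title": "h2",
    "para": "p",
    "emphasis": "em",  # emphasized words come out in italics
}


def assemble(doc_title, topic_ids):
    """Build a document by copying the requested topics from the library."""
    doc = ET.Element("document")
    ET.SubElement(doc, "title").text = doc_title
    for topic_id in topic_ids:
        # Copy the topic so the library version stays untouched.
        doc.append(copy.deepcopy(TOPICS[topic_id]))
    return doc


def to_html(element):
    """Render an element and its children using the STYLE table."""
    tag = STYLE[element.tag]
    parts = [html.escape(element.text or "")]
    for child in element:
        parts.append(to_html(child))
        # Text that follows a child element belongs to the parent.
        parts.append(html.escape(child.tail or ""))
    return f"<{tag}>{''.join(parts)}</{tag}>"


# The same safety topic appears in both documents without being rewritten.
user_guide = assemble("User Guide", ["install", "safety"])
service_manual = assemble("Service Manual", ["safety", "repair"])

if __name__ == "__main__":
    print(to_html(user_guide))
    print(to_html(service_manual))
```

Changing the "emphasis" entry in the STYLE table would change how every emphasized word looks in every document, without touching the topics themselves. That is the separation of content from formatting described above, and it is also why formatting goals need to be considered while the data model is being designed.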



About PG Bartlett
PG Bartlett is vice president of product marketing at Arbortext, where he is responsible for corporate positioning, marketing strategy, and product direction. Bartlett joined Arbortext in 1994, bringing more than 18 years of experience in both technical and marketing positions at leading-edge high technology companies. He is a frequent presenter at major industry events and has been invited to speak and chair sessions at Comdex, Seybold Seminars, XML conferences, AIIM conferences, and others.
