An Easy Introduction to XML Publishing - Part 3 of a Five-Part Series
Developing a new publishing system

In Part 1 of this series we discussed some of the key problems of capturing and sharing information and in Part 2 we looked at the critical components of a solution: modularization, automation, and XML.

In Part 3, we start getting technical - but in a nontechnical way. We examine the essential parts of building a solution: developing data models (which are either DTDs or Schemas), designing stylesheets, and integrating the various components of the solution.

Data Models? Why Would I Care About Data Models?
We're fond of history lessons in this column, so let's go back to the year 1900. At that time, the average worker in the United States earned about $500 a year and a bicycle cost $600. No wonder the men on old-fashioned bicycles wore a top hat and tails - you had to be rich to own one!

The reason for the high cost of bicycles is that they were handmade. They were built one at a time and required a tremendous amount of labor to handcraft the parts and painstakingly adjust them until they fit together. Repairs were expensive and time-consuming as well, since the parts had to be made and fitted by hand.

Twenty-five years later, a Model T from Ford cost $260 and median annual incomes had risen to over $1500. Within a single generation, manufactured transportation had gone from an impossible luxury to widely affordable.

Several manufacturing innovations were behind this remarkable change, including the moving assembly line, specialization of workers, and interchangeable parts. Later, automated machinery would replace the tedious, dangerous work that humans performed, further reducing costs, increasing quality, and expanding variety.

Applying these same principles - automation, specialization, and interchangeable parts - to the creation and sharing of information delivers the same kind of benefits. Whether you're planning to automate manufacturing or publishing, one of the keys is to design interchangeable parts that you can be confident will fit together easily when the time comes to assemble them.

That's where XML and data models come in. A data model may be based on a DTD (which stands for Document Type Definition) or on a Schema, which was invented more recently than DTDs; see the article at www.arbortext.com/resources/xpn_june_03.html#tech for a comparison between the two. Either way, the data model does for documents what design drawings do for interchangeable parts.

The data model describes all of the parts of a document along with the rules for how those parts may be combined. When you follow these rules as you create documents, software programs can automatically manipulate the documents later. Data models serve as the foundation of XML-based applications: all of the functionality in an XML publishing system rests on the data model. In most cases, if the data model changes, something else has to change as well.

If you look at a data model in its raw form, whether it's a DTD or a Schema, it looks scary. So we won't look. Instead, let's consider an abstract and highly simplified view in Figure 1.

You can see that a data model is like an organizational chart. The data model describes not only the parts (which we call "elements") of a document, such as chapter and section, but also the hierarchy (for example, a section always comes within a chapter).

You can also see that the data model prescribes the order of the elements. In the example in Figure 1, the main parts of a Document are Foreword, Body, and Appendix, and they must come in that order.

Data models also prescribe how many times an element can appear, such as "exactly one," which you would want for the title of a chapter; "at least two," which you would want for the items in a list; or "any number," which you would want for paragraphs.
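
For readers who would like to see these three kinds of rules (hierarchy, order, and count) in action, here is a minimal sketch in Python. It is not a DTD or a Schema, and it is not how any particular XML product works. The element names, the DATA_MODEL table, and the check function are all made up for illustration, and the model is far simpler than a real one would be.

```python
import xml.etree.ElementTree as ET

# Hypothetical data model (an assumption for illustration, not a real DTD).
# For each parent element: an ordered list of (child, minimum, maximum).
# A maximum of None means "any number".
DATA_MODEL = {
    "document": [("foreword", 1, 1), ("body", 1, 1), ("appendix", 1, 1)],
    "body": [("chapter", 1, None)],
    "chapter": [("title", 1, 1), ("para", 0, None), ("list", 0, None)],
    "list": [("item", 2, None)],
}


def check(element):
    """Return a list of rule violations in element and its descendants."""
    problems = []
    rules = DATA_MODEL.get(element.tag)
    if rules is not None:
        children = [child.tag for child in element]
        position = 0
        for name, minimum, maximum in rules:
            count = 0
            # Consume the run of children that match the current rule.
            while position < len(children) and children[position] == name:
                count += 1
                position += 1
            if count < minimum:
                problems.append(
                    f"<{element.tag}>: expected at least {minimum} <{name}> here, found {count}"
                )
            elif maximum is not None and count > maximum:
                problems.append(
                    f"<{element.tag}>: allows at most {maximum} <{name}>, found {count}"
                )
        if position < len(children):
            problems.append(
                f"<{element.tag}>: <{children[position]}> is unexpected or out of order"
            )
    # Elements without an entry in DATA_MODEL are treated as leaves.
    for child in element:
        problems.extend(check(child))
    return problems


GOOD = (
    "<document><foreword/><body><chapter><title>Intro</title>"
    "<para>One</para><para>Two</para>"
    "<list><item>a</item><item>b</item></list>"
    "</chapter></body><appendix/></document>"
)

# Same structure, but the chapter has no title and the list has one item.
BAD = (
    "<document><foreword/><body><chapter>"
    "<para>No title here.</para>"
    "<list><item>only one</item></list>"
    "</chapter></body><appendix/></document>"
)

for label, text in (("GOOD", GOOD), ("BAD", BAD)):
    found = check(ET.fromstring(text))
    print(label, "follows the rules" if not found else "breaks the rules:")
    for problem in found:
        print("  ", problem)
```

Because the rules live in one table, a program can check every document the same way, which is what lets software rely on the parts fitting together later.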

Gee, this seems pretty easy, doesn't it? That's only because we left out a lot of detail. So far, we have shown examples only of the organizational elements of a document, such as chapter and section. There are many reasons for capturing information in separate elements, such as:

  • Organization (chapter and section) - We have already seen elements of this type, which prescribe the basic structure of the document. Documents may be books, articles, catalogs, datasheets, and so on, and each of these typically has some unique characteristics in its structure. For example, a book may have chapter elements while a catalog may have price elements.
  • Formatting (emphasis) - Many elements exist only so that their content can be formatted differently from the surrounding text. For example, because emphasized words usually appear in italics, we designate an element for them such as emphasis. To provide a contrasting example, in virtually all cases we do not capture nouns or verbs as separate elements because we do not do anything different with them - they look the same as any other word in a sentence.

    Even though XML separates content from formatting so that your information exists independent of the way it's presented, you must consider your formatting goals while you design your data model. Too many times, we have seen organizations finalize their data models only to find that their stylesheets become very complex and expensive to develop and maintain, or they have to spend more time and money to revise their data models later, or they fail to meet all of their formatting design objectives.

  • Reuse (topic) - One of the most important benefits to be gained from an XML publishing system is the capability to reuse information in multiple documents. Approaches to reuse have varied widely, but one emerging best practice is to reuse information at a "topic" level rather than at a chapter or section level. Whether this works for you depends heavily on the specifics of your application, so this is a prime area for expert assistance.
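
To make the formatting and reuse ideas a little more concrete, here is a small Python sketch under some loud assumptions: the topic library, the document names, and the functions render_inline and assemble are all hypothetical, and a real publishing system would use stylesheets rather than a hand-written function. The sketch shows two documents built from the same topics, with the emphasis element turned into whatever markers the output calls for.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic library: each topic is a small, self-contained fragment.
TOPICS = {
    "safety": "<topic><title>Safety</title>"
              "<para>Always <emphasis>unplug</emphasis> the unit first.</para></topic>",
    "install": "<topic><title>Installation</title>"
               "<para>Mount the unit on a flat surface.</para></topic>",
    "warranty": "<topic><title>Warranty</title>"
                "<para>Coverage lasts <emphasis>one year</emphasis>.</para></topic>",
}

# Two hypothetical documents reuse topics by name instead of copying text.
DOCUMENTS = {
    "user-guide": ["safety", "install", "warranty"],
    "datasheet": ["install", "warranty"],
}


def render_inline(element, emphasis_open, emphasis_close):
    """Flatten an element's text, wrapping emphasis in the given markers."""
    parts = [element.text or ""]
    for child in element:
        inner = render_inline(child, emphasis_open, emphasis_close)
        if child.tag == "emphasis":
            inner = emphasis_open + inner + emphasis_close
        parts.append(inner)
        parts.append(child.tail or "")
    return "".join(parts)


def assemble(document_name, emphasis_open="<i>", emphasis_close="</i>"):
    """Build one document from its topics; formatting is chosen at output time."""
    lines = []
    for topic_id in DOCUMENTS[document_name]:
        topic = ET.fromstring(TOPICS[topic_id])
        lines.append(topic.findtext("title"))
        for para in topic.findall("para"):
            lines.append(render_inline(para, emphasis_open, emphasis_close))
    return "\n".join(lines)


print(assemble("datasheet"))              # emphasis as <i>...</i>
print(assemble("user-guide", "*", "*"))   # same topics, plain-text emphasis
```

Changing the warranty topic once changes it in both documents, and the content itself never says "italics"; that decision is made only when the output is produced.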
About PG Bartlett
PG Bartlett is vice president of product marketing at Arbortext, where he is responsible for corporate positioning, marketing strategy, and product direction. Bartlett joined Arbortext in 1994, bringing more than 18 years of experience in both technical and marketing positions at leading-edge high technology companies. He is a frequent presenter at major industry events and has been invited to speak and chair sessions at Comdex, Seybold Seminars, XML conferences, AIIM conferences, and others.
