XML Schema Best Practices

In the June issue of XML-Journal I mentioned that we need a set of best practices that rein in the complexities of XML Schema. The set offered at www.xfront.com is a great start, but it caters to the XML Schema extremists, and I'd like to modify it, offering some alternative best practices for "the rest of us."

You'll have to refer to www.xfront.com/BestPracticesHomepage.html to get a full description of the issues discussed below. The following are the ground rules I used to build my "modified" best practices list:

  • The Over 10 Page Rule (acronym: O10P Rule): Any "Best Practice" that takes more than 10 pages to describe shouldn't be a best practice.
  • The Safe and Sane Use of Namespaces Rule (acronym: SASUONS Rule): This rule is applied as needed to maintain the sanity of the schema developer with respect to the use of namespaces.

Best Practice #1
Issue: When should a schema be designed to hide (localize) within the schema the namespaces of the elements and attributes it is using, versus when should it be designed to expose the namespaces in instance documents?

Well, this best practice wins a prize, in that it triggers BOTH the O10P rule AND the SASUONS Rule!
Conclusion: Always use elementFormDefault="qualified" (and attributeFormDefault="unqualified") in your schemas. It's the only sane way to go.
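
As a minimal sketch of what this conclusion looks like in practice, the schema below sets both defaults explicitly. The namespace URI and the element and attribute names are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical vocabulary: the namespace URI and all names are illustrative -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/library"
           xmlns:lib="http://example.com/library"
           elementFormDefault="qualified"
           attributeFormDefault="unqualified">

  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <!-- Local element: namespace-qualified because of elementFormDefault -->
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>
      <!-- Local attribute: unqualified because of attributeFormDefault -->
      <xs:attribute name="isbn" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A conforming instance can then put every element in the target namespace with a single default namespace declaration, while the attribute carries no prefix:

```xml
<book xmlns="http://example.com/library" isbn="0-000-00000-0">
  <title>Example Title</title>
</book>
```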

Best Practice #2
Issue: When should an element or type be declared global, versus when should it be declared local?

Here's a case where XML Schema gives us too many choices, making it too confusing, without really giving any bang for the buck. My recommendation is to always declare elements (and attributes) locally, and always declare types globally. The exception is that the root element must be declared globally.
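
One way this recommendation might look is sketched below: a single global element for the root, named global types, and every other element and attribute declared locally inside those types. The "orders" vocabulary and all of its names are assumptions made up for the example.

```xml
<?xml version="1.0"?>
<!-- Hypothetical "orders" vocabulary, used only to illustrate the layout -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/orders"
           xmlns:o="http://example.com/orders"
           elementFormDefault="qualified">

  <!-- The only global element: the root -->
  <xs:element name="order" type="o:OrderType"/>

  <!-- Global types; the elements and attributes inside them are local -->
  <xs:complexType name="OrderType">
    <xs:sequence>
      <xs:element name="customer" type="o:CustomerType"/>
      <xs:element name="item" type="o:ItemType" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:string" use="required"/>
  </xs:complexType>

  <xs:complexType name="CustomerType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="ItemType">
    <xs:sequence>
      <xs:element name="sku" type="xs:string"/>
      <xs:element name="quantity" type="xs:positiveInteger"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

Because only order is global, it is the only element that can appear as the root of a valid instance document.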

Best Practice #3
Issue: When should an item be declared as an element versus when should it be defined as a type?

This best practice needs to be removed from the list. Elements and types are disjoint schema components. You need to declare an element when you need to declare an element! That is, an element declaration is needed for every element found in the instance document. You need to declare a type when you need to declare a type. You need to declare a type when you are constructing content models.

Best Practice #4
Issue: In a project where multiple schemas are created, should we give each one a different targetNamespace, or should we give all the schemas the same targetNamespace, or should some of them have no targetNamespace?

This definitely triggers the SASUONS Rule, just by the description above.
Conclusion: Give each vocabulary a separate targetNamespace, except when you benefit by breaking down a very large vocabulary into multiple physical schema documents.
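
The exception in this conclusion can be sketched with xs:include, which pulls in another schema document that shares the same targetNamespace. The file names and the "orders" vocabulary here are hypothetical. A separate vocabulary with its own targetNamespace would be brought in with xs:import instead.

```xml
<?xml version="1.0"?>
<!-- orders.xsd: hypothetical main document for the "orders" vocabulary -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/orders"
           xmlns:o="http://example.com/orders"
           elementFormDefault="qualified">

  <!-- Same vocabulary, second physical document: the included schema
       must have this same targetNamespace (or none at all) -->
  <xs:include schemaLocation="order-types.xsd"/>

  <xs:element name="order" type="o:OrderType"/>
</xs:schema>
```

```xml
<?xml version="1.0"?>
<!-- order-types.xsd: hypothetical second document, same targetNamespace -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/orders"
           xmlns:o="http://example.com/orders"
           elementFormDefault="qualified">

  <xs:complexType name="OrderType">
    <xs:sequence>
      <xs:element name="customerName" type="xs:string"/>
      <xs:element name="total" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```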

Best Practice #5
Issue: What's the best practice for implementing a container element that's to be composed of variable content?

This best practice triggers the O10P Rule. Again, XML Schema is just too complex. I'm against using substitution groups, abstract elements, xsi:type, and complexType inheritance. I just think they add too much confusion to schema development, and aren't worth the pain.
Conclusion: Go with the proposed method 2, the <choice> element, and I'd throw in the liberal use of model groups. Container elements can be created that reference (reuse) model groups, combined together via the <choice> element.
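
A rough sketch of such a container follows. It assumes a hypothetical catalogue element whose content varies between two named model groups. The group and element names are invented for the example, and this is only one plausible reading of the conclusion.

```xml
<?xml version="1.0"?>
<!-- Hypothetical "catalogue" vocabulary; all names are illustrative -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/catalogue"
           xmlns:c="http://example.com/catalogue"
           elementFormDefault="qualified">

  <!-- Reusable model groups, one per kind of content -->
  <xs:group name="BookGroup">
    <xs:sequence>
      <xs:element name="bookTitle" type="xs:string"/>
      <xs:element name="isbn" type="xs:string"/>
    </xs:sequence>
  </xs:group>

  <xs:group name="MagazineGroup">
    <xs:sequence>
      <xs:element name="magazineTitle" type="xs:string"/>
      <xs:element name="issue" type="xs:positiveInteger"/>
    </xs:sequence>
  </xs:group>

  <!-- Container: any number of entries, each matching one of the groups.
       The groups start with differently named elements, so the choice
       stays deterministic -->
  <xs:element name="catalogue">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <xs:group ref="c:BookGroup"/>
        <xs:group ref="c:MagazineGroup"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Supporting a new kind of content then means defining another group and adding one more reference inside the choice, with no substitution groups, abstract elements, or xsi:type involved.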

Best Practice #6
Issue: Should you design your schemas to build type hierarchies (design by subclassing), or should you design them to aggregate components (design by composition)?

I agree with the conclusion at xfront.com here: design by composition is the preferred approach. However, I'd add a recommendation to use XML Schema model groups as the preferred way to design a content model via composition. ComplexType inheritance is overused and broken. Use simpleType as needed.

Best Practice #7
Issue: What's the best practice for creating extensible content models?

I like the question, but don't like the answers proposed at xfront.com for this one. There are two proposals: (1) use complexType inheritance and xsi:type, and (2) use the <any> element. I don't like either complexType inheritance or xsi:type. Use of the <any> element, as discussed at xfront.com, has nondeterminism problems.
Conclusion: Use XML Schema model (and attribute) groups for creating extensible content models. These offer extensible content models for elements and attributes without any of the problems associated with complexType inheritance or xsi:type.
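
One way to read this conclusion is sketched below: the shared content lives in a model group and an attribute group, and a richer type is built by referencing both and adding to them, rather than by deriving from a base complexType. The "staff" vocabulary and every name in it are assumptions made for the example.

```xml
<?xml version="1.0"?>
<!-- Hypothetical "staff" vocabulary; all names are illustrative -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/staff"
           xmlns:p="http://example.com/staff"
           elementFormDefault="qualified">

  <!-- Shared element content -->
  <xs:group name="PersonCore">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="email" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:group>

  <!-- Shared attributes -->
  <xs:attributeGroup name="PersonAttrs">
    <xs:attribute name="id" type="xs:ID" use="required"/>
  </xs:attributeGroup>

  <!-- Basic type: just the shared pieces -->
  <xs:complexType name="PersonType">
    <xs:group ref="p:PersonCore"/>
    <xs:attributeGroup ref="p:PersonAttrs"/>
  </xs:complexType>

  <!-- "Extended" type: reuses the same groups and adds to them,
       with no xs:extension and no xsi:type in instance documents -->
  <xs:complexType name="EmployeeType">
    <xs:sequence>
      <xs:group ref="p:PersonCore"/>
      <xs:element name="department" type="xs:string"/>
    </xs:sequence>
    <xs:attributeGroup ref="p:PersonAttrs"/>
    <xs:attribute name="grade" type="xs:string"/>
  </xs:complexType>

  <!-- Root element with locally declared children -->
  <xs:element name="staff">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="person" type="p:PersonType"
                    minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="employee" type="p:EmployeeType"
                    minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The trade-off of this style is that PersonType and EmployeeType are unrelated as far as the schema processor is concerned. The reuse happens at the group level, which is exactly what keeps type substitution out of the picture.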

* * *
These modified best practices should enable you to design more understandable XML Schemas.

More Stories By Tom Gaven

Tom Gaven lives in northern Virginia, and has developed and delivered training on many different technologies. He has authored over 30 courses, including Assembler, C, C++, Java, OS/2, and Windows. He also authored MindQ's Developer Training for Java program. In the last 2 years, he has been architecting and developing products with XML, XSLT, XML Schema, RELAX NG, Java, and Schematron. Tom is currently working on tools and courseware to make XML easier to use. See http://www.xmldistilled.com for more information.
