RELAX NG: The Power Is in the Patterns

Schema languages are languages that allow you to specify the structure of XML instance documents. RELAX NG (see www.relaxng.org) is an XML schema language that is considered to be simple, yet powerful. This article gives an overview of an important concept of the RELAX NG schema language called patterns. The power of RELAX NG can be found in its patterns.

Schema languages also describe the allowed names of elements and attributes that are found in XML instance documents. And they allow you to specify element ordering, occurrence, and allowed content, like simple text, or datatypes, like integers. Some examples of schema languages are W3C XML Schema, RELAX NG, Schematron, and DTD.

RELAX NG differs from other schema languages in that it's built around the concept of patterns. To understand the power of RELAX NG, you must first understand the basic RELAX NG patterns and how they can be combined.

Let's begin by taking a look at the following XML instance document:

<document pages="1" year="2002">
  <title>Patterns are fun</title>
  <author>Tom Gaven</author>
  <para>
    This is the <b>first</b> paragraph.
  </para>
  <para>
    And this is the <b>second</b>.
  </para>
  <prints>5 60 45</prints>
</document>

This XML instance document contains elements and text. To give you an example of the RELAX NG patterns found in this XML instance document, we could say (ignoring attributes) that the pattern for the document element would be element title, followed by element author, followed by one or more para elements, followed by element prints. The pattern for the para element would be a mixture of text and element b. The title, author, and b elements all have a similar pattern, text.

Listing 1 is a complete RELAX NG schema document that shows this pattern syntax, written in RELAX NG's compact syntax. The schema can be used to validate the instance document above. Line 1 is a top-level declaration that binds the XML Schema datatype library to the "xsd" prefix. The keyword "start" specifies the pattern for the root element of the instance document. (The words datatypes, start, element, attribute, text, and list are all keywords.) The compact syntax uses the same occurrence symbols as DTDs: + (one or more), ? (optional), and * (zero or more).

The RELAX NG patterns are found to the right of the equals sign (=); definition names are to the left. Note that patterns can be combined, as in the pattern "pages, year, title, author, para+". The comma specifies that the patterns must match in the order given (attributes are the exception, since attribute order is never significant in XML). You can also group (or nest) patterns using parentheses. The ( text | b )* pattern specifies zero or more occurrences of text or b elements, in any order.
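
Listing 1 itself is not reproduced in this text. As a rough guide, a schema consistent with the line-by-line description in this article might look like the sketch below. It is a reconstruction, not the author's original listing: the definition name doc and the presence of prints in the document pattern are assumptions inferred from the prose and from the instance document. Each definition sits on its own line so that the line references in the article (line 1 through line 10) can be followed.

```rnc
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
start = doc
doc = element document { pages, year, title, author, para+, prints }
pages = attribute pages { xsd:integer }
year = attribute year { "2002" }
title = element title { text }
author = element author { text }
para = element para { ( text | b )* }
b = element b { text }
prints = element prints { list { xsd:integer+ } }
```

Under these assumptions, the instance document shown earlier should validate against this schema.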

Patterns
Element and attribute patterns

The first two patterns we'll discuss are the patterns for elements and attributes. Following is the (compact) syntax for the element and attribute patterns:

P1. element nameClass { pattern }
P2. attribute nameClass { pattern }

These two patterns are recursive in that they allow other patterns to be nested inside them. A nameClass is another great feature of RELAX NG, but for now just think of the nameClass as simply the name of the element or attribute. In the schema document, lines 3-10 all use the element or attribute pattern.
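
To make the nesting concrete, here is a minimal sketch of P1 and P2 written inline, without separate definitions. The vocabulary (book, isbn, chapter, and so on) is hypothetical and chosen only for illustration.

```rnc
# Hypothetical vocabulary, for illustration only.
start = element book {
  # P2: an attribute whose value is any text
  attribute isbn { text },
  # P1: an element whose content is any text
  element title { text },
  # P1 nested inside P1: each chapter carries its own attribute and child element
  element chapter {
    attribute number { text },
    element heading { text }
  }+
}
```

Because the pattern inside the braces can itself be an element or attribute pattern, a whole document structure can be written as one nested expression. Named definitions, as used in the article's schema, are usually easier to read and reuse.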

Datatype patterns
P3. datatypeName params?
P4. datatypeName? DatatypeValue
P5. "text"

Pattern P3 is used on lines 4 and 10 in the schema document. It allows you to specify the datatype allowed for the content of an element or attribute.

Pattern P4 is used on line 5. The text "2002" is the DatatypeValue field. This specifies that the year attribute must have the value "2002". Pattern P5 is the text pattern. The keyword "text" specifies that text content is allowed.
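
The three datatype patterns can be compared side by side in a small sketch. The report element and the status attribute are hypothetical names introduced for illustration; minInclusive is a facet of the XML Schema datatype library.

```rnc
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"

start = element report {
  # P3 with parameters: an integer restricted to 1 or greater
  attribute pages { xsd:integer { minInclusive = "1" } },
  # P4 with a datatype name: compared as an integer, so "02002" also matches
  attribute year { xsd:integer "2002" },
  # P4 without a datatype name: compared as a whitespace-normalized token
  attribute status { "draft" },
  # P5: any text content
  text
}
```

The optional datatype name in P4 matters because it decides how the value is compared: as a typed value when a datatype is named, or as a normalized string when it is not.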

Ref (Reference) patterns
P6. ref
P7. "(" pattern ")"

Pattern P6 allows patterns to reference definitions, which in turn define other patterns. The ref pattern is used in the schema document in Listing 1 in lines 2, 3, and 8. For example, the "b" reference on the right side of line 8 references the definition (b) on the left side of line 9. Pattern P7 allows patterns to be nested inside parentheses (see line 8 in the schema).
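
References can also point back to the definition that contains them, which is how nested or recursive structures are usually described. The following sketch uses hypothetical names (section, heading, para) that do not appear in the article's schema.

```rnc
# Hypothetical vocabulary, for illustration only.
start = section
# P6: heading, para, and section on the right side are all references.
# P7: the parentheses group the choice so that * applies to all of it.
section = element section { heading, ( para | section )* }
heading = element heading { text }
para = element para { text }
```

Here a section may contain further sections to any depth, because the reference to section appears inside the element pattern that defines it.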

List pattern
P8. list { pattern }

The list pattern allows lists of content (separated by white space) to be specified. Line 10 in the schema specifies that the prints element can contain a "list of integers."
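
The pattern inside list can be any combination of datatype patterns, so a list can have a fixed or a variable number of items. A minimal sketch, with hypothetical element names:

```rnc
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"

start = element shape { point, sizes }
# exactly two decimals, e.g. <point>1.5 2.0</point>
point = element point { list { xsd:decimal, xsd:decimal } }
# one or more integers, e.g. <sizes>5 60 45</sizes>
sizes = element sizes { list { xsd:integer+ } }
```

Each whitespace-separated token is matched against the pattern in order, so the occurrence symbols work inside a list just as they do for elements.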

More patterns
Overall, 14 different patterns are supported by RELAX NG, eight of which are discussed above. The other available patterns are mixed, empty, notAllowed, grammar, parent ref, and external ref. The first three allow more controls on content; the last three allow for file and grammar modularity.
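
As a brief illustration of two of the content-control patterns, the sketch below uses mixed and empty. The br element is a hypothetical addition; the grammar, parent ref, and external ref patterns are not shown.

```rnc
start = element doc { para+ }
# mixed { ... } is shorthand for text interleaved with the enclosed pattern,
# so this accepts the same content as ( text | b | br )*
para = element para { mixed { ( b | br )* } }
b = element b { text }
# empty: the element must have no content, e.g. <br/>
br = element br { empty }
```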

Power in Combinations
Individually, RELAX NG's patterns are powerful, but you can also combine them to handle more complex validation scenarios.

Combined patterns
x = element x { ( list { xsd:integer } | b ) }
b = element b { text }

Element x contains either a list of integers or an element b.

x = element x { "1.0" | "NaN" | num }
num = element num { text }

Element x contains either the string "1.0", the string "NaN", or a num element.

x = element x { (a,b)|(a,c) }
a = element a { text }
b = attribute b { text }
c = attribute c { text }

Element x contains (element a and attribute b) OR (element a and attribute c).
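
Because patterns compose freely, the same constraint can be written with the shared part factored out. The following sketch should accept the same documents as the fragment above; the start declaration is added here only to make the fragment a complete schema.

```rnc
start = x
# Element a is required; exactly one of attribute b or attribute c must appear.
x = element x { a, ( b | c ) }
a = element a { text }
b = attribute b { text }
c = attribute c { text }
```

Note that the choice here is between two attributes, and it sits in the same content model as an element.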

RELAX NG patterns are quite powerful, yet easy to learn and use. With their ability to combine, they offer capability not found in either DTDs or XML Schema.

About Tom Gaven
Tom Gaven lives in northern Virginia, and has developed and delivered training on many different technologies. He has authored over 30 courses, including Assembler, C, C++, Java, OS/2, and Windows. He also authored MindQ's Developer Training for Java program. In the last 2 years, he has been architecting and developing products with XML, XSLT, XML Schema, RELAX NG, Java, and Schematron. Tom is currently working on tools and courseware to make XML easier to use. See http://www.xmldistilled.com for more information.
