RELAX NG: The Power Is in the Patterns
By: Tom Gaven
Schema languages allow you to specify the structure of XML instance documents. They describe the allowed names of elements and attributes, and they let you specify element ordering, occurrence, and allowed content, such as simple text or datatypes like integers. Some examples of schema languages are W3C XML Schema, RELAX NG, Schematron, and DTD.

RELAX NG (see www.relaxng.org) is an XML schema language that is considered to be simple, yet powerful. It differs from other schema languages in that it's built around the concept of patterns, and that is where its power lies. This article gives an overview of that concept. To understand the power of RELAX NG, you must first understand the basic RELAX NG patterns and how they can be combined.

Let's begin by taking a look at the following XML instance document:

<document pages="1" year="2002">
  <title>Patterns are fun</title>
  <author>Tom Gaven</author>
  <para>This is the <b>first</b> paragraph.</para>
  <para>And this is the <b>second</b>.</para>
  <prints>5 60 45</prints>
</document>

This XML instance document contains elements and text. To give you an example of the RELAX NG patterns found in this XML instance document, we could say (ignoring attributes) that the pattern for the document element would be element title, followed by element author, followed by one or more para elements. The pattern for the para element would be a mixture of text and element b. The title, author, and b elements all share the same pattern: text.

Listing 1 is a complete RELAX NG schema document that shows this pattern syntax. The syntax used in the listing is the compact syntax for RELAX NG. This schema can be used to validate the instance document. Line 1 is a top-level declaration that specifies we're using XML Schema datatypes tied to the "xsd" prefix.
The keyword "start" specifies the root element of the instance document. (The words datatypes, start, element, attribute, text, and list are all keywords.) RELAX NG's compact syntax uses the same occurrence symbols as DTDs (+, ?, *). The RELAX NG patterns are found to the right of the equals sign (=); definition names are to the left. Note that patterns can be combined, as in the pattern "pages, year, title, author, para+". The comma specifies an ordering of the patterns (for attributes, which are unordered in XML, it imposes no order). You can also group (or nest) patterns using parentheses. The ( text | b )* pattern specifies zero or more occurrences of text or b elements, in any order.
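Listing 1 itself is not reproduced in this copy of the article. As a hedged reconstruction, the sketch below is a compact-syntax schema that is consistent with the line-by-line description in the text: line 1 declares the datatypes, line 2 is the start, and lines 3-10 are element and attribute definitions. The definition names doc and prints, and the inclusion of prints in the document pattern so that the instance document validates, are assumptions; the original listing may differ.

```rnc
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
start = doc  # "doc" is an assumed name for the root definition
doc = element document { pages, year, title, author, para+, prints }  # prints is assumed
pages = attribute pages { xsd:integer }
year = attribute year { "2002" }
title = element title { text }
author = element author { text }
para = element para { (text | b)* }
b = element b { text }
prints = element prints { list { xsd:integer+ } }
```

A validator that reads the compact syntax can check the instance document against a schema like this one; with Jing, for example, the -c option selects compact syntax.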
Patterns
P1. element nameClass { pattern }
P2. attribute nameClass { pattern }
These two patterns are recursive in that they allow other nested patterns. A nameClass is another great feature of RELAX NG, but for now just think of the nameClass as simply the name of the element or attribute. In the schema document, lines 3-10 all use the element or attribute pattern.
Datatype patterns
Pattern P3 is used on lines 4 and 10 in the schema document. It allows you to specify the datatype allowed for the content of an element or attribute. Pattern P4 is used on line 5. The text "2002" is the DatatypeValue field. This specifies that the year attribute must have the value "2002". Pattern P5 is the text pattern. The keyword "text" specifies that text content is allowed.
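The three datatype patterns can be seen side by side in a small compact-syntax fragment. This is only a sketch: the definition, element, and attribute names (edition, status, note) are hypothetical and do not come from the article's listing.

```rnc
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"

# Datatype name (P3 in the text): the value must be a valid xsd:positiveInteger.
edition = attribute edition { xsd:positiveInteger }

# Datatype value (P4 in the text): the value must match the literal "final".
status = attribute status { "final" }

# Text (P5 in the text): any character content is allowed.
note = element note { text }
```

The fragment has no start definition, so it would need to be referenced from a complete grammar before it could be used for validation.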
Ref (Reference) patterns
Pattern P6 allows patterns to reference definitions, which in turn define other patterns. The ref pattern is used in the schema document in Listing 1 in lines 2, 3, and 8. For example, the "b" reference on the right side of line 8 references the definition (b) on the left side of line 9. Pattern P7 allows patterns to be nested inside parentheses (see line 8 in the schema).
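As an illustration of references and parentheses working together, here is a minimal sketch of a grammar in which a definition refers to itself. The section element is hypothetical and is not part of the article's example; the para and b definitions follow the forms described in the text.

```rnc
start = section

# A section holds a title, then any mix of paragraphs and nested sections.
# The reference to "section" inside its own definition is legal because the
# recursion passes through an element pattern.
section = element section { title, (para | section)* }

title = element title { text }
para = element para { (text | b)* }
b = element b { text }
```

Every bare name on the right side of an equals sign here is a reference to the definition of the same name on the left side of another line, and the parentheses group a choice so that the * applies to the whole group.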
List pattern
The list pattern allows lists of content (separated by white space) to be specified. Line 10 in the schema specifies that the prints element can contain a "list of integers."
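A few variations on the list pattern are sketched below. The prints definition follows the description in the text; the point and flags elements are hypothetical and are included only to show that the pattern inside list can be a sequence or a choice of values as well as a repeated datatype.

```rnc
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"

# One or more whitespace-separated integers, such as "5 60 45".
prints = element prints { list { xsd:integer+ } }

# Exactly two decimals, such as "12.5 -3.0" (hypothetical element).
point = element point { list { xsd:decimal, xsd:decimal } }

# Zero or more tokens drawn from a fixed set (hypothetical element).
flags = element flags { list { ("draft" | "final" | "public")* } }
```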
More patterns
Power in Combinations
Combined patterns
Element x contains either a list of integers or an element b.
x = element x { "1.0" | "NaN" | num }
Element x contains either the string "1.0", the string "NaN", or a num element.
x = element x { (a,b)|(a,c) }
Element x contains (element a and attribute b) OR (element a and attribute c).
RELAX NG patterns are quite powerful, yet easy to learn and use. With their ability to combine, they offer capability not found in either DTDs or XML Schema.
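The combined patterns described above can be written out together as a compact-syntax fragment. This is a sketch under stated assumptions: the definition names (x1, x2, x3, attB, attC) and the supporting definitions are hypothetical, introduced only so that the three combinations can sit side by side, and the first definition is one plausible reading of "a list of integers or an element b" rather than the article's own line.

```rnc
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"

# Hypothetical supporting definitions (not from the article).
a    = element a { text }
b    = element b { text }
num  = element num { xsd:double }
attB = attribute b { text }
attC = attribute c { text }

# Either a whitespace-separated list of integers or a single b element.
x1 = element x { list { xsd:integer+ } | b }

# The literal "1.0", the literal "NaN", or a num element.
x2 = element x { "1.0" | "NaN" | num }

# Element a together with attribute b, or element a together with attribute c.
x3 = element x { (a, attB) | (a, attC) }
```

In x3 the choice ranges over groups that mix an element with an attribute, so an x element may carry attribute b or attribute c, but not both.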