DTD Development Driving You Delirious?

No, the abbreviation DTD is not etymologically related to a similar abbreviation from medical science, namely, DTs (or delirium tremens), a violent delirium with tremors, which is induced by the prolonged use of alcohol. Though in absorbing the intricacies of DTDs and trying to develop your first one, you may begin to wonder whether the two terms are somehow connected.

Even if you've mastered the basic syntax of XML, writing your first document type definition can be brow-furrowing, not least because DTD syntax differs from XML syntax. This tutorial aims to ease you into DTD development without, I hope, giving you nightmares, let alone the DTs.

That said, you will no doubt need to read and study the cited references - and create a DTD or two on your own - before fully understanding the many nuances of DTDs. To launch you on your way to a complete understanding of DTDs, this month's and next month's columns will explain what document type definitions do and how they do it by guiding you step-by-step through the development of one for a hypothetical résumé.

This tutorial, however, is by no means a complete introduction to DTDs; instead, it seeks to get you started, to familiarize you with DTDs, and to point you in the right direction. After reading and studying this column and the resources to which I refer you, you should be ready to create DTDs for your own XML publishing projects.

Constraining Data with Rules
A document type definition lays out the underlying rules that constrain the data in an XML document, much as grammar supplies the tacit rules by which we link words into sentences. To constrain XML data, a DTD combines syntax and operators, all explained below, to form explicit rules that principally do the following:

  • Declare the elements that may appear in a document
  • Describe the legal content of an element
  • Specify the permissible children, if any, of each element
  • Set the order in which elements must appear
  • Generalize about the number of times each element may occur
  • Specify the kind of data a terminal element may contain
  • List the attributes that an element may take
  • Indicate whether an attribute is required or optional
  • Provide an easy way to substitute some characters for others

    Each of these tasks maps to an aspect of DTD syntax or to a DTD operator, which together form the building blocks of a DTD. In my next column, I will address attributes, entities, and some of the more advanced aspects of DTDs. This month, we'll look at the element-specific tasks and bring them to bear on a hypothetical résumé.

    Preconstruction Analysis
    Once you've analyzed a representative subset of the documents in your document set, diced them up into a suitable level of structured detail, defined your markup strategy as well as your tags and attributes, and applied them to several documents in your set (see my previous tutorial [XML-J, Vol. 2, issue 6] for more information), you're ready to begin creating your DTD, which will no doubt be an iterative process.

    You can create a DTD in one of two ways: (1) write it from scratch, by hand, as I demonstrate later, or (2) use a DTD-creation tool. XML Authority (free trial version available from TIBCO Extensibility at www.extensibility.com/solutions/trial.htm) simplifies DTD construction. It also checks for errors, a few of which you'll probably make in your first DTD, and it has a help section on best practices.

    Declaring Elements and Their Contents
    Let's say we want every résumé in our set to have the root element resume. To declare an element, you use the <!ELEMENT> declaration. Thus, to declare the resume element in the DTD, write:

    <!ELEMENT resume ANY>

    This statement declares that the resume element may contain, as the DTD keyword ANY indicates, any combination of text and declared child elements. The syntax of the <!ELEMENT> declaration is this:

    <!ELEMENT elementName {rule}>

    where {rule} may be replaced by one of the following: a DTD keyword in all capitals, such as ANY or EMPTY; a parentheses-enclosed list of child elements separated by commas, indicating that they must appear in the order specified; or a parentheses-enclosed list separated by vertical bars, indicating a choice among them. Don't panic; I'll show you examples of all this later.

    In our example above, the keyword ANY fills the rule slot and describes the legal content - the content model - of the resume element. In this case, that content can be anything from child elements to text.

    However, using ANY does little to constrain our data, defeating the purpose of our DTD, which is to make explicit the permissible relationships and associations among the data in a set of XML documents. Thus, in the rule slot where I've used the keyword ANY, better constraints can and should be put in place. Instead of simply saying that the resume element can take any combination of child elements and text, let's state exactly what children it may contain.
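To see how little ANY constrains, consider the following sketch of a complete document with its DTD embedded as an internal subset. The hobby element and the sample text are hypothetical, invented only for illustration, and (#PCDATA), which declares text-only content, is explained under Mixing Content later in this article. A validating parser should accept this document even though its content has no useful structure:

```xml
<?xml version="1.0"?>
<!DOCTYPE resume [
  <!-- ANY: the root may hold text and any declared elements, in any order -->
  <!ELEMENT resume ANY>
  <!ELEMENT name (#PCDATA)>
  <!-- hobby is a hypothetical element, not part of the article's markup -->
  <!ELEMENT hobby (#PCDATA)>
]>
<resume>
  Stray text is allowed here, followed by <hobby>kite flying</hobby>,
  then <name>Jane Doe</name> and even a second <name>J. Doe</name>.
</resume>
```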

    Specifying Children
    After analyzing and marking up a couple of résumés in the document set, I've settled on a firm, high-level structure for all the résumés I'm collecting for publication on a Web site. Every résumé must conform to a set pattern: it will have a root element called resume that contains the following child elements in the following order: name, contactInfo, experience, and education. Thus, in an XML document, I will structure the résumé's high-level data like this:

    <resume>
    <name>...</name>
    <contactInfo>...</contactInfo>
    <experience>...</experience>
    <education>...</education>
    </resume>

    To make this grammar explicit, I specify the child elements and their ordering with the following sequence:

    <!ELEMENT resume (name, contactInfo, experience, education) >

    This DTD rule says that the resume element contains the children enclosed in parentheses. The commas in the rule stipulate that the elements must appear in the order in which they are listed. No other child elements or text are permitted directly under the resume node. Such a rule, when related to the XML document in which the elements occur, is called a content model.
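As a minimal sketch, the sequence rule can be tried out in a self-contained document. The four child elements are declared here as text-only (#PCDATA) placeholders, which is an assumption made so that the example validates; their real content models are richer. The sample data is invented.

```xml
<?xml version="1.0"?>
<!DOCTYPE resume [
  <!-- The four children must appear exactly once each, in this order -->
  <!ELEMENT resume (name, contactInfo, experience, education)>
  <!-- Placeholder content models: an assumption made for this sketch -->
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT contactInfo (#PCDATA)>
  <!ELEMENT experience (#PCDATA)>
  <!ELEMENT education (#PCDATA)>
]>
<resume>
  <name>Jane Doe</name>
  <contactInfo>jane@example.com</contactInfo>
  <experience>Editor, Example Press</experience>
  <education>B.A., Example University</education>
</resume>
```

Swapping two of the children, leaving one out, or typing text directly inside resume should each produce a validation error. If the xmllint tool from libxml2 happens to be installed, saving the sketch as resume.xml and running xmllint --valid --noout resume.xml is one way to check.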

    Occurrence Operators
    A set of DTD operators, called occurrence operators, allows authors to generalize about the optionality and frequency with which elements may occur. If no occurrence operator is used, the default is exactly one occurrence. Table 1 summarizes the occurrence operators and the rule each one implements: a question mark (?) means zero or one occurrence, a plus sign (+) means one or more, and an asterisk (*) means zero or more.

    Now let's say we want the <experience> element in our resume to have the following structure:

    <experience>
    <position></position>
    <company></company>
    <location></location>
    <task></task>
    <note></note>
    </experience>

    <position> and <company> are both mandatory elements and must appear in each <experience> element once and only once; <location>, however, may appear once or not at all but not multiple times. Meanwhile, <task> must appear at least once but may be reused as often as needed, while <note> is a fully optional element for any additional information that isn't captured in the other elements: it may be used once, more than once, or not at all. To build these rules about occurrence into your content model for the element <experience>, append the occurrence operator as a suffix to the appropriate element, as the following example shows:

    <!ELEMENT experience (position, company, location?, task+, note*) >

    Since neither <position> nor <company> is followed by an occurrence operator, they receive the default setting: each must appear exactly one time.
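The sketch below pairs the experience rule with an instance that exercises each operator. The children are again declared as text-only placeholders, an assumption for illustration, and the sample data is invented.

```xml
<?xml version="1.0"?>
<!DOCTYPE experience [
  <!ELEMENT experience (position, company, location?, task+, note*)>
  <!-- Placeholder content models: an assumption made for this sketch -->
  <!ELEMENT position (#PCDATA)>
  <!ELEMENT company (#PCDATA)>
  <!ELEMENT location (#PCDATA)>
  <!ELEMENT task (#PCDATA)>
  <!ELEMENT note (#PCDATA)>
]>
<experience>
  <position>Managing Editor</position>
  <company>Example Press</company>
  <!-- location? is omitted, which is allowed -->
  <task>Edited a weekly newsletter.</task>
  <task>Trained new staff.</task>
  <!-- note* is omitted, which is also allowed -->
</experience>
```

Removing both task elements, or adding a second location, should make the document invalid.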

    Nesting with Parentheses
    You can also use parentheses to nest elements within a rule and then apply any of the occurrence operators to all the elements within the nested set. For example:

    <!ELEMENT location ((street, suite)?, city, (state, zip)?)>

    This rule says that the (street, suite) sequence is optional but may not be repeated (note, though, that if the street element is used, so must the suite element). Ditto with the (state, zip) sequence. The city element, however, must appear exactly once, and if the other elements are used, they must be positioned according to the order dictated by the comma-separated list.
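Here is a small sketch of an instance that satisfies the nested rule. The leaf elements are declared as text-only placeholders and the address data is invented; both are assumptions for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE location [
  <!ELEMENT location ((street, suite)?, city, (state, zip)?)>
  <!-- Placeholder content models: an assumption made for this sketch -->
  <!ELEMENT street (#PCDATA)>
  <!ELEMENT suite (#PCDATA)>
  <!ELEMENT city (#PCDATA)>
  <!ELEMENT state (#PCDATA)>
  <!ELEMENT zip (#PCDATA)>
]>
<location>
  <!-- The (street, suite) pair is skipped as a unit -->
  <city>Springfield</city>
  <!-- The (state, zip) pair appears as a unit -->
  <state>IL</state>
  <zip>00000</zip>
</location>
```

Because the question mark applies to each parenthesized group as a whole, a street without a suite, or a zip without a state, should be rejected.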

    By the way, you can also nest a choice of elements by placing them in parentheses and separating them with vertical bars instead of commas. For instance, the following rule says that the name element must contain a first name, optionally followed by either a middle name or middle initial or by a nickname, followed by a required last name.

    <!ELEMENT name (firstName,
    ((middleName | middleInitial) | nickName)?,
    lastName)>
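The following is a sketch of one way to write a rule that matches the description above: a first name, then at most one of a middle name, middle initial, or nickname, then a last name. The choice is written here as a single flat group, which is an assumption about the intent, and the leaf elements are text-only placeholders.

```xml
<?xml version="1.0"?>
<!DOCTYPE name [
  <!-- At most one of the three middle-position elements may appear -->
  <!ELEMENT name (firstName, (middleName | middleInitial | nickName)?, lastName)>
  <!-- Placeholder content models: an assumption made for this sketch -->
  <!ELEMENT firstName (#PCDATA)>
  <!ELEMENT middleName (#PCDATA)>
  <!ELEMENT middleInitial (#PCDATA)>
  <!ELEMENT nickName (#PCDATA)>
  <!ELEMENT lastName (#PCDATA)>
]>
<name>
  <firstName>Jane</firstName>
  <middleInitial>Q.</middleInitial>
  <lastName>Doe</lastName>
</name>
```

Under this rule, a name with both a middleInitial and a nickName, or one with no lastName, should fail validation.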

    Yes, this can get complex quickly. Since I don't want to bore you to death with needless complications, I'll stop with all this nesting balderdash and instead point you to a reference that explains in more detail how to use parentheses to form complex rules: Chapter 3, "Document Type Definitions," in XML in a Nutshell, by Elliotte Rusty Harold and W. Scott Means (O'Reilly).

    Mixing Content
    The guts of many narrative-oriented XML documents, such as software manuals, essays, or magazine articles, use mixed content - a choice of either text or elements or a combination of both. For instance, our sample résumé has an as-yet undefined element called <task>, the contents of which aim to describe what the résumé's author did in any given job. Here's an example of the kind of description that would go inside our <task> tag, with additional markup:

    <task>Served as consultant for editing, electronic production, and desktop publishing of one financial newsletter, called <cite>Securities Today</cite>, and the organization, design, and launch of two others. <paragraph>All three became <emphasis>highly successful</emphasis> publications.</paragraph></task>

    Within the task element, then, the résumé's author could have any combination of text and the <cite>, <paragraph>, and <emphasis> tags. Though not used in the task description, let's also make a line-break element, named <br> as in HTML, available.

    To declare a rule that allows mixed content like this, you must list #PCDATA first and then the permitted elements, all separated by vertical bars to indicate choice, and mark the whole group as optional and repeatable with the asterisk occurrence operator. Here's the rule:

    <!ELEMENT task (#PCDATA | paragraph | cite | emphasis | br )*>

    #PCDATA? What in the world is this? PCDATA is XML's name for standard text (though I'm oversimplifying a bit). PCDATA stands for parsed character data: regular text characters other than <, &, and the sequence ]]>. PCDATA may also contain general entity references, which I'll discuss in my next column.

    Because a DTD must declare the content model for each element used in the XML document, our DTD must include declarations for the paragraph, cite, emphasis, and <br> elements. With the exception of the <paragraph> and <br> elements, they contain only PCDATA, which, used alone, excludes other element tags. Thus:

    <!ELEMENT paragraph (#PCDATA | cite | emphasis)*>
    <!ELEMENT cite (#PCDATA )>
    <!ELEMENT emphasis (#PCDATA )>

    Finally, empty elements, like my HTML-like <br> element in the rule for <task> above, may be declared using the EMPTY keyword:

    <!ELEMENT br EMPTY>

    This ensures that no content - whether other elements or parsed character data - may be placed within it.

    Declaring mixed content can get tricky. The key is to remember that if it includes PCDATA, then PCDATA must be declared first. And all the items in the rule must be separated by the vertical bar indicating choice and the rule must be marked as optional and repeatable with an asterisk.
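Putting these pieces together, here is a sketch of the task element and its children in one self-contained document. The sample text is shortened and partly invented. Note the closing asterisk on the paragraph rule as well: the XML specification requires it whenever a mixed-content rule lists child elements.

```xml
<?xml version="1.0"?>
<!DOCTYPE task [
  <!-- Mixed content: #PCDATA first, vertical bars, closing asterisk -->
  <!ELEMENT task (#PCDATA | paragraph | cite | emphasis | br)*>
  <!ELEMENT paragraph (#PCDATA | cite | emphasis)*>
  <!-- Text only: no asterisk is needed when #PCDATA stands alone -->
  <!ELEMENT cite (#PCDATA)>
  <!ELEMENT emphasis (#PCDATA)>
  <!-- EMPTY: the element may hold neither text nor children -->
  <!ELEMENT br EMPTY>
]>
<task>Served as consultant for <cite>Securities Today</cite>.<br/>
<paragraph>It became a <emphasis>highly successful</emphasis>
publication.</paragraph></task>
```

Placing a paragraph inside a cite, or any text inside br, should be reported as a validation error.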

    For more of the nasty little details about mixed content, see the chapter in XML in a Nutshell on DTDs. If you want to dive straight into a full course on DTDs, David Megginson's book, Structuring XML Documents (Prentice Hall), explains all the nuances of building industrial-strength DTDs. For a concise introduction to DTDs, see Robert Eckstein's XML Pocket Reference (O'Reilly); I suggest you read its short but potent section on DTDs several times.

    My next column will offer the second installment in DTD construction, picking up where this one left off. It'll cover attributes, entities, and the more advanced aspects of working with DTDs. In the meantime, begin writing a DTD for your résumé by declaring its elements and the rules associated with them.

    Don't worry about attributes just yet. Remember, though, to declare the content for every element, as I've done in the preceding examples for the child elements of <task> (but did not do yet for many of the other elements). In my next tutorial, we'll combine all the elements and attributes into a complete DTD.

  • More Stories By Steve Hoenisch

    Steve Hoenisch is a technical writer (consultant) with Verizon
    Wireless. Before becoming a technical writer and a Web developer, he
    worked as a journalist and teacher. Steve has been developing Web
    sites since 1996.
