Managing and Documenting Your Project XML Style


XML seems to be popping up everywhere. In this article, I'm going to touch on an often overlooked but potentially very powerful use for XML technology: XML for project management and documentation. Thanks to the open source community, there are some marvelous tools available for incorporating XML into your software development processes.

Throughout this article I refer to a project's infrastructure. I use the term infrastructure to refer to things such as a project's directory structure, developer mailing lists, build processes, deployment sites, source code configuration management, and documentation repositories. These are items that all projects deal with, but usually little thought is put into how they are set up, structured, and communicated. This area is also overlooked when it comes to standardization. The lack of a common approach to these infrastructure items makes it difficult for developers and managers to move across projects. With each project you join, you have to get acquainted with a completely new project infrastructure. This adds an initial learning curve for even the brightest developers and managers. Even within an organization, a common infrastructure across projects is often missing. Lower-level standards, such as coding standards and document templates, tend to be as much cross-project standardization as most organizations accomplish.

Fortunately, with the advent of XML technology and some creative work being accomplished in the open source community, a better way of managing your project's infrastructure is on the horizon. The base concept is the introduction of XML throughout a project's documentation. This can bring some powerful benefits. With XML-style documentation, you can utilize XML processing tools to automatically create Web sites and documents that bring all of your documentation together in an easy-to-access portal with very little manual effort. Placing your project documents in XML format also allows for maximum reusability. For example, an XML-formatted requirements document can be transformed into a PDF for distribution to management, transformed into HTML for quick access by all interested parties, and parsed into individual requirements for input to a test plan generator. The use of XML should eliminate the need to document anything project related more than once. The XML documents become a single source for all future uses of the information.
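As a rough sketch of the "parse individual requirements for a test plan generator" idea, the Python below reads a small requirements document and emits one placeholder test case per requirement. The document layout (`requirements`, `requirement`, `title`, `text`, and the `id` and `priority` attributes) and the function names are invented for this illustration; a real project would substitute its own document type.

```python
import xml.etree.ElementTree as ET

# Hypothetical requirements document; the element and attribute names
# are invented for this illustration, not taken from any standard.
REQUIREMENTS_XML = """
<requirements project="inventory">
  <requirement id="REQ-1" priority="high">
    <title>User login</title>
    <text>The system shall authenticate users by name and password.</text>
  </requirement>
  <requirement id="REQ-2" priority="low">
    <title>Report export</title>
    <text>The system shall export reports as PDF.</text>
  </requirement>
</requirements>
"""


def extract_requirements(xml_text):
    """Return one dict per <requirement> element, in document order."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": req.get("id"),
            "priority": req.get("priority"),
            "title": req.findtext("title"),
            "text": req.findtext("text"),
        }
        for req in root.iter("requirement")
    ]


def build_test_plan(requirements):
    """Build one placeholder test-case line per requirement."""
    return [
        f"TEST-{r['id']}: verify '{r['title']}' ({r['priority']} priority)"
        for r in requirements
    ]


requirements = extract_requirements(REQUIREMENTS_XML)
plan = build_test_plan(requirements)
for line in plan:
    print(line)
```

The same parsed list could just as easily feed an HTML or PDF renderer, which is the single-source point: the requirements are written once and every other view is derived from them.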

Taking this a step further, you could create a native XML database (using an open source XML DB such as Xindice) that would be the central repository for all of your project's knowledge. Another scenario that becomes an exciting possibility is the creation of a DTD or schema that represents all of the required documentation for projects within your organization. A process could then scan your documentation to determine missing or incomplete documents. Status reports could be automatically generated and e-mailed with the current status of all documentation. All this can be automated into some pretty slick project management tools.
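A minimal sketch of such a documentation scan might look like the following. It assumes two hypothetical XML files, shown inline as strings: a manifest of the document types an organization requires, and an index of what one project actually contains. The element names and the `status` values are assumptions made for the example, and a plain manifest stands in here for a full DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest of the documents every project must have.
REQUIRED_DOCS_XML = """
<required-documents>
  <document type="requirements"/>
  <document type="design"/>
  <document type="test-plan"/>
</required-documents>
"""

# Hypothetical index of the documents one project actually contains.
PROJECT_DOCS_XML = """
<project-documents project="inventory">
  <document type="requirements" status="complete"/>
  <document type="design" status="draft"/>
</project-documents>
"""


def documentation_status(required_xml, project_xml):
    """Classify each required document as complete, incomplete, or missing."""
    required = [
        d.get("type") for d in ET.fromstring(required_xml).iter("document")
    ]
    found = {
        d.get("type"): d.get("status")
        for d in ET.fromstring(project_xml).iter("document")
    }
    report = {}
    for doc_type in required:
        status = found.get(doc_type)
        if status is None:
            report[doc_type] = "missing"
        elif status != "complete":
            report[doc_type] = "incomplete"
        else:
            report[doc_type] = "complete"
    return report


status_report = documentation_status(REQUIRED_DOCS_XML, PROJECT_DOCS_XML)
for doc_type, state in status_report.items():
    print(f"{doc_type}: {state}")
```

The resulting dictionary is the raw material for the status reports mentioned above; formatting it and mailing it out would be a separate step.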

Now that you can envision the power of XML, let's take a look at what's available today to get you started down this path. A good starting point is the DocBook standard, a set of XML DTDs describing how to create documents in XML format. The standard is targeted at the production of technical documents. You can see a simple example of a DocBook document in Listing 1. (All of the code referenced in this article is available for download from www.sys-con.com/xml/sourcec.cfm.) The example contains the XML for a book composed of multiple chapters, sections, and an appendix. There are many additional DocBook-defined tags that aren't included in the example. DocBook also defines a DTD for articles and documents. Many open source projects use the DocBook standard for their documentation. The Unix community seems to be the leader in its adoption.

DocBook is a good starting point for creating your own organization-specific document types, such as your organization's requirements document, development plan, design document, and other documents. Once in XML, these documents can easily be transformed (with other freely available tools) into whatever format you prefer, including PDF, Word-compatible RTF, HTML, or just plain text. Besides multiformat publishing, you can also perform a great deal of processing against documents that are in XML format. For example, you could parse use cases out of a design document and individual requirements out of a requirements document to build your test plan.
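To give a feel for processing a DocBook-style document, here is a small Python sketch that walks a book and prints its outline. The embedded XML is an illustration written for this example, not a reproduction of Listing 1; it uses only the `book`, `chapter`, `section`, `appendix`, `title`, and `para` elements and omits the DOCTYPE declaration a real DocBook file would carry. The sample titles and the `outline` function are likewise invented.

```python
import xml.etree.ElementTree as ET

# Illustrative DocBook-style document (DOCTYPE omitted for brevity).
DOCBOOK_XML = """
<book>
  <title>Project Handbook</title>
  <chapter>
    <title>Requirements</title>
    <section>
      <title>Functional Requirements</title>
      <para>The system shall track inventory levels.</para>
    </section>
  </chapter>
  <chapter>
    <title>Design</title>
    <section>
      <title>Use Cases</title>
      <para>A clerk records a shipment.</para>
    </section>
  </chapter>
  <appendix>
    <title>Glossary</title>
    <para>Terms used in this handbook.</para>
  </appendix>
</book>
"""

STRUCTURAL_TAGS = ("chapter", "section", "appendix")


def outline(element, depth=0):
    """Return (depth, tag, title) tuples for the titled structure, in order."""
    entries = []
    title = element.findtext("title")
    if title is not None:
        entries.append((depth, element.tag, title))
    for child in element:
        if child.tag in STRUCTURAL_TAGS:
            entries.extend(outline(child, depth + 1))
    return entries


book = ET.fromstring(DOCBOOK_XML)
toc = outline(book)
for depth, tag, title in toc:
    print("  " * depth + f"{tag}: {title}")
```

Because the structure is explicit in the markup, the same walk could pull out every section titled "Use Cases" or every paragraph under "Requirements" instead of just the headings.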

Several commercially available XML editors allow you to create DocBook-compliant documents. I've used XMLSPY from Altova, which supports DocBook and even includes the DocBook stylesheets with the product. A less-expensive commercial XML editor that offers a nice XML editing interface is Oxygen XML. Oxygen XML is written in Java and is available on a wide range of platforms, including Windows, Macintosh, and many flavors of Unix. Unfortunately, I haven't found a good open source editor for easily creating DocBook documents. One open source candidate, OpenOffice, will give you partial DocBook support, but does not yet support all of the DocBook tags. I anticipate OpenOffice becoming a very viable solution in future releases. OpenOffice does use XML as its native document format, so at least you can have XML documents by standardizing on OpenOffice as your document editor.

Open Source Projects
The open source community seems to be well ahead of the commercial development industry in its use of XML as a project management and documentation technology. There are several open source projects dedicated to advancing the use of XML in these areas.

Maven is an Apache project that defines itself as a project management and project comprehension tool. Through the use of XML technology, Maven attempts to create a well-defined project structure, making it easier to disseminate information about a project and to share a common project structure across projects.

Maven is based on the concept of a project object model (POM), which is stored as an XML file. The POM defines things such as the build process, configuration management, unit testing reports, change log documentation, directory layout, source metrics, mailing lists, developer list with role information, project dependency list, article collection, project distribution, and project Web site creation. Listing 2 is a sample POM represented in XML format. This POM contains general project information, a link to a project issue tracking system, links to site and distribution directories, source repository information, project version info, project mailing list details, developer list, project dependency list, and finally, build and unit testing details. Having an XML source for all this information allows it to be published easily and reported in many formats. Figures 1 and 2 show Web site content that was generated by Maven.
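Because the POM is ordinary XML, any XML-aware tool can read it, not just Maven. The Python sketch below pulls a short summary out of a project descriptor. The embedded XML is a simplified illustration, not a copy of Listing 2: the element names (`currentVersion`, `developers`, `dependencies`, and so on) are assumptions modeled loosely on a Maven project descriptor, and the project and developer shown are made up.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative project descriptor. Element names are assumed
# for this sketch and may not match a given Maven release exactly.
POM_XML = """
<project>
  <id>inventory</id>
  <name>Inventory Tracker</name>
  <currentVersion>1.0</currentVersion>
  <developers>
    <developer>
      <name>Jane Doe</name>
      <id>jdoe</id>
      <roles><role>Architect</role></roles>
    </developer>
  </developers>
  <dependencies>
    <dependency>
      <id>junit</id>
      <version>3.8.1</version>
    </dependency>
    <dependency>
      <id>log4j</id>
      <version>1.2.8</version>
    </dependency>
  </dependencies>
</project>
"""


def summarize(pom_text):
    """Extract name, version, developers with roles, and dependencies."""
    root = ET.fromstring(pom_text)
    return {
        "name": root.findtext("name"),
        "version": root.findtext("currentVersion"),
        "developers": [
            (dev.findtext("name"), [role.text for role in dev.iter("role")])
            for dev in root.iter("developer")
        ],
        "dependencies": [
            f"{dep.findtext('id')}-{dep.findtext('version')}"
            for dep in root.iter("dependency")
        ],
    }


summary = summarize(POM_XML)
print(f"{summary['name']} {summary['version']}")
for name, roles in summary["developers"]:
    print(f"  developer: {name} ({', '.join(roles)})")
for dep in summary["dependencies"]:
    print(f"  depends on: {dep}")
```

Run across every project in an organization, a script like this could produce a cross-project report of who works on what and which library versions are in use.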


In addition to what Maven provides for a single project, it also allows for a great deal of commonality across projects. This is a significant advantage for organizations with more than a single project. Developers can easily transition across projects and instantly be familiar with the project's infrastructure. Common tools can be used to process all of an organization's projects. Project-wide sites and reports can be automatically generated.

Maven goes beyond just processing your documentation. Maven is also a project processing tool. It can perform all of your build tasks, such as compiling your source files, generating Javadoc, building your distributable components, and deploying your project. It will also run your unit tests, check the format of your source code against a defined standard, and create XML-based reports for all of its actions. These reports then become part of the project Web site that Maven creates. Maven has a plug-in-based architecture, which allows you to extend it in all sorts of ways. There is a good repository of Maven plug-ins available on the SourceForge open source portal. The best way to get started is to look at an existing project that uses Maven and tailor its XML files to your project. I would suggest looking at the Apache projects as models for what Maven can accomplish.

Another Apache project that fits into the XML project documentation and processing tool set is the Forrest project. Forrest is defined as an XML standards-oriented project documentation framework. Forrest uses XSLT stylesheets, schemas, images, and other resources you define to render a project's XML source content into an HTML-based project Web site. The project Web site can be generated by Forrest through a user-initiated process, or by using the Forrest Robot, which allows for automatic regeneration of the project site any time the XML source documents are changed. The project Web site created by Forrest provides access to project documentation, source code repositories, mailing lists, contact info, FAQs, how-tos, change logs, and more. Figures 3 and 4 show Web sites that were generated by Forrest. The content of these sites is stored as XML documents that are processed by Forrest. With this approach your project Web site is always current. The HTML content for the Web site does not become another document your team has to write.
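The single-source idea behind this kind of site generation can be illustrated in a few lines of Python. This is not how Forrest itself works (Forrest applies XSLT stylesheets, while the sketch below uses plain Python in their place), and the FAQ document layout and the `render_faq_page` function are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical FAQ source document; the layout is invented for this sketch.
FAQ_XML = """
<faqs title="Project FAQ">
  <faq>
    <question>How do I build the project?</question>
    <answer>Run the build from the project root.</answer>
  </faq>
  <faq>
    <question>Where do I report bugs?</question>
    <answer>Use the issue tracker &amp; include a log.</answer>
  </faq>
</faqs>
"""


def render_faq_page(xml_text):
    """Render the FAQ source as a minimal HTML page."""
    root = ET.fromstring(xml_text)
    title = escape(root.get("title", ""))
    parts = [
        f"<html><head><title>{title}</title></head><body>",
        f"<h1>{title}</h1>",
    ]
    for faq in root.iter("faq"):
        # escape() re-encodes characters such as & for safe HTML output.
        parts.append(f"<h2>{escape(faq.findtext('question', ''))}</h2>")
        parts.append(f"<p>{escape(faq.findtext('answer', ''))}</p>")
    parts.append("</body></html>")
    return "\n".join(parts)


page = render_faq_page(FAQ_XML)
print(page)
```

Regenerating the page whenever the XML source changes is what keeps the site current without anyone editing HTML by hand.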


Forrest uses the popular Apache Cocoon XML publishing framework to accomplish much of its work. In addition to the HTML Web site, Forrest can also render the project site contents into PDF format. You'll notice the PDF links shown on the sample Web sites. Any Web page that has been generated by Forrest can easily be retrieved as a PDF using these links. To use Forrest, your underlying project documentation and infrastructure need to be in XML format. As with Maven, there is also great benefit in using Forrest across multiple projects. Using Forrest across multiple projects will ensure a consistent look-and-feel for the project sites. You can customize the look-and-feel of the project sites for your organization through the use of skins. As I write this, there is an effort under way to create a Forrest plug-in for Maven. Presumably, this would allow Forrest to parse the Maven project object model and build the project Web site based on that, which would seem to be the ideal use of Forrest and Maven. Maven manages the project structure and performs project build processing, while Forrest renders the project Web site and documentation.

Implementing the Approach
In addition to creating these tools, Apache is at the forefront of applying this XML approach to manage and document its own projects. As you browse the Apache projects, you will notice a common look-and-feel to many of the Web sites. This is a result of the application of the Forrest and Maven technologies. Using this approach within open source communities also has the particularly strong benefit of allowing developers to contribute to multiple projects without having to learn a new infrastructure for each project.

Another interesting use of XML technology at the project level within Apache is the Jakarta Gump project. Gump reads XML project descriptor files for each of the Jakarta subprojects, analyzes their dependencies, and performs a nightly build of all the Jakarta projects. All Apache projects are built using the most up-to-date builds from other Apache projects they may depend on. XML-based reports are created based on this nightly process. On the Gump Web site there is a page that gives you the results of these nightly builds. You can also view reports on all the project dependencies, which allows you to find the dependencies of any particular Jakarta project. This capability goes beyond Apache to include other open source projects that the Jakarta projects depend upon. This process allows each of the Jakarta projects to view the impact of their changes on other projects and quickly make the necessary fixes.
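The dependency analysis described here comes down to a topological sort: each project must be built after everything it depends on. The Python sketch below shows the idea with the standard library's `graphlib` module (Python 3.9 or later). The descriptor format, the project names, and both function names are assumptions made for the illustration and are not Gump's actual files or code.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter

# Hypothetical descriptors for three projects; the format is assumed.
DESCRIPTORS_XML = """
<workspace>
  <project name="app">
    <depend project="core"/>
    <depend project="logging"/>
  </project>
  <project name="core">
    <depend project="logging"/>
  </project>
  <project name="logging"/>
</workspace>
"""


def build_order(xml_text):
    """Return project names ordered so dependencies are built first."""
    root = ET.fromstring(xml_text)
    # Map each project to the set of projects it depends on.
    graph = {
        project.get("name"): {d.get("project") for d in project.iter("depend")}
        for project in root.iter("project")
    }
    return list(TopologicalSorter(graph).static_order())


def affected_by(xml_text, changed):
    """Return the projects that directly depend on the changed project."""
    root = ET.fromstring(xml_text)
    return sorted(
        project.get("name")
        for project in root.iter("project")
        if any(d.get("project") == changed for d in project.iter("depend"))
    )


order = build_order(DESCRIPTORS_XML)
impacted = affected_by(DESCRIPTORS_XML, "logging")
print("build order:", order)
print("directly affected by a change to logging:", impacted)
```

`TopologicalSorter` raises `CycleError` if two projects depend on each other, which is itself useful information for a nightly build report.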

Those with the time and desire to create their own project tools of this sort should take a hard look at the Apache Cocoon project as a base to build from. Cocoon is a mature XML publishing framework. It can interact with file systems, relational or native XML databases, LDAP, and network-based data sources, and publish your data in many different formats, including HTML, WML, PDF, SVG, RTF, and more. Another very good tool to get you started is AurigaDoc, an open source tool that can process your XML documents into many different formats, including HTML, DHTML, RTF, PDF, PostScript, JavaHelp, and HTML Help.

A reason why the open source community has led the way in this area is the lack of commercial tool support. I have not come across a commercial tool that comprehensively supports this type of project management and documentation approach. Those interested in pursuing this today must be willing to piece together the various tools necessary to make this all work. In the future, ideally the fact that you are using Maven, DocBook, or Forrest becomes transparent to the end user, who simply gains the benefits of an XML project infrastructure. In the spirit of practicing what I preach, you can find this article in DocBook format on the XML-J Web site (www.sys-con.com/xml/sourcec.cfm) and on my personal Web site (www.timothyfisher.com). Having created this article in XML, I've been able to generate a PDF for publication and download, and HTML for presentation on the Web from a single source document.


  • Apache Maven: http://maven.apache.org
  • Maven Plug-ins: http://maven-plugins.sourceforge.net
  • Apache Forrest: http://xml.apache.org/forrest
  • Jakarta Gump: http://jakarta.apache.org/gump
  • DocBook: www.docbook.org
  • Apache Cocoon: http://cocoon.apache.org
  • AurigaDoc: http://aurigadoc.sourceforge.net
  • Oxygen: www.oxygenxml.com
  • Author's Site: www.timothyfisher.com

About the Author
Timothy Fisher has recognized expertise in the areas of Java, Ruby, Rails, Social Media, Web 2.0, and Enterprise 2.0. He has served in technical leadership and senior architecture roles with companies such as Motorola, Cyclone Commerce, and Compuware. He is the author of the Java Phrasebook and the Ruby on Rails Bible. Currently he is employed as a senior web architect with Compuware in Detroit, Michigan.
