Building Distributed Applications with Corba and XML

XML and CORBA are key technologies for building distributed systems, but they evolved separately to address different needs: content management for XML and distributed applications for CORBA. This article illustrates how these technologies can benefit from each other.

XML
XML and related specifications are addressed on the W3C Web site (www.w3c.org) and in other articles in this journal. This section looks at the structure of an XML application as depicted in Figure 1. The entity manager reads XML documents from some virtual storage - a database, a file or some other mechanism - and creates XML entities, a term used to describe XML elements. This data is passed to an XML parser, which can optionally validate the document using either the DTD or, in the future, one of the semantic specifications (e.g., DCD, RDF). The application code accesses this data using the DOM or similar API mapping to some programming language to process the data. Since XML today is still about syntax, this code contains the logic needed by the semantics to make meaningful use of the data. As seen in Figure 1, XML doesn't have transport built into it. Availability of a transport facility is thus a requirement for building a distributed XML application.

CORBA
The Common Object Request Broker Architecture (CORBA) is a specification by the Object Management Group (OMG). Figure 2 depicts the structure of a distributed application built using CORBA. Objects are described using the Interface Definition Language (IDL) and are distributed on a communication bus, the Object Request Broker (ORB). These objects are then implemented in a programming language, typically Java or C++, and the ORB transparently handles all network, platform and language issues. Additionally, the objects can be made transactional, secure, and so forth, using CORBA services. Other vendor-provided tools help in the management and administration of these objects, enabling the development of a large-scale distributed system. (Thousands of such CORBA applications have been deployed all over the world.)

A distributed application consists of three logical components - data, logic and transport. CORBA ties these three components together in the form of distributed objects in which the logic is implemented using some OO language. The data is tied to the transport and transferred in binary form between the client and the server. The familiar function call paradigm is used to invoke methods on remote objects transparently.

In contrast, XML is only about data, which is represented as text and not inherently tied to any transport protocol. It can be carried over any transport protocol that's capable of transporting textual data. The XML programming model consists of walking a parsed XML document and taking action based on the elements encountered.

Using XML for data gives us the ability to describe arbitrarily complex data types, build loosely coupled systems, easily transform data from one form to the other and easily display data in browsers. XML parsers are available for most languages and operating systems; thus it's easy to embed a parser with an application. XML support is also present in the popular browsers, giving developers the ability to display XML data.

CORBA has been used successfully to interface with legacy and ERP systems. Using XML instead of IDL types may, in certain cases, make this integration even easier. Many ERP vendors have pledged XML support in their products, which would make the use of XML even more attractive.
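
The XML programming model mentioned above, walking a parsed document and acting on the elements encountered, can be sketched in a few lines. This is only an illustration using Python's standard library; the order document, its element names and the handler functions are hypothetical and don't come from the article.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element and attribute names are illustrative only.
ORDER_XML = """
<order id="42">
  <item sku="A1" qty="2"/>
  <item sku="B7" qty="1"/>
  <note>Deliver before noon</note>
</order>
"""


def on_item(elem, state):
    # Action taken for each <item>: accumulate the ordered quantity.
    state["quantity"] += int(elem.get("qty", "0"))


def on_note(elem, state):
    # Action taken for each <note>: collect its text.
    state["notes"].append((elem.text or "").strip())


# The application logic (the "semantics") lives in this table, not in the XML.
HANDLERS = {"item": on_item, "note": on_note}


def walk(xml_text):
    """Parse the document, visit every element and dispatch on its tag name."""
    state = {"quantity": 0, "notes": []}
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        handler = HANDLERS.get(elem.tag)
        if handler is not None:
            handler(elem, state)
    return state


result = walk(ORDER_XML)
```

Note that nothing here says how the document reached the application; the transport is a separate concern, which is the gap CORBA fills.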

A distributed system that uses CORBA for distribution and XML for data gives us the best of both worlds, and at the same time builds on two open industry standards. The following section looks at the specifics of how this integration can be achieved.

Integrating XML and CORBA
There's some commonality in the concepts of XML and CORBA, as illustrated in Table 1.

XML documents, which are textual in nature, are typically accessed through one of two popular programming APIs:

  • The Document Object Model (DOM) API defined by the W3C: This API provides access to the XML document independent of the vocabulary of the document, that is, the API methods are generic document methods. It creates a full document object from an XML document, and thus may require substantial space if you're dealing with large documents.
  • The Simple API for XML (SAX): This is an event-driven API and only the sections of the document that are of interest are loaded and presented for use.

    As both APIs are generic in nature, an alternative approach would be to provide an API in which the vocabulary of the XML document is mapped directly in the API. For example, the document is accessed directly using the element and attribute names that appear in the document. This way, the semantics of the XML document are visible to the code that manipulates it, resulting in greater legibility.
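
To make the difference between these styles concrete, here is a minimal sketch using the DOM and SAX implementations that ship with Python's standard library, followed by a vocabulary-specific wrapper. The portfolio document, the Stock class and all function names are assumptions made for illustration.

```python
import xml.sax
from dataclasses import dataclass
from xml.dom import minidom

# Hypothetical document; symbols and prices are made up for illustration.
PORTFOLIO_XML = (
    "<portfolio>"
    "<stock symbol='AAA'>12.5</stock>"
    "<stock symbol='BBB'>80.0</stock>"
    "</portfolio>"
)


def symbols_via_dom(xml_text):
    """DOM: build the whole document tree in memory, then query it generically."""
    doc = minidom.parseString(xml_text)
    return [node.getAttribute("symbol") for node in doc.getElementsByTagName("stock")]


class SymbolHandler(xml.sax.ContentHandler):
    """SAX: receive events while parsing and keep only what is of interest."""

    def __init__(self):
        super().__init__()
        self.symbols = []

    def startElement(self, name, attrs):
        if name == "stock":
            self.symbols.append(attrs.getValue("symbol"))


def symbols_via_sax(xml_text):
    handler = SymbolHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.symbols


@dataclass
class Stock:
    """Vocabulary-specific view: names from the document appear in the API."""
    symbol: str
    price: float


def stocks_via_vocabulary(xml_text):
    doc = minidom.parseString(xml_text)
    return [
        Stock(node.getAttribute("symbol"), float(node.firstChild.data))
        for node in doc.getElementsByTagName("stock")
    ]
```

The first two functions only ever mention generic notions such as nodes, attributes and events; the third exposes `symbol` and `price` directly, which is the legibility benefit of a vocabulary-mapped API.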

    The interfaces of CORBA objects and the types of data they can manipulate are defined using IDL. Thus the integration of XML with CORBA must be expressed using IDL types. However, it's possible to perform this transformation dynamically when IIOP-compatible structures can be created directly from XML documents.

    IDL provides standard primitive data types along with structured and aggregate data types and variable types in the form of the CORBA Any (and DynAny). IDL distinguishes between object types and value types as noted in the sidebar below.

    It's not desirable to produce an XML IDL API using CORBA object types. This would imply that the XML document is encapsulated within a CORBA interface, which would be inefficient since the document is accessed in a remote fashion.

    It's more desirable to use either value types or existing IDL data types as a means of passing documents by value, since this would allow efficient operations on the XML document. However, the value-types specification has only recently been ratified by the OMG, and few implementations of the specification are available. Thus the choice remains about whether the integration of XML and CORBA should use value types or map directly to existing CORBA data types.

    Approach 1: Simple Approach
    This initial approach is the simplest one possible for passing an XML document to a CORBA object and involves "stringifying" the XML document.

    struct XMLDocument
    {
        string dtd;
        string document;
    };

    typedef sequence<XMLDocument> XMLDocuments;

    interface MyObject
    {
        void invoke(in XMLDocuments documents);
    };

    One advantage of this approach is that it's easy to extend existing object interfaces to support XML data. The disadvantages are significant and are as follows:

  • Inefficient use of space: In a distributed application the verbose XML document is distributed over the network.
  • Inefficient use of time: The XML document must be parsed at each point of use.
  • Not type-safe: There's no way of validating the XML document at transmission, and it must be validated at each point of use.
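
The time and type-safety drawbacks can be seen in a small sketch of the receiving side. This is plain Python standing in for a CORBA servant, so no ORB is involved; the class and function names simply mirror the IDL above and are otherwise assumptions. Python's standard parser checks well-formedness only and does not validate against the DTD, so DTD validation is left out.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class XMLDocument:
    """Mirrors the IDL struct: both members are opaque strings on the wire."""
    dtd: str
    document: str


class MyObject:
    """Stand-in for the object implementation; not a real CORBA servant."""

    def invoke(self, documents: List[XMLDocument]) -> List[str]:
        root_tags = []
        for doc in documents:
            # The document arrives as text, so it must be parsed here,
            # at the point of use, before anything can be done with it.
            root = ET.fromstring(doc.document)
            root_tags.append(root.tag)
        return root_tags


def try_invoke(documents: List[XMLDocument]) -> bool:
    """Return False if any document turns out to be malformed on arrival."""
    try:
        MyObject().invoke(documents)
        return True
    except ET.ParseError:
        return False


good = [XMLDocument(dtd="", document="<portfolio/>")]
bad = [XMLDocument(dtd="", document="<portfolio>")]  # missing end tag
```

Because the interface only promises strings, the malformed document in `bad` is accepted by the signature and is rejected only when the receiver parses it.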

    Approach 2: Mapping XML to IDL Using Value Types
    This approach uses CORBA value types and is under review by the OMG. It represents as much of the DOM API as possible using CORBA value types. The value type, a new IDL construct, is a cross between the existing struct and interface. It's always passed by value, like a struct, but can contain method declarations, like an interface.

    Listing 1, taken from OMG Document 99-12-05, demonstrates the value-type definition for a DOM Node. The Node type is central to the DOM API, since the document, element and text sections of an XML document implement the Node interface. As you can see, this API uses a standard naming convention for accessing all parts of an XML document.

    Other key DOM types are DOMString, NodeList, Document, Element, Attr and Text. This approach is consistent with the standard DOM API; however, the semantics of the XML document aren't clear when using a standard API. It's also likely to be some time before such an approach is practical, as CORBA implementations of the value-type specification aren't common yet.

    Approach 3: Mapping XML to CORBA IDL
    This approach takes advantage of the IDL types currently supported by existing CORBA implementations. It maps an XML document into CORBA types before an operation is invoked, and converts the CORBA types back to XML when they're received by the receiving object. This approach is network efficient and type-safe. Another advantage - the semantics of the document are preserved when translated directly into the CORBA structures used to encapsulate the document.

    The examples that follow demonstrate the mapping between an XML document and its associated CORBA IDL definition. This mapping could be performed automatically given a DTD (or XML schema) for the XML document in question.

    XML documents are implemented as a hierarchical CORBA type using a two-way mapping between XML and IDL and the CORBA Any. In this case, the CORBA Any will always hold a CORBA structure, and the contents of this structure depend on the XML document. The mapping is detailed as follows:

    Mapping XML ---> IDL

  • An XML element maps to a CORBA struct and an appropriate primitive type is used for the element value:

    <element>
    1.23
    </element>

    struct element
    {
        float _Value;
    };

  • An XML attribute maps to a CORBA string member of the struct:

    <element color='green' font='fixed'/>

    struct element
    {
        string color;
        string font;
    };

    <element color='green'>
    1.23
    </element>

    struct element
    {
        string color;
        float _Value;
    };

  • A single child XML element maps to a struct within the parent struct:

    <element>
    <child id='abc'/>
    </element>

    struct element
    {
        struct child
        {
            string id;
        } _child;
    };

  • A variable number of XML child elements of the same name map to a sequence of structs within a struct:

    <elements>
    <element color='green'/>
    <element color='blue'/>
    </elements>

    struct element
    {
        string color;
    };
    struct elements
    {
        sequence<element> _element;
    };

  • A fixed number of child elements of the same name map to an array of structs within a struct:

    <elements>
    <element color='green'/>
    <element color='blue'/>
    </elements>

    struct element
    {
        string color;
    };
    struct elements
    {
        element _element[2];
    };

  • A CORBA Type Code is produced to describe the newly defined structure and forms part of the resulting CORBA Any.
  • An object that uses an XML document can be defined in IDL as follows:

    interface MyObject
    {
        void invoke(in any document);
    };
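
The XML-to-IDL rules above can be approximated in code. The sketch below uses nested Python dicts as stand-ins for CORBA structs (a real implementation would build a TypeCode and an Any through an ORB, which is not shown). It also decides between "single child" and "sequence" by counting elements in the instance document, whereas the article derives this from the DTD or schema; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def _convert_text(text):
    """Pick a primitive type for the element value: float if possible, else string."""
    try:
        return float(text)
    except ValueError:
        return text


def element_to_struct(elem):
    """Map an XML element to a struct-like dict following the rules above."""
    # Each attribute becomes a string member of the struct.
    struct = dict(elem.attrib)

    # The element's text value becomes the "_Value" member.
    text = (elem.text or "").strip()
    if text:
        struct["_Value"] = _convert_text(text)

    # Group child elements by name, preserving document order.
    groups = {}
    for child in elem:
        groups.setdefault(child.tag, []).append(element_to_struct(child))

    for tag, children in groups.items():
        if len(children) == 1:
            # A single child maps to a struct nested in the parent.
            struct[tag] = children[0]
        else:
            # Repeated children map to a sequence member named "_<tag>".
            struct["_" + tag] = children
    return struct


def xml_to_struct(xml_text):
    return element_to_struct(ET.fromstring(xml_text))
```

For example, `xml_to_struct("<element color='green'>1.23</element>")` yields a dict with a `color` string member and a numeric `_Value`, matching the third struct shown above.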

    Mapping IDL ---> XML

  • Only structs, sequences, arrays and base types are mapped.
  • An XML Schema or DTD is produced from the CORBA Type Code.
  • A struct's members map to attributes, except "_Value", which maps to the text value of the element.
  • Sequences and arrays map to multiple elements.
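
The reverse direction can be sketched the same way. Here a nested Python dict stands in for the CORBA struct held in the Any, and lists stand in for sequences and arrays; generating the schema or DTD from the Type Code is not shown. Treating a nested struct as a child element, and dropping the leading underscore of a sequence member to obtain the element name, are assumptions based on the naming used in the examples earlier in this section.

```python
import xml.etree.ElementTree as ET


def struct_to_element(name, struct):
    """Map a struct-like dict back to an XML element."""
    elem = ET.Element(name)
    for member, value in struct.items():
        if member == "_Value":
            # "_Value" carries the text value of the element.
            elem.text = str(value)
        elif isinstance(value, dict):
            # Assumption: a nested struct becomes a child element.
            elem.append(struct_to_element(member, value))
        elif isinstance(value, (list, tuple)):
            # Sequences and arrays become multiple elements of the same name.
            child_name = member.lstrip("_")
            for item in value:
                elem.append(struct_to_element(child_name, item))
        else:
            # Any other base-type member becomes an attribute.
            elem.set(member, str(value))
    return elem


def struct_to_xml(name, struct):
    """Serialize the mapped element as XML text."""
    return ET.tostring(struct_to_element(name, struct), encoding="unicode")


portfolio_xml = struct_to_xml(
    "elements", {"_element": [{"color": "green"}, {"color": "blue"}]}
)
```

Parsing `portfolio_xml` again gives an `elements` root with two `element` children whose `color` attributes are `green` and `blue`, so the two mappings round-trip for this simple case.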

    This type-safe and efficient approach takes advantage of the available CORBA implementations. An example of mapping XML to CORBA taken from the finance domain demonstrates the mapping of a stock portfolio XML document to a CORBA IDL (see Listing 2).

    Summary
    In this article we discussed the benefits of using CORBA and XML for building a distributed application and three approaches to integrating XML documents with CORBA. While it's possible to achieve integration using a primitive approach - converting the XML document to a string and passing this string to the object - little efficiency is achieved this way and the power of CORBA isn't leveraged.

    The second approach, which builds on the standard DOM API for XML, presents a solution using the newly defined CORBA value-type specification. While this solution is efficient and type-safe, it doesn't attempt to map the semantics of the document into the CORBA value type.

    The final approach achieves efficiency and type safety, and succeeds in preserving the semantics of the XML document in the CORBA structure. It's also possible to implement this approach with the currently available CORBA implementations.

    References
    1. W3C Web site: www.w3c.org
    2. OMG Web site: www.omg.org
    3. Henning, M., and Vinoski, S. (1999). Advanced CORBA Programming with C++. Addison-Wesley.
    4. Slama, D., Garbis, J., and Russell, P. (1999). Enterprise CORBA. Prentice Hall.

    Dermot Russell is a senior XML software architect at Macalla Software (www.macalla.com), a Dublin-based software developer. Dermot has a BS in applied computing and an MS in computer science.

    Nick Simha is the Western region presales manager for Iona Technologies (www.iona.com). He holds a master's degree in computer science from the University of Missouri.