XML, Ontologies, and the Semantic Web

"If [computer networking] were a traditional science, Berners-Lee would win a Nobel Prize," Eric Schmidt, CEO of Novell, once commented. Indeed, Tim Berners-Lee revolutionized the world when he created the Web in 1991. Now, he is talking about the second generation of the Web, and his talks are generating buzz: the W3C is establishing standards for it, and universities, companies, and industry consortiums are building the technologies necessary for it. He refers to it as the Semantic Web.

The Semantic Web is envisaged as a place where data can be shared and processed by automated tools as well as by people. The key lies in the automation and integration of processes through machine-readable languages. In order to leverage and link the vast amounts of information available on the Web, software agents must be able to comprehend the information, i.e., the data must carry machine-readable semantics. For example, whether I use the tag <dead> or the tag <alive> next to a person's name in my XML document makes no difference to the parser. Some additional semantics or metadata must be added in order for a software program to make an intelligent assessment of the state of the person. This metadata, which conveys what information means rather than how it is displayed, is what is known as semantics.
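
The point about the parser can be seen in a minimal sketch. The two documents and the helper name child_tags are made up for illustration; the sketch only shows that a standard XML parser accepts either tag without attaching any meaning to it.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that differ only in one tag name.
doc_a = "<person><name>Ada</name><dead/></person>"
doc_b = "<person><name>Ada</name><alive/></person>"

def child_tags(xml_text):
    """Return the tag names of the root element's children, in order."""
    return [child.tag for child in ET.fromstring(xml_text)]

# Both documents parse cleanly; the parser sees two arbitrary names.
tags_a = child_tags(doc_a)
tags_b = child_tags(doc_b)
print(tags_a, tags_b)
```

Any conclusion about the person's state has to come from a layer above the parser, which is the role the article assigns to semantics.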

Let's consider an example illustrating the advantages of having semantics that add meaning to information on the Web. Say you live in New York and decide to attend a conference in London. You would have to go to many airline Web sites and look at all flights leaving from New York to London. Then, you would go to various hotel Web sites and look for a hotel near your conference location that has a room available. That's a fair bit of searching. Luckily, you can search for the information on the Web, and in most cases you can pay for everything on the Web.

Now imagine another scenario: you're driving down 5th Avenue in Manhattan. Your secretary calls you on your cell phone and says that you've been invited to be the keynote speaker at a conference in Europe on May 5, 2003. You think that's great, and you begin to make plans for your trip. You flip open your Palm Pilot, which is connected to the Web, and you type in some commands: book flight and return from New York to London, May 5-11; book room in hotel near the conference location, Hilton London Metropole, in London.

Your Palm Pilot has a software program or software agent that understands your commands; it processes the semantics of your command intelligently. Your agent buys your ticket and books a room in a hotel. As you drive into your garage, your Palm Pilot beeps and asks you to confirm the information. You park your car, confirm the bookings, and then go inside. This is just one example of how easy life gets when the Web is an intelligent partner in your universe.

Ontologies for Knowledge Representation
In order for computers to be more helpful, the Semantic Web augments the current Web with formalized knowledge and data that can be processed by computers. To be able to search and process information such as airline flights, software programs need information that has been modeled in a coherent manner. An ontology models all the entities and relationships in a domain.

Continuing with our example, let's create a hypothetical ontology for Virgin Atlantic's flights. An ontology for the airline industry would model its metadata using the following semantics (the modeling concept each statement illustrates is shown in braces):

A flight has an origin, destination, flight number, departure time, arrival time, class {attributes}
An international flight is a type of flight {inheritance}
A flight can have one origin {one-to-one association}
A flight can have many classes {one-to-many association}

In other words, an ontology captures the attributes of an entity and its inheritance relationships, as in object-oriented programming; it also captures associations and their cardinality, as in relational databases (see Figure 1).

The specific information or instance of this metadata for a particular flight may be as follows:

Flight Number: VS018
Origin: New York (EWR)
Destination: London (LHR)
Departure Time: 08:20, May 5, 2003
Arrival Time: 20:00, May 5, 2003
Class: Economy
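
Because the article compares ontologies to object-oriented models, the flight model and its VS018 instance can be sketched as classes. This is only an analogy under that assumption; the field names are illustrative and are not part of any published airline ontology.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Flight:
    # Attributes listed in the ontology sketch.
    flight_number: str
    origin: str            # a flight has exactly one origin
    destination: str
    departure_time: str
    arrival_time: str
    classes: List[str] = field(default_factory=list)  # one-to-many

@dataclass
class InternationalFlight(Flight):
    """An international flight is a type of flight (inheritance)."""

# The instance data for the particular flight described in the text.
vs018 = InternationalFlight(
    flight_number="VS018",
    origin="New York (EWR)",
    destination="London (LHR)",
    departure_time="08:20, May 5, 2003",
    arrival_time="20:00, May 5, 2003",
    classes=["Economy"],
)
print(isinstance(vs018, Flight))
```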

With these semantics, you can type the following commands for your software agent:

flight origin: "New York" destination: "London"
departure: "May 5, 2003" arrival: "May 5, 2003"

Without a standard naming convention for concepts such as destination, your software agent cannot present your commands to Virgin Atlantic's server. In addition, it is important that British Airways' server understands these semantics as well so that you can search for tickets on that airline. When you model the concepts in a domain, such as the airline industry, and publish them, you are in essence creating an ontology.
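
A rough sketch of why shared names matter: the agent below turns the command into search criteria and matches them against records that use the same property names. The record list, the second flight number, and the function name parse_command are hypothetical.

```python
import re

def parse_command(command):
    """Turn 'name: "value"' pairs into a dictionary of search criteria."""
    return dict(re.findall(r'(\w+):\s*"([^"]*)"', command))

command = 'flight origin: "New York" destination: "London"'
criteria = parse_command(command)

# Hypothetical records published with the shared names origin/destination.
flights = [
    {"number": "VS018", "origin": "New York", "destination": "London"},
    {"number": "XX100", "origin": "New York", "destination": "Paris"},
]

# A record matches when every criterion agrees with the record's value.
matches = [f["number"] for f in flights
           if all(f.get(key) == value for key, value in criteria.items())]
print(matches)
```

If one airline published the same data under a different name, say "to" instead of "destination", the lookup would silently find nothing, which is the problem a published ontology is meant to prevent.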

The Semantic Web Architecture
Now that we've discussed both the vision of the Semantic Web and the necessity of ontologies for knowledge representation, we'll explore the implementation of the model.

There are several important steps in the workflow of the example we discussed above:
1.   Modeling the specifics of a resource such as Virgin Atlantic flight VS018 from New York to London. For this, we will discuss Resource Description Framework (RDF).
2.   Modeling the concepts of the entire airline industry. Here we'll consider Web Ontology Language (OWL) and how to map one ontology to another.
3.   Trusting that the information provided by an airline or a ticket broker is correct. We'll discuss digital signatures as well as an application for a trusted community known as Friend-of-a-Friend (FOAF).
4.   The first three points consider information and its validity, but what about the mechanics of sending commands and receiving results? This involves a discussion of software agents and Semantic Web services, an extension of Web services.

An excellent starting point for any discussion of the architecture of the Semantic Web is Tim Berners-Lee's diagram shown in Figure 2. Discussing the different layers of the diagram will take us through the implementation of our example.

Unicode and Uniform Resource Identifier
The Uniform Resource Identifier (URI) forms the foundation of the Semantic Web. The URI provides a unique identifier for any Web resource, and even for any object outside the Web; for example, a person can have a URI. The Semantic Web names every concept by a URI, thus letting anyone express new concepts with minimal effort and allowing definitions to be qualified by their sources.

Keeping Unicode as a foundation allows for the multiplicity of languages in which information is marked up throughout the globe. Unicode assigns a unique code point to each character and reserves room for more than a million of them, enough to cover the world's major writing systems.
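
As a small illustration of these two foundations, the sketch below splits a concept URI into its vocabulary and name, and round-trips labels in different scripts through UTF-8. The concept URI reuses the article's example.org namespace; the labels are made-up sample data.

```python
from urllib.parse import urldefrag

# A concept is named by a URI: the vocabulary document plus a fragment.
concept = "http://www.example.org/flight#Destination"
vocabulary, name = urldefrag(concept)

# The same concept can carry labels in any script; UTF-8 encodes them all.
labels = {"en": "Destination", "ja": "目的地", "el": "Προορισμός"}
encoded = {lang: text.encode("utf-8") for lang, text in labels.items()}
roundtrip = {lang: data.decode("utf-8") for lang, data in encoded.items()}
print(vocabulary, name, roundtrip == labels)
```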

XML, Namespaces, and XML Schemas
Due to its flexibility, its expressive power, and the ease with which it can be manipulated programmatically, XML (along with its associated technologies such as namespaces and schemas) is the most suitable syntax for a semantic language. The first contribution XML made to the Web was to separate content from representation; in the next iteration, XML is used to add metadata or meaning to content. Currently, the W3C is working on two main XML-based standards for the Semantic Web: Resource Description Framework (RDF) and Web Ontology Language (OWL). Once these standards become fully functional, parts of the Semantic Web should start to come together.

Resource Description Framework
RDF is one of the cornerstones of the efforts made in the direction of the Semantic Web. It is a language for representing information about resources on the World Wide Web, and its standard syntax is XML. RDF provides a data model for metadata, i.e., a common framework for expressing information that can be shared across applications. According to the W3C, RDF represents information "by generalizing the concept of a 'Web resource.'"

The RDF framework is built on three pillars:
1.   Resource: Anything that can have a URI; this includes all the Web's pages, as well as individual elements of an XML document. An example of a resource is http://www.example.org/flight.
2.   Property: A resource that has a name and can be used as a property, for example, Origin or Destination.
3.   Statement: The basic unit of the RDF data model, consisting of the combination of a resource, a property, and a value. For example, in the statement "flight VS018 has the origin New York", the resource is "VS018", the property is "Origin", and the value is "New York".
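
The three pillars above can be sketched as a simple data structure. The class name Statement, the helper values_of, and the use of the example.org namespace for the flight are assumptions made for illustration, not an RDF library API.

```python
from typing import NamedTuple

class Statement(NamedTuple):
    resource: str   # the thing being described, identified by a URI
    prop: str       # the property, itself a named resource
    value: str      # the value of the property for that resource

BASE = "http://www.example.org/flight#"   # namespace assumed for illustration

statements = [
    Statement(BASE + "VS018", BASE + "Origin", "New York"),
    Statement(BASE + "VS018", BASE + "Destination", "London"),
]

def values_of(resource, prop):
    """Return every value stated for a resource/property pair."""
    return [s.value for s in statements
            if s.resource == resource and s.prop == prop]

origin = values_of(BASE + "VS018", BASE + "Origin")
print(origin)
```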

Virgin Atlantic stores the information about flight VS018 from New York to London in XML-based RDF in the manner shown in Listing 1. The RDF in Listing 1 describes our flight from New York to London very accurately, but what if you're going to another conference in San Diego in June and you want to search for a flight to San Diego?

Software agents require that similar concepts be described in the same manner in order to search information efficiently. In other words, each industry has to design the metamodel of the information pertinent to its domain. This calls for a schema to constrain and formalize the language of the RDF, i.e., to specify what constitutes a generic "flight." For the Semantic Web, we can use RDF schemas, or even better, we can use OWL to model ontologies written in RDF.
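
To make the idea concrete, here is a minimal sketch that reads statements out of a small RDF/XML document. The document is a hypothetical stand-in and is not the article's Listing 1. The property names origin and destination and the function name triples are assumptions, and the code handles only the simplest RDF/XML shape rather than the full syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical RDF/XML describing the flight.
rdf_xml = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:f="http://www.example.org/flight#">
  <rdf:Description rdf:about="http://www.example.org/flight#VS018">
    <f:origin>New York</f:origin>
    <f:destination>London</f:destination>
  </rdf:Description>
</rdf:RDF>
"""

def triples(xml_text):
    """Yield (resource, property, value) for each literal property element."""
    root = ET.fromstring(xml_text)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        resource = description.get(f"{{{RDF_NS}}}about")
        for element in description:
            # ElementTree spells qualified names as {namespace}local.
            prop = element.tag.replace("{", "").replace("}", "")
            yield resource, prop, element.text

result = list(triples(rdf_xml))
for triple in result:
    print(triple)
```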

Web Ontology Language
OWL plays a role for RDF similar to the one XML Schema plays for XML: it allows the definition of new vocabularies and ontologies that are written in RDF. According to the W3C, OWL is "intended to provide a language that can be used to describe the classes and relations between them that are inherent in Web documents and applications." Just as RDF has triples of subject, predicate, and object, OWL has classes and properties and constraints on the way those classes and properties can be employed. A set of OWL assertions loaded into a reasoning system is called a knowledge base (KB). OWL is used to publish and share ontologies on the Web.

Let's look at our example and discuss inheritance, cardinality, and association. One example of a class is Flight. A Flight class has properties such as Number, Origin, Destination, Departure, Arrival, and Class.

Namespaces are used at the top of the .owl file to specify the origin of the various vocabularies used in the document.

<rdf:RDF
xmlns = "http://www.example.org/flight#"
xmlns:owl = "http://www.w3.org/2002/07/owl#"
xmlns:rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs= "http://www.w3.org/2000/01/rdf-schema#"
xmlns:xsd = "http://www.w3.org/2001/XMLSchema#">

Once namespaces are established we begin with an assertion that what follows is an OWL ontology.

<owl:Ontology rdf:about="http://www.example.org/flight"/>

The most basic concepts in a domain should correspond to classes that are the roots of various taxonomic trees. Every individual in the OWL world is a member of owl:Thing. Thus each user-defined class is implicitly a subclass of owl:Thing. Domain-specific root classes are defined by simply declaring a named class. For our sample flight domain, we create one root class: Flight.

<owl:Class rdf:ID="Flight"/>

Say we want to divide our flights into international and domestic flights. Hence, InternationalFlight will be a subclass of Flight. We express this in OWL in the following manner:

<owl:Class rdf:ID="InternationalFlight">
<rdfs:subClassOf rdf:resource="#Flight" />
</owl:Class>
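
What a reasoner can do with such a subclass declaration can be sketched in a few lines. The table and the function name is_subclass are illustrative assumptions and do not represent a real OWL reasoner.

```python
# Direct subclass links, as rdfs:subClassOf statements would assert them.
SUBCLASS_OF = {
    "InternationalFlight": "Flight",
    "Flight": "Thing",   # every user-defined class sits under owl:Thing
}

def is_subclass(cls, ancestor):
    """Follow subClassOf links upward until the ancestor is found or we run out."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

print(is_subclass("InternationalFlight", "Thing"))
```

An agent asked for any Flight can therefore also return international flights, even though they were never labeled Flight directly.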

Properties are used to make assertions about the members of classes. There are two types of properties: object properties, which relate individuals to other individuals, and datatype properties, which relate individuals to data values. Properties can also express constraints between elements. Two kinds of restrictions that can be used are the domain of a property and the range of a property.

<owl:DatatypeProperty rdf:ID="flightsPerDay">
<rdfs:domain rdf:resource="#InternationalFlight"/>
<rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#integer"/>
</owl:DatatypeProperty>

We're defining a datatype property that specifies how many flights per day occur for an international flight; its domain is InternationalFlight and its range is restricted to an integer.
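
One way to picture domain and range is as a check on each statement that uses the property. Strictly speaking, a reasoner uses domain and range to infer types rather than to reject data, so the sketch below is a simplification that follows the article's reading of them as restrictions. The table layout and the name check_statement are hypothetical.

```python
PROPERTIES = {
    # property name: (domain class, Python type standing in for the range)
    "flightsPerDay": ("InternationalFlight", int),
}

def check_statement(subject_class, prop, value):
    """True when the subject is in the property's domain and the value in its range."""
    domain, range_type = PROPERTIES[prop]
    return subject_class == domain and isinstance(value, range_type)

ok = check_statement("InternationalFlight", "flightsPerDay", 1)
bad_range = check_statement("InternationalFlight", "flightsPerDay", "one")
print(ok, bad_range)
```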

It's possible to specify the type of association in terms of cardinality between two entities. In our example, we place a cardinality restriction on the property flightsPerDay for an international flight, which says that an international flight has exactly one flightsPerDay value. This is important information for a software agent that is searching for options on which flight to get for the New York to London leg.

<owl:Class rdf:ID="InternationalFlight">
<rdfs:subClassOf rdf:resource="#Flight"/>
<rdfs:subClassOf><owl:Restriction>
<owl:onProperty rdf:resource="#flightsPerDay"/>
<owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">1</owl:cardinality>
</owl:Restriction></rdfs:subClassOf>
</owl:Class>
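
A cardinality restriction can be read as a count check on the values an individual has for a property. The sketch below assumes that reading; the dictionaries and the name satisfies_cardinality are made up for illustration.

```python
def satisfies_cardinality(individual, prop, exactly):
    """True when the individual has exactly `exactly` values for the property."""
    return len(individual.get(prop, [])) == exactly

# Hypothetical individuals, each mapping property names to lists of values.
well_formed = {"flightsPerDay": [1]}
ambiguous = {"flightsPerDay": [1, 2]}

print(satisfies_cardinality(well_formed, "flightsPerDay", 1))
print(satisfies_cardinality(ambiguous, "flightsPerDay", 1))
```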

Ontology Mapping
The key to ontologies is that they can be shared and therefore increase efficiency and interoperability. However, it is sometimes the case that two different organizations have two different names for the same concept, i.e., the ontologies are different. In such cases, the ability to map RDF schemas or ontologies is crucial to maintaining the advantages of the Semantic Web.

In OWL, there are several constructs that can be used for ontology mapping. Two of these are equivalentClass and equivalentProperty (called sameClassAs and samePropertyAs in the 2002 working drafts); they indicate that a particular class or property in one ontology is equivalent to a class or property in another ontology.

Say, for example, that British Airways calls a flight by the class name "AirJourney" while Delta refers to it as "Flight". How can a software agent know that the two mean the same thing (recall that semantics are about concepts and meanings)? Ontology mapping between the British Airways and Delta ontologies is required. An example is shown below.

<owl:Class rdf:ID="Flight">
<owl:equivalentClass rdf:resource="AirJourney"/>
</owl:Class>
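
Once such an equivalence is declared, an agent can treat records from both airlines uniformly. The following is a minimal sketch of that idea; the mapping table, the flight numbers, and the function name normalize are hypothetical.

```python
# Equivalences an agent might have collected from mapping declarations.
EQUIVALENT = {"AirJourney": "Flight"}   # British Airways term -> Delta term

def normalize(class_name):
    """Map a class name to its agreed equivalent, if one is declared."""
    return EQUIVALENT.get(class_name, class_name)

# Hypothetical records: (airline, class name used by that airline, flight number)
records = [("British Airways", "AirJourney", "BA0001"),
           ("Delta", "Flight", "DL0001")]

flights = [number for _, cls, number in records if normalize(cls) == "Flight"]
print(flights)
```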

Web of Trust
We can model the information, but how do we trust the information that we get from the Semantic Web, and how do we protect our information? If my software agent finds two travel agents, and one says the price for a Virgin Atlantic ticket is $180 while the other says the price is $210, whom do I believe? In the Semantic Web, we depend on digital signatures and community networks.

Digital signatures
Digital signatures are necessary to ensure that the information that claims to be coming from a source was not tampered with before it got to you, and that its origin was indeed the named source. Based on mathematics and principles of cryptography, digital signatures allow signed RDF documents to be trusted. According to the W3C, "The combination of metadata and digital signature capabilities will aid in building a genuine Web of Trust."

Digital signatures address the problems of message integrity, data origin authentication, signer authentication, and nonrepudiation of sending a message. Furthermore, signed XML combined with the RDF will provide a layer of authentic metadata that will improve search engine capability, support intelligent software agents, and create new ways of cataloging information for improved navigation.
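
The principle can be sketched with textbook RSA and deliberately tiny primes. This is a toy stand-in chosen so the example needs only the standard library; it is not the XML Signature format and must never be used for real security. The key values and function names are assumptions for illustration, and the modular inverse via pow requires Python 3.8 or later.

```python
import hashlib

# Textbook RSA with tiny primes: for illustration only.
P, Q = 61, 53
N = P * Q                              # public modulus
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (modular inverse)

def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Anyone with the public pair (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)

statement = b"VS018 origin New York"
signature = sign(statement)
print(verify(statement, signature))
```

With a realistic key size, altering either the statement or the signature makes verification fail, which is what gives a signed RDF statement its integrity and origin guarantees.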

FOAF (Friend of a Friend)
Even if we could verify that the information did come from a particular source, how would we decide whether to trust that source? One way is to trust sites that have been verified as trustworthy by organizations or even a community of your friends. The latter thought was the impetus behind the Friend-of-a-Friend (FOAF) idea.

FOAF falls under the rubric of social networking. The FOAF vocabulary allows you to specify the information necessary for membership in a community, such as name and e-mail address. However, you could augment this information to find out about the interests of other members, or even, in line with our argument above, to gather information regarding which site to trust.

FOAF provides one opportunity to build a prototype of the "Web of Trust" that Berners-Lee refers to in his Semantic Web roadmap.
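
One simple reading of a community-based Web of Trust is "trust a site if someone within a few acquaintance links of me vouches for it." The sketch below assumes that rule; the graph, the names in it, and the function within_hops are hypothetical, with the links standing in for FOAF's foaf:knows property.

```python
from collections import deque

# Hypothetical acquaintance and endorsement links.
KNOWS = {
    "me": ["alice", "bob"],
    "alice": ["travel-site-a"],
    "bob": [],
    "stranger": ["travel-site-b"],
}

def within_hops(start, target, max_hops):
    """Breadth-first search: is target reachable from start in max_hops links?"""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return True
        if hops < max_hops:
            for friend in KNOWS.get(node, []):
                if friend not in seen:
                    seen.add(friend)
                    queue.append((friend, hops + 1))
    return False

print(within_hops("me", "travel-site-a", 2))
print(within_hops("me", "travel-site-b", 2))
```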

Software Agents and Semantic Web Services
Ontologies comprise the knowledge representation component of the Semantic Web, but the Semantic Web is incomplete without software programs that can communicate with each other. We still need a mechanism by which a software agent goes to Virgin Atlantic and requests information on flights to London from New York on May 5, 2003.

The most suitable mechanism for invoking other applications on the Web with request parameters is Web services. Web services provide a layer of abstraction above software programs and allow services to be located and invoked across the Web. Programs written in various programming languages on different platforms can call each other using the Web services interface. The services offered by a company are published in a public registry that's used to locate them. Method invocations and results are communicated using the Web services messaging framework. Web services are important for universal interoperability and integration and will be the key enabler for software agents in the Semantic Web.

There are three main standards for Web services: WSDL (Web Services Description Language), an XML-based syntax that describes the functions provided by a Web service; UDDI (Universal Description, Discovery, and Integration), an XML-based syntax used to develop a registry of services that can be published on the Web; and SOAP (Simple Object Access Protocol), the most common protocol for carrying the messages that invoke Web services. Using Web services, Virgin Atlantic can publish its services in a registry and anyone who wants to call a find flight command can send a message to invoke the method published in its registry.
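
As a rough sketch of what such a message might look like, the code below builds a SOAP 1.1 envelope for a find-flight request. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the operation name findFlight, and its parameter names are assumptions, since the article does not define Virgin Atlantic's actual interface.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope
SVC_NS = "http://www.example.org/flight-service"        # hypothetical service

def build_request(origin, destination, date):
    """Build a SOAP envelope whose body carries one findFlight call."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}findFlight")
    for name, value in (("origin", origin),
                        ("destination", destination),
                        ("date", date)):
        ET.SubElement(call, name).text = value
    return ET.tostring(envelope, encoding="unicode")

request_xml = build_request("New York", "London", "2003-05-05")
print(request_xml)
```

In practice the operation and parameter names would come from the service's WSDL description rather than being chosen by the caller.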

DARPA (Defense Advanced Research Projects Agency) has been working on an extension of Web services known as Semantic Web services. Semantic Web services have a declarative, machine-readable API for services. The API would inform the agent of how to use the service, which parameters to provide, and what results will be returned. So Semantic Web services are an enhanced version of Web services; they formalize the language in which we describe and call Web services. DARPA has developed DAML-S, an ontology or semantics for describing the properties and capabilities of Web services. DAML-S sits at the application level above WSDL and describes what is being sent as opposed to describing only how it is being sent (which is the functionality provided by WSDL). In other words, DAML-S will complement WSDL.

Using DAML-S, users will not have to specify the Web service that they want; the software agent will be able to discover the capability required by the consumer by looking at the declarations of the capabilities advertised by the marked-up Web services. Next, it will compose tasks itself, i.e., it will both find the flight and buy the ticket for it (the composition of complex tasks is not possible with the current state of Web services). Semantic Web services would, therefore, greatly enhance the capability of software agents to search for particular services and are an important step in the direction of implementing the Semantic Web.
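
Discovery and composition can be sketched as a lookup over advertised capabilities. The advertisements below are plain dictionaries written in the spirit of a capability description, not DAML-S syntax; every service name, capability name, and function name is hypothetical.

```python
# Hypothetical capability advertisements published by services.
SERVICES = [
    {"name": "VirginFlightSearch", "capability": "find_flight"},
    {"name": "VirginTicketing", "capability": "buy_ticket"},
    {"name": "HotelFinder", "capability": "book_room"},
]

def discover(capability):
    """Return the names of all services advertising the capability."""
    return [s["name"] for s in SERVICES if s["capability"] == capability]

def compose(capabilities):
    """Pick the first advertised service for each step of a larger task."""
    plan = []
    for capability in capabilities:
        providers = discover(capability)
        if not providers:
            raise LookupError(f"no service advertises {capability!r}")
        plan.append(providers[0])
    return plan

plan = compose(["find_flight", "buy_ticket"])
print(plan)
```

The user states only what needs to happen; the agent works out which services to call and in what order.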

Present Efforts and Future Directions
The Semantic Web is the second-generation Web. It weaves together a network of information, which allows more efficiency, greater knowledge-sharing, and ease of use. Ontologies are the key to this interoperability because they determine the language software agents will need to communicate with each other and humans will need to communicate with the agents. As we have seen in this article, the semantics necessary for ontologies are written in XML, or more specifically RDF and OWL using XML syntax.

There are three factors necessary for the success of the Semantic Web: first, the establishment of standards by the W3C; second, the development of technologies that facilitate the implementation of software agents and other aspects of Berners-Lee's vision; and third, the production of tools that encourage people to adopt the technologies that will facilitate the universality of the Semantic Web.

The W3C, led by Tim Berners-Lee and Eric Miller, has made great progress in the standards established for the Semantic Web. In 2002, several new recommendations and working drafts emerged for OWL and RDF, the two main standards for the Semantic Web. Detailed examples and guides are provided for users who want to mark up their information on the Web.

Technologies such as Web services and digital signatures are just two examples of relatively recent developments that will greatly facilitate the implementation of the Semantic Web. Examples of implementations include MusicBrainz (www.musicbrainz.org), which provides an encyclopedia of music marked up in RDF; Friend-of-a-Friend (http://xmlns.com/foaf/0.1), which uses RDF to mark up the identity of community members and provides a basis for a Web of Trust; and the Retsina Calendar Software Agent (www.softagents.ri.cmu.edu/Cal), which is an agent developed for calendar scheduling by Carnegie Mellon University.

Regarding encouraging people to mark up their Web information, I tend to agree with James Hendler, Professor at the University of Maryland and a prolific writer on the Semantic Web, that "ideally, most users shouldn't even need to know that Web semantics exist." Tools must be constructed that automatically pop up forms for ontology linkages in order to overcome the initial hesitation that people have in learning semantic markup languages. DARPA is funding a number of such free tools so people will mark up their Web pages. One example is an ontology editor, Protégé, developed by Stanford University, which is free and available for download from the Stanford Web site.

Of course, we have spoken of more than just individual Web pages; in our hypothetical example, we considered the importance of ontologies and these are usually developed by industry consortiums. Luckily, creating ontologies is something that is already underway. Many industries have realized that they need industry standards to facilitate inter- and intra-firm communication. One example of this is FpML (Financial Product Markup Language), an ontology for financial instruments written in XML syntax. Its goal is to establish a representation of concepts - an ontology - for all financial trading firms to be able to use for their trading purposes. If industry consortiums that are creating ontologies in XML instead write them in RDF, which is an XML-based syntax, they will have taken the first but important step toward creating the Semantic Web.

These efforts all point to the growing importance, and in my mind, the inevitability of the establishment of the Semantic Web. Just because it sounds like science fiction doesn't mean it's impossible. The Semantic Web is an incredibly exciting and promising place for developers to work. It will revolutionize the way we interact, live, and do business today. If you have seen movies like The Matrix and Minority Report, you've glimpsed the new kind of artificial intelligence that uses the Web to process information rapidly and automatically. Who knows? One day, you might very well be able to just speak to your small Palm Pilot or laptop instead of typing in the commands. Even today, companies such as IBM produce simple voice recognition software that allows you to speak to your computer. The key is that the computer needs a defined set of semantics for it to understand your commands and be able to communicate with other software programs on the Web. For now, the W3C is defining standards, new technologies like Web services and XML Schemas have emerged that will make the transition easier, and industries and companies are focusing on making better models to represent their knowledge.

I predict that industries will develop ontologies that will be used for their internal communication. Eventually, each industry such as financial services, retail, and shipping will merge its internal ontologies and present a coherent protocol for communication with its systems. At that point, the Semantic Web will evolve from existing in pockets to becoming a universal infrastructure. Eventually, with increasingly unambiguous markup of Web content, the Semantic Web will evolve into Tim Berners-Lee's vision of "an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."


References
  • Berners-Lee, Tim. (1998). "Semantic Web Roadmap": www.w3.org/DesignIssues/Semantic.html
  • Resource Description Framework: www.w3.org/RDF/
  • Dumbill, Ed. (2000). "The Semantic Web: A Primer": www.xml.com/pub/a/2000/11/01/semanticweb
  • Palmer, Sean B. (2001). "The Semantic Web: An Introduction": http://infomesh.net/2001/swintro
  • Web Ontology Language (OWL) Guide (Version 1.0, W3C Working Draft 4 November 2002): www.w3.org/TR/2002/WD-owl-guide-20021104
  • Semantic Web Activity: Advanced Development: www.w3.org/2000/01/sw
  • Cowles, Paul. (2002). "Web Services and the Semantic Web." Web Services Journal. December: www.sys-con.com/webservices/article.cfm?id=419
  • Hendler, James. (2001). "Agents and the Semantic Web." IEEE Intelligent Systems. Number 2.
  • Berners-Lee, Tim; Hendler, James; and Lassila, Ora. (2001). "The Semantic Web." Scientific American. May: www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21
  • DARPA Agent Markup Language: www.daml.org
  • "DAML-S: Web Service Description for the Semantic Web" by The DAML Services Coalition. The First International Semantic Web Conference (ISWC), Sardinia (Italy), June 2002: www.daml.org/services/ISWC2002-DAMLS.pdf
  • McIlraith, S.; Son, T.C.; and Zeng, H. (2001). "Semantic Web Services." IEEE Intelligent Systems. March/April: www.daml.org/services/ieee01-KSL.pdf
  • Dumbill, Ed. (2002). "XML Watch: Finding Friends with XML and RDF." IBM developerWorks. June: www-106.ibm.com/developerworks/xml/library/x-foaf.html

About the Author
Ayesha Malik is a Senior Consultant at Object Machines, a software engineering firm providing Java technology and XML solutions to businesses. Ayesha has worked extensively on large XML and messaging systems for companies such as Deutsche Bank and American International Group (AIG). Most recently, she has been researching new ways to make schemas extensible and object-oriented.
