The Semantic Web is a hot topic in information circles today, and its adoption will largely depend on stakeholders understanding its potential benefits and tools vendors providing an easy entry for developers to learn and work with its related technologies.
The Semantic Web Vision
Imagine this scenario. You're a software consultant, and today you're taking a working lunch with one of your biggest clients. Her company has an emergency project at one of its remote offices, and they need your consulting services there for the next two weeks. You need to get there as soon as possible to begin work, so you take out your hand-held computer, activate its Semantic Web agent, and instruct it to book a nonstop flight that leaves before 10 a.m. the next day. You want an aisle seat if it's available. Once the agent finds an acceptable flight with an available aisle seat, it books it using your credit card and assigns the charges to your client's account in your accounting application. It also warns you that you'll be missing a dentist appointment back home and adds a note to your calendar reminding you to reschedule. Next, you specify that you want a car service to the client's site, so the agent scans the availability of limos in the area with "very good" or higher service ratings and books an appointment to have you picked up 30 minutes after your flight lands. The agent also books you at your favorite hotel chain, automatically securing the lowest rate using your rewards card number. Finally, your agent updates your calendar with your trip information and prints out your confirmation documents back at your office.
With just a few clicks your Semantic Web software agent found and booked your flight, hotel, and car service, then updated your accounting system and calendars automatically. It even compared your itinerary to your calendar and detected the scheduling conflict with your dentist appointment. To do all this, the agent had to find, interpret, combine, and act on information from multiple sources and multiple, disparate applications and data repositories. During those few minutes it takes your software agent to book your trip, you wonder what you ever did before the Semantic Web. This example, of course, is a long-term vision for applying the Semantic Web. It's one that may or may not come to fruition - only the future will tell. However, the vision itself is important for understanding the potential of Semantic Web technologies.
The Semantic Web is currently the focus of W3C Working Groups (www.w3.org/2001/sw/) and is considered the next step in Web evolution. In the Semantic Web, data itself becomes part of the Web and can be processed independently of application, platform, or domain. This is in contrast to the World Wide Web as we know it today, which contains virtually boundless information in the form of documents. We can use computers to search for keywords in these documents, but search results still have to be read and interpreted by humans before any useful information can be extracted. Computers can present you with information but can't understand it well enough to display the data that is most relevant in a given circumstance. The Semantic Web, on the other hand, is about having data as well as documents on the Web so that machines can process, transform, assemble, and even act on the data in useful ways. To accomplish this, the Semantic Web relies on structured sets of information and inference rules that allow applications to "understand" the relationships between different data resources.
The true impact of the Semantic Web will not be known for quite some time, but some proponents have asserted that it will lead to the evolution of human knowledge itself by allowing people and machines - for the first time - to quickly filter and combine the massive amounts of data that exist in the world in a relevant, productive way.
As with any potentially revolutionary technology, the scope of the Semantic Web's evolution will depend on several factors, including industry buy-in, the technology learning curve, and the availability of productive Semantic Web development tools. The first factor, industry buy-in, is largely dependent on the other two: developer proficiency and tools support.
Spinning Meaning and Relationships
The Semantic Web is a "web of data" that not only harnesses the seemingly endless amount of data on the World Wide Web, but also connects that information with data in relational databases and other non-interoperable data repositories. Considering that relational databases house the majority of enterprise data today, the ability of Semantic Web technologies to access and process them alongside data from Web sites, other databases, XML documents, and other systems greatly increases the amount of data available for processing. In addition, relational databases can adapt easily to the Semantic Web model since they already include a great deal of semantic information. Database tables and columns are created based on the relationships between the data they house, and this organization reveals some of the meaning - the semantics - of the data.
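To illustrate how a table's layout already carries some semantics, the following minimal sketch turns each row of a relational table into (resource, property, value) statements. The `employees` table, its columns, and the `http://example.org/` namespace are hypothetical names chosen for illustration, and the mapping is only one simple possibility, not a standard one.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace, for illustration only

def rows_to_statements(conn, table, key_column):
    """Turn each row of `table` into (resource, property, value) statements.

    The table name is interpolated directly, so it must come from trusted code.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    statements = []
    for row in cursor:
        record = dict(zip(columns, row))
        # The primary key identifies the resource the row describes.
        resource = f"{BASE}{table}/{record[key_column]}"
        for column, value in record.items():
            if column != key_column and value is not None:
                # Each remaining column becomes a property of that resource.
                statements.append((resource, f"{BASE}{table}#{column}", value))
    return statements

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, office TEXT)")
conn.execute("INSERT INTO employees VALUES (1, 'Ada', 'Boston')")
statements = rows_to_statements(conn, "employees", "id")
for statement in statements:
    print(statement)
```

The column names do the work of property names here, which is the sense in which a relational schema already holds semantic information.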
Implementing the Semantic Web requires adding semantic metadata to the information resources that are available on the Web or on internal networks. This will allow machines to effectively process the data based on the semantic information that describes it. When there is enough semantic information associated with data, computers can make inferences about the data, i.e., understand what a particular data resource is and how it relates to other data.
XML has paved the road by adding some metadata in the form of human-readable tags that describe data. In addition, XML documents can include information about the author of a Web page, relevant keywords for search engine optimization, and the software tools used to create the XML file, for example.
Before XML, data was stored in flat file and database formats, where most data was proprietary to an application. XML came along and made data interoperable within a single domain, i.e., within the domain defined by a schema or a set of related schemas that define the structure of related documents. By itself, XML provides syntactic interoperability only when both parties know and understand the element names used. If I label an element <price>12.00</price> and someone else labels it <cost>12.00</cost>, there's no way for a machine to know if those are the same thing without the aid of a separate application to map between the elements. Semantic Web technologies help address this problem by making tags understandable not just to humans - but to machines as well.
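The price/cost mismatch can be reproduced in a few lines. This is a minimal sketch using Python's standard XML parser; the `<item>` wrapper element and the `SYNONYMS` table are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Two parties describe the same fact with different element names.
doc_a = ET.fromstring("<item><price>12.00</price></item>")
doc_b = ET.fromstring("<item><cost>12.00</cost></item>")

def read_price(item):
    """Look up <price> by name; returns None when the element is absent."""
    node = item.find("price")
    return None if node is None else node.text

# A hand-written mapping: the "separate application" needed to relate the tags.
SYNONYMS = {"cost": "price"}

def read_price_mapped(item):
    """Look up the price after translating known synonyms to a canonical name."""
    for child in item:
        if SYNONYMS.get(child.tag, child.tag) == "price":
            return child.text
    return None

print(read_price(doc_a))         # found by name
print(read_price(doc_b))         # not found: the parser cannot know cost means price
print(read_price_mapped(doc_b))  # found only because of the explicit mapping
```

Every new synonym requires another entry in the mapping, which is the maintenance burden that shared, machine-readable vocabularies aim to reduce.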
The first step required for machines to understand data is to get that data into a uniform format, where, for instance, a field labeled "street" always has the same format and contains the same type of information, and so on. This type of functionality can be found today on Web sites that use forms that allow users to enter information and run a query, such as airline sites that allow visitors to search for and book flights based on a variety of criteria. However, considering the amount and variety of data available from different sources today, this method of data typing does not scale beyond very specific applications.
The next step towards the Semantic Web requires data from multiple domains to be classified based on its properties and its relationship with other data. This is where Semantic Web technologies such as RDF and OWL come in.
RDF and OWL
RDF (Resource Description Framework) is the W3C standard that forms the basis for the Semantic Web. It defines a data model that is commonly serialized as XML (RDF/XML), although other serializations such as N-Triples exist. RDF statements describe a resource (identified by a URI), the resource's properties, and the values of those properties. RDF statements are often referred to as "triples" that consist of a subject, predicate, and object, which correspond to a resource (subject), a property (predicate), and a property value (object). A triple written in plain text is depicted in Figure 1.
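The subject-predicate-object structure can be sketched with a plain data type. The URIs and values below are hypothetical and are not taken from Figure 1; the sketch models only the shape of a triple, not a full RDF implementation.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # the resource being described, identified by a URI
    predicate: str  # the property of that resource
    obj: str        # the value of the property

# Hypothetical statements about one article resource.
triples = [
    Triple("http://example.org/articles/semantic-web",
           "http://example.org/terms#author", "Jane Doe"),
    Triple("http://example.org/articles/semantic-web",
           "http://example.org/terms#topic", "Semantic Web"),
]

def values_of(triples, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]

print(values_of(triples,
                "http://example.org/articles/semantic-web",
                "http://example.org/terms#author"))
```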
By creating triples with subjects, predicates, and objects, RDF allows machines to make logical assertions based on the associations between subjects and objects. However, while RDF provides a model and a syntax (the rules that specify the elements of a sentence) for describing resources, it does not specify the semantics (the meaning) of the resources. To truly define semantics, we need RDFS or OWL.
RDFS (RDF Schema) allows developers to create vocabularies that describe groups of related RDF resources and the relationships between those resources. An RDFS vocabulary defines the allowable properties that can be assigned to RDF resources within a given domain, and it allows creation of classes of resources that share common properties.
In an RDFS vocabulary, resources are defined as instances of classes. A class is a resource too, and any class can be a subclass of another. This hierarchical semantic information is what allows machines to determine the meanings of resources based on their properties and classes.
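A class hierarchy of this kind can be sketched as follows. The class and resource names are hypothetical, and the sketch assumes each class has at most one direct superclass, which is a simplification: RDFS itself allows several.

```python
# Hypothetical vocabulary: each class maps to its direct superclass.
SUBCLASS_OF = {"Convertible": "Car", "Car": "Vehicle"}

# Each resource is declared as an instance of one class.
TYPE_OF = {"redConvertible": "Convertible"}

def superclasses(cls):
    """Walk up the hierarchy, collecting every ancestor of `cls`."""
    found = []
    # The second condition stops the walk if the data contains a cycle.
    while cls in SUBCLASS_OF and SUBCLASS_OF[cls] not in found:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found

def classes_of(resource):
    """Return the declared class of a resource plus all inherited classes."""
    direct = TYPE_OF[resource]
    return [direct] + superclasses(direct)

print(classes_of("redConvertible"))
```

Only one type was declared for the resource; its membership in the broader classes follows from the subclass relationships alone.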
Building upon RDFS is OWL, which is a much richer, more expressive standard for defining Semantic Web ontologies that formally define the hierarchies and relationships between different resources. Semantic Web ontologies consist of a taxonomy (system of classification) and a set of inference rules from which machines can make logical conclusions. OWL is used to assign properties to classes of resources, and their subclasses inherit the same properties. OWL also utilizes the XML Schema datatypes and supports class axioms such as subClassOf, disjointWith, etc., and class descriptions such as unionOf, intersectionOf, etc. Many other advanced concepts are included in OWL, making it the richest standard ontology description language available today. There are three flavors of OWL, each more expressive than the last: OWL Lite, OWL DL, and OWL Full. Developers choose the OWL dialect to work with based on how much expressiveness they need and how much restriction they can accept in their ontology.
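One way to build intuition for class descriptions such as unionOf and intersectionOf, and for the disjointWith axiom, is to treat each class as the set of its instances. The sketch below uses ordinary Python sets with hypothetical class and instance names; it is an analogy for what these constructs mean, not an OWL processor.

```python
# Hypothetical classes, each represented by the set of its instances.
sedans = {"car1", "car2"}
convertibles = {"car3"}
red_things = {"car2", "car3", "apple1"}

# unionOf: a resource is a car if it is a sedan or a convertible.
cars = sedans | convertibles

# intersectionOf: a resource is a red car if it is both a car and red.
red_cars = cars & red_things

def disjointness_violations(class_a, class_b):
    """Return the instances that break a disjointWith axiom between two classes."""
    return sorted(class_a & class_b)

print(sorted(red_cars))
print(disjointness_violations(sedans, convertibles))
```

An empty list from `disjointness_violations` means the data is consistent with declaring the two classes disjoint.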
Because RDF, RDFS, and OWL documents express hierarchies and relationships between resources, they are often created and conceptualized in a graphical manner to make the underlying relationships immediately obvious. Figure 2 shows an example of a simple RDF graph.
Even in this simple example, it's easy to see how a Semantic Web agent could make logical connections based on the defined relationships. For example, since the secret agent is Niki Devgood, and the secret agent drives a red convertible, it follows that Niki Devgood drives a red convertible.
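The reasoning in this example can be sketched as a single inference rule applied until no new facts appear. The predicate names `is` and `drives` are assumptions made for illustration; the actual labels used in Figure 2 may differ.

```python
# Facts as (subject, predicate, object); names are hypothetical.
facts = {
    ("secretAgent", "is", "NikiDevgood"),
    ("secretAgent", "drives", "redConvertible"),
}

def infer(facts):
    """Apply one rule repeatedly: if (a is b) and (a p o), then (b p o)."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for a, link, b in list(inferred):
            if link != "is":
                continue
            for subject, predicate, obj in list(inferred):
                if subject == a and predicate != "is":
                    new_fact = (b, predicate, obj)
                    if new_fact not in inferred:
                        inferred.add(new_fact)
                        changed = True
    return inferred

for fact in sorted(infer(facts) - facts):
    print(fact)  # the statements that were derived rather than asserted
```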
Complex ontologies are represented with multiple, interdependent graphs that visually reveal the relationships between resources. Once Semantic Web documents are defined and mapped out graphically, they must be coded in RDF/XML or N-Triples format to be accessed programmatically. Unfortunately, the manual coding process can be extremely tedious and error-prone, considering that even simple ontologies can represent hundreds of lines of code and given that neither RDF/XML nor N-Triples provides visual cues as to the hierarchy of the information contained therein. Developers need a way to translate their graphical ontology representations into RDF/XML or N-Triples easily, thus removing a significant barrier to Semantic Web adoption.
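To give a sense of what the N-Triples form looks like, the following minimal sketch serializes a few statements, one per line. The URIs are hypothetical, and the rule that any value starting with `http://` or `https://` is a resource rather than a literal is a shortcut assumed for this example; it also handles only plain string literals with basic escaping.

```python
def term(value):
    """Format an object as a URI reference or as a quoted literal."""
    # Assumption for this sketch: URL-looking values are resources.
    if value.startswith("http://") or value.startswith("https://"):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'

def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> {term(o)} ." for s, p, o in triples)

triples = [
    ("http://example.org/agent", "http://example.org/terms#drives",
     "http://example.org/redConvertible"),
    ("http://example.org/agent", "http://example.org/terms#name",
     "Niki Devgood"),
]
print(to_ntriples(triples))
```

The output is a flat list of statements with no nesting or indentation, which is why the hierarchy of an ontology is hard to see in this form.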
Semantic Web Evolution
Even when developers are armed with tools that make Semantic Web development practical, it's important to note that implementation of RDF, OWL, and the Semantic Web as a whole will be a gradual process. Questions about what the Semantic Web is and how it can benefit businesses and individuals echo the initial confusion about why we needed HTTP, HTML, and the basic Web infrastructure before "WWW" was a staple of our daily vocabulary. However, considering how those technologies have proliferated, it's likely that the Semantic Web vision is one that will be realized, even if it's on a small scale initially.
Though there are certainly far-fetched visions of Semantic Web technologies allowing your PC to talk to your refrigerator to auto-generate recipes and shopping lists, the number of scenarios that could potentially benefit from Semantic Web technologies as they continue to evolve is truly impressive. Think of the possibilities opened to everything from crime investigation, scientific research, and literary analysis, to shopping, finding long-lost friends, and vacation planning when computers can find, present, and act on data in meaningful, productive ways. Despite the theoretical possibilities, only time will tell if these advanced Semantic Web visions will come to fruition.
In the meantime, the W3C has put forth a list of practical use cases for Semantic Web technologies (www.w3.org/TR/webont-req/#section-use-cases), several of which are already in place today. For instance, semantic information can be added to resources on Web portals to improve information syndication and increase the productivity of searches within the portal. Because a Web portal generally includes information related to a narrow community of interest, it's well suited to ontological definition. The Semantic Web can also be used to describe non-textual resources, such as multimedia collections that contain audio, video, and other file types, making it considerably easier to locate, combine, and use these resources.
Another example is the Dublin Core Metadata Initiative, which applies Semantic Web technologies to create vocabularies that define the properties of informational resources, such as creator, format, creation date, description, and so on. The Dublin Core vocabulary is in use today in a wide variety of projects (a full list is available on dublincore.org).
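A Dublin Core description can be sketched as a set of statements whose properties come from the Dublin Core element namespace. The resource URI and the property values below are hypothetical; `creator`, `format`, `date`, and `description` are elements of the Dublin Core Metadata Element Set.

```python
# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC = "http://purl.org/dc/elements/1.1/"

def describe(resource_uri, **properties):
    """Build (resource, property, value) statements using Dublin Core elements."""
    return [(resource_uri, DC + name, value) for name, value in properties.items()]

# Hypothetical resource and values, for illustration only.
record = describe(
    "http://example.org/reports/q3.pdf",
    creator="Jane Doe",
    format="application/pdf",
    date="2006-01-15",
    description="Quarterly report",
)
for statement in record:
    print(statement)
```

Because the property names are shared URIs rather than locally chosen tags, two independent collections described this way can be searched with the same query.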
These few examples suggest that industry adoption is increasing, and that trend is likely to continue as productive RDF and OWL editors become available.
Visual Semantic Web Tools
Recognizing this need for practical Semantic Web development tools, Altova recently released SemanticWorks 2006. SemanticWorks is a visual RDF/OWL editor that allows developers to define RDF, RDFS, and OWL documents graphically. Its functionality comprises:
The graphical display includes informative icons that indicate item types, containers and collections (bag, sequence, etc.), class descriptions (unionOf, intersectionOf, etc.), class axioms (subClassOf, disjointWith, etc.), property descriptions (subPropertyOf, inverseOf, etc.), and more. These connectors can be inserted using a context-sensitive right-click menu or by selecting them from the toolbar. Yellow boxes encapsulate resources that are defined elsewhere in the ontology, and mouse-over hints display a connector's meaning or a resource's URI.
Based on the visual design, SemanticWorks generates the corresponding code in RDF/XML or N-Triples, depending on the user's preference. This allows developers to focus on the relationships they're defining while leaving the low-level code writing to the application, which reduces the semantic technology learning curve significantly. This is also helpful for viewing the impact of changes during editing, whether you change the visual design and view the corresponding code, or vice versa.
Because neither RDF/XML nor N-Triples code reflects an order or hierarchy, it's very difficult to understand the relationships between resources when an ontology definition is viewed in its text form. SemanticWorks removes this problem by representing RDF and OWL components visually. It separates vocabularies and ontologies into their logical parts in the visual design view to help users immediately understand the relationships between resources and work with all of the components that make up the definition in a logical manner. These ontology components are available on five tabs: Classes, Properties, Instances, allDifferent, and Ontologies. The Classes tab lists all of the classes available in the ontology with a separate window that lists the instances and properties of the selected class. All properties of the ontology are listed on the Properties tab, and a separate window below the tab lists the domain of the currently selected property. All instances are listed on the next tab, and the allDifferent tab lists the resources that are defined as mutually distinct. Last, the Ontologies tab lists all resources that are ontologies, including ontologies that have been imported into the current file. Figure 4 shows the ontology tabs view. Selecting any listed item opens the detailed design view (shown in Figure 3).
In effect, SemanticWorks allows users to graphically compose ontology drawings - which they might otherwise create by sketching on a notebook or whiteboard - with valuable editing help and automatic code generation. By providing a visual design paradigm and removing the need to manually write RDF/XML or N-Triples code, Altova SemanticWorks gives developers a practical, accessible entry into the Semantic Web.
Conclusion
A new generation of applications geared to take advantage of the Semantic Web is waiting to be built. Now, the availability of commercial development tools such as Altova SemanticWorks will help drive developer productivity and industry adoption, leading us one step closer to realizing the Semantic Web vision.
© 2008 SYS-CON Media Inc.