Semantics and Context

One of the core tenets of XML is its extensibility and flexibility

Although XML defines each data element in a given transaction (the semantics), there's no mechanism to also communicate the business context. This represents the difference between reading XML and understanding the business impact of the transaction. The use of namespaces, numeric values, and time stamps all create some context when looking across transactions or business entities. In this article we'll discuss the difference between semantics and context and the challenges this difference creates relative to performance and scalability.

One of the core tenets of XML is its extensibility and flexibility. XML facilitates these tenets because it's self-describing and has a DTD that provides the data structure necessary for reading the content of the associated document. This is the capability that sits at the core of the XML "hype versus hope" debate.

In today's increasingly dynamic business world this self-describing capability provides hope against obsolescence. By providing a mechanism for maintaining systems that can communicate changes in semantics as part of the transmission of a document, XML provides a way to reduce the maintenance cost for solutions based on changing business requirements. One of the main justifications for introducing XML into a solution is that this adaptability enables the core solution to scale and perform despite changes in requirements or exceptions to processes.

However, this same capability contributes to the hype surrounding XML. Take the assumption that if an XML document is self-describing and the tags associated with the description are clear, then the system responsible for reading the document is able to understand the implications of the change. A common mistake is to assume that this description in the XML actually describes the changes. In reality the change is only implicit - it's impossible to connote the intent of the change, understand the business context that required the change to occur, or automatically derive how a new data element should be handled.

This article looks at a specific business example, differentiates the context from the semantics, and discusses the issues around these differences with respect to namespaces, numeric values, and date/time stamps. By separating the semantics from the context, one can consider these issues with respect to maintenance costs to improve system performance and scalability.

The Problem Definition
Let's look at the communications associated with placing a request for a proposal, beginning with an enterprise seeking to purchase specific products from its suppliers. In the center of Figure 1 is the Enterprise that Betty Buyer works for. On the right, left, and bottom of Enterprise are potential suppliers of the widgets that Betty Buyer wants to purchase. Betty must submit a Request for Quote (RFQ) to each supplier before she makes the purchase. Betty's Enterprise manages the RFQ through a homegrown system.

To reach each supplier, the RFQ must be sent via the Exchange, then on to the appropriate supplier. To accomplish this, Betty, through her RFQ system, must know the name of each supplier within her namespace as well as how it resolves that name to the namespace of the Exchange. While Betty may know Sam's company as "Best Supplier," the Exchange may use "Bsupplier, Inc.," or the supplier's D&B number. Indeed, Best Supplier could participate either directly with the Enterprise, through the Exchange on the right of Figure 1, or as a participant in Enterprise's Private Exchange. In this case, Betty's RFQ system may need to maintain three different names for Best Supplier.
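The name-resolution burden described above can be sketched as a lookup table. This is a minimal illustration in Python, not a description of Betty's actual RFQ system: the channel names, the `resolve_supplier` helper, and the private-exchange identifier are hypothetical, while "Best Supplier" and "Bsupplier, Inc." are the names used above.

```python
# Hypothetical mapping from the name Betty uses locally to the name each
# channel expects. Channel keys and the private-exchange ID are assumptions.
SUPPLIER_ALIASES = {
    "Best Supplier": {
        "direct": "Best Supplier",
        "exchange": "Bsupplier, Inc.",
        "private_exchange": "SUPPLIER-0001",  # made-up identifier
    },
}


def resolve_supplier(local_name: str, channel: str) -> str:
    """Return the name a given channel uses for a locally known supplier."""
    try:
        return SUPPLIER_ALIASES[local_name][channel]
    except KeyError:
        raise LookupError(
            f"No {channel!r} name on file for supplier {local_name!r}"
        ) from None


print(resolve_supplier("Best Supplier", "exchange"))
```

Every new channel adds another column to this table for every supplier, and someone has to keep it current.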

To fully understand the contextual issues, let's look at the RFQ Betty sends through an intermediary like this Exchange. Betty wants to send the information in this RFQ to three of the potentially hundreds of registered suppliers. However, she doesn't want to duplicate the effort of creating the RFQ in the Exchange's system for every one of the many RFQs she creates. If the information is in XML, the Exchange can import and convert the RFQ into its system. But Betty still must address the RFQ to the appropriate suppliers as they're referenced in the Exchange's namespace. The Exchange has to resolve the business context - the product, terms, and delivery date communicated in the RFQ.

Since every participant in this scenario has a different software solution, XML is the ideal choice for communicating and translating the semantic information in the RFQ because XML is system independent. Betty Buyer creates an RFQ on Enterprise's RFQ system, which in turn creates an XML document that's similar to the example in Listing 1. Betty Buyer sends this document to all suppliers, then waits for a response from each. The self-describing nature of the XML RFQ enables each receiving supplier to map the tagged pairs of the RFQ data to its own internal system representation so the supplier can process the RFQ and respond in kind. But this isn't as simple as it sounds. Let's see what happens with the namespace, numeric, and date information that have contextual differences between the business entities.
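As a rough sketch of the mapping step, the following Python reads a small RFQ document and copies its tagged values into a supplier-side record. The element names, the sample values, and the internal field names are all invented for illustration; they are not taken from Listing 1.

```python
import xml.etree.ElementTree as ET

# Hypothetical RFQ; element names and values are assumptions for illustration.
RFQ_XML = """
<RFQ>
  <Buyer>Enterprise</Buyer>
  <Supplier>Bsupplier, Inc.</Supplier>
  <Item>
    <PartNumber>W-100</PartNumber>
    <Quantity>10000</Quantity>
  </Item>
</RFQ>
"""

# Supplier-side mapping from RFQ element paths to internal field names.
FIELD_MAP = {
    "Buyer": "customer_name",
    "Item/PartNumber": "sku",
    "Item/Quantity": "requested_qty",
}


def map_rfq(xml_text: str) -> dict:
    """Copy each mapped element's text into the supplier's internal record."""
    root = ET.fromstring(xml_text)
    record = {}
    for path, field in FIELD_MAP.items():
        element = root.find(path)
        if element is not None and element.text is not None:
            record[field] = element.text.strip()
    return record


print(map_rfq(RFQ_XML))
```

The mapping moves values across, but nothing in it says whether 10000 counts individual widgets or packs, or whose catalog the part number belongs to.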

Namespaces - What's in a Name?
Every system manages its own namespace. Names of companies, people, and items are all stored and referenced locally. When this is done by systems that must share information, the practice creates significant challenges. Let's consider how it impacts Betty Buyer's ability to buy from Sam Supplier.

To force a business context, the larger exchanges state that the transactions must be created in their environment, so all participants use the processes and business context enforced by each exchange. Consider how this works. An exchange maintains a centrally managed catalog that resolves part number, description, and pricing namespace issues. The exchange also maintains its own unique company namespace, document namespace, and all the business processes that allow two companies to coordinate the buying, selling, shipping, invoicing, and sometimes even the exchanging of funds.

But each exchange represents only a fraction of the entire market, so the practice of suppliers participating in multiple exchanges has become rather common. And despite the rapid growth in the number of exchanges that are currently available, many large enterprises believe they gain a competitive advantage by maintaining their own private exchange. A private exchange allows the enterprise to define and enforce its own unique business process and forces its suppliers to comply with that process.

While participating in multiple exchanges (both public and private) may seem to resolve the immediate business issue of gaining maximum market exposure and control of the business process, it actually defeats the purpose of having an intermediated marketplace. The intermediated marketplace was supposed to enable a wide range of suppliers to bid on an enterprise's RFQ, and the RFQ was supposed to be sent only once. If each enterprise participates in multiple marketplaces while also building its own private marketplace, then the only way to deliver on the promise of the intermediated marketplace is to enable all exchanges (both public and private) to communicate with each other. But this reintroduces the original namespace resolution issue - only an order of magnitude more complex.

One solution is to require that all B2B XML documents include data elements that are tied to a specific namespace. For example, a company may reference internal specifications by URL. In the earlier example, Betty's listing would need to tag part numbers as belonging to the Exchange's catalog so the receiving system could call that catalog. To handle this issue, a namespace specification that's based on the Uniform Resource Identifier (URI) standard has been suggested. While URIs can eliminate broken links and identify a resource universally and unambiguously, their use significantly complicates the processing of the XML document. An XML parser treats a namespace URI only as an identifier and isn't required to retrieve it, but a receiving system that needs the referenced resource, such as the Exchange's catalog entry for a part number, must follow the link while processing the document. Following even a single Internet link would introduce significant delays; when multiple links are referenced, the processing challenge becomes even more problematic.
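To make the tagging idea concrete, here is a minimal Python sketch that marks a part number as belonging to an exchange catalog by namespace. The URI `http://exchange.example/catalog`, the element names, and the part numbers are hypothetical. Note that the standard-library parser used here treats the namespace URI as a plain identifier and does not fetch it; any lookup against the catalog would be a separate call made by the receiving system, and that call is where network delay would enter.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI identifying the Exchange's catalog.
CATALOG_NS = "http://exchange.example/catalog"

RFQ_XML = f"""
<RFQ xmlns:cat="{CATALOG_NS}">
  <Item>
    <cat:PartNumber>EX-4471</cat:PartNumber>
    <PartNumber>W-100</PartNumber>
  </Item>
</RFQ>
"""

root = ET.fromstring(RFQ_XML)

# ElementTree exposes namespaced tags in the form "{uri}localname".
exchange_part = root.find(f"Item/{{{CATALOG_NS}}}PartNumber").text
local_part = root.find("Item/PartNumber").text

print(exchange_part, local_part)
```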

Products - What Do You Want?
The namespace issue isn't limited to company names. It becomes an issue for every object. At times this is made even more difficult by business practices. For example, it's not unusual for a supplier to offer the same product at different prices in different markets. One may be a spot market for inventory overstock, another may be an industry-specific market operated by a consortium, while an individual enterprise might have its own contract with a preferred supplier that guarantees 20% off list price.

But a supplier is unlikely to use the same part number across every exchange. This would make it too easy for buyers and competitors to track its pricing policies. Instead, in the case of our example, Best Supplier maintains a different catalog of products for each market. Some items use the same part number, while others don't. To make sure Betty is getting the best possible deal, she must submit the RFQ to all three markets. But to do this, she must also know what product number and description are used to reference this product within each market.

Numeric Values - How Much Do You Want?
A more basic contextual issue involves amounts. For example, let's say Betty Buyer has requested 10,000 widgets. If you look at Table 1, the Exchange catalog has one supplier selling widgets in bulk packs of 12. How does the conversion between these units occur? What's the mechanism for communicating optional units of measure in an XML transaction? If Betty is willing to buy an overage, she can get a better price. But none of this information is represented in the semantics of the XML document.
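The conversion itself is simple arithmetic once both sides agree on the unit. A minimal Python sketch, using the 10,000 widgets and packs of 12 from the example above (the function name is hypothetical):

```python
def packs_needed(requested_units: int, units_per_pack: int) -> tuple[int, int]:
    """Return (packs to order, overage in units) when only whole packs are sold."""
    # Integer ceiling division: round up to the next whole pack.
    packs = (requested_units + units_per_pack - 1) // units_per_pack
    overage = packs * units_per_pack - requested_units
    return packs, overage


packs, overage = packs_needed(10_000, 12)
print(packs, overage)  # 834 packs, 8 extra widgets
```

Rounding down instead gives 833 packs, or 9,996 widgets, four short of the request. Which rounding is acceptable is a business decision that the document's semantics don't carry.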

Again, there's a semantic approach to solving this contextual issue if standards are considered. Most specifications define units of measure as optional elements. Some even use attributes to communicate the unit for an amount in the document. However, these semantic capabilities are often associated with the accompanying business context. For suppliers who agree to use a standard, such as those from the Open Applications Group, Inc., or RosettaNet, the solution can rely on semantics that the standard has already mapped to these specific contextual issues. In Listing 2 we've added a UOM section to the XML that will allow for communication of the specific semantics.
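One way a unit-of-measure element might be read on the receiving side is sketched below in Python. The `uom` attribute, the unit codes `EA` and `PK12`, and the conversion table are assumptions for illustration; they are not taken from Listing 2 or from the OAGI or RosettaNet specifications.

```python
import xml.etree.ElementTree as ET

# Hypothetical unit codes mapped to the number of individual widgets each represents.
UNITS_PER_CODE = {"EA": 1, "PK12": 12}


def quantity_in_each(quantity_xml: str, default_uom: str = "EA") -> int:
    """Convert a <Quantity> element to individual units, honoring an optional uom."""
    element = ET.fromstring(quantity_xml)
    uom = element.get("uom", default_uom)
    if uom not in UNITS_PER_CODE:
        raise ValueError(f"Unknown unit of measure: {uom!r}")
    return int(element.text) * UNITS_PER_CODE[uom]


print(quantity_in_each('<Quantity uom="PK12">834</Quantity>'))
print(quantity_in_each("<Quantity>10000</Quantity>"))
```

The default unit applied when the attribute is absent is itself a piece of shared context: both parties must agree on it outside the document.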

Date and Time - When Do You Want It?
Finally, let's look at date and time as a function of the time zone you're operating in and the regional notation of the date format. The contractual issues associated with physical delivery create this challenge. For example, the suppliers responding to the RFQ may be communicating deliveries based on their time zone and region. Best Supplier is located on the West Coast, and their ability to deliver widgets, as stipulated in the RFQ, would require next-day shipping, so their shipping costs may be higher. By modifying the XML DTD to allow for separation of the elements of a date, we can address some of the issues associated with the date-specific semantics.
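A short Python sketch of both problems: the same numeric date read under two regional notations, and a delivery deadline seen from another time zone. The sample date, the element names `Year`, `Month`, and `Day`, the buyer's Eastern location, and the fixed UTC offsets are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# The same string means different days under different regional notations.
raw = "03/04/2001"
us_reading = datetime.strptime(raw, "%m/%d/%Y").date()  # March 4
eu_reading = datetime.strptime(raw, "%d/%m/%Y").date()  # April 3

# Separating the date elements removes the notation ambiguity.
DELIVERY_XML = (
    "<DeliveryDate><Year>2001</Year><Month>04</Month><Day>03</Day></DeliveryDate>"
)
node = ET.fromstring(DELIVERY_XML)
year, month, day = (int(node.findtext(tag)) for tag in ("Year", "Month", "Day"))

# Fixed offsets stand in for real time zones (no daylight-saving handling).
EASTERN = timezone(timedelta(hours=-5))
PACIFIC = timezone(timedelta(hours=-8))

# A 9:00 a.m. Eastern deadline is 6:00 a.m. for a West Coast supplier.
deadline_east = datetime(year, month, day, 9, 0, tzinfo=EASTERN)
deadline_west = deadline_east.astimezone(PACIFIC)

print(us_reading, eu_reading, deadline_west.isoformat())
```

Separated elements settle which number is the month, but not which time zone the day is measured in; that still has to be agreed on or carried in the document.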

However, anything beyond simple transformations requires a date data type. The solution for date and numerical issues will be much simpler when XML schemas arrive. The draft specification will allow for multiple data formats that in turn would allow the schema-based semantics to address these issues.

As you can see in each of these examples, although XML allows for easy resolution of the semantic differences between the business entities, the business context presents a greater challenge. Once the contextual issue is well understood, it becomes clear that XML can solve problems only within a finite domain, one identified by a shared context. For those who believed the hype generated by XML, this limitation is disappointing. For those wrestling with how to communicate with business partners, XML continues to deliver incredible business benefits:

  • Flexibility: Look at the date format and the ability to configure time zones.
  • Extensibility: Extend semantics to allow for the communication of context through optional elements and attributes.
  • Ease of use: Changes in context can be communicated through a revised DTD without breaking the overall solution.

Just because a document is self-describing and solutions can correctly read the document, it doesn't necessarily follow that the same system can understand the context. Many exchanges know full well the issues of business context in trying to create a single integrated catalog of products. Think of the product code for widgets and how that's represented to a supplier internally versus how it's presented to a buyer bidding on widgets through an exchange that represents hundreds of suppliers.

There are B2B specifications that allow for optional information for each of these capabilities. But the introduction of a new business context makes the entire solution more difficult to maintain. Hopefully, these standards will consolidate to a few, since adhering to multiple standards drastically reduces the efficiencies promised by adopting XML into your solution.

Certainly, one key is the completion of the specifications under review by the W3C. Consider the ability of XML schemas to differentiate between numbers, dates, and text - if combined with the ability of XQuery to calculate and transform content, one could see a tool set built on these ratified specifications that would allow for solutions to translate between business contexts just as XML is used to transform B2B transactions today. Additionally, initiatives like Universal Description, Discovery, and Integration (UDDI), which focus on the creation of namespaces in the Internet as opposed to the proliferation of additional flavors of the same solution, present shared namespaces where context can be published and shared by business entities. If we're to avoid the cynicism and backlash from our business sponsors when the XML-hype bubble breaks, we need to drive our solutions to minimize the propagation of multiple standards and dialects.

The final article in this series will focus on parsers. As the processing engine for XML, transaction parsers are core to the use of XML. We'll look at the scalability of these engines and their ability to efficiently handle the emerging dialects of XML. We'll also summarize the series.

If you'd like to discuss a particular aspect of this or any other topic, e-mail me at [email protected]

Glossary

SE·MAN·TICS

  • Linguistics: The study or science of meaning in language forms.
  • Logic: The study of relationships between signs and symbols and what they represent. In this sense, also called semasiology.
  • Semantics: The meaning of a string in some language, as opposed to syntax, which describes how symbols may be combined independent of their meaning.

CON·TEXT

  • The part of a text or statement that surrounds a particular word or passage and determines its meaning.
  • The circumstances in which an event occurs; a setting.

NAME·SPACE

  • The unique definition of companies and people within a business entity.
