Preparing for Tomorrow - Today
By: Ketan Petal
Dec. 21, 2000 12:00 AM
Today many companies are evaluating the application of XML to their technology initiatives. For all XML's potential, its performance, scalability, and accessibility implications need to be considered when developing an implementation strategy.
We all know that XML is the enabler for doing business over the Internet. Its self-descriptive nature simplifies the exchange of data between parties, making it a powerful standard for B2B communication. Yet, while this ease of interaction has many benefits - speeding transaction times, opening up new channels, establishing access to data never before attainable - it introduces a challenge that could actually inhibit XML's mass adoption. Namely, if XML is as successful as we expect it to be and the number of XML business transactions grows exponentially, can the currently installed B2B infrastructure support such unprecedented levels of activity?
This four-part series addresses this question by studying several system components likely to be impacted by the large volume of XML transactions generated in an automated commerce chain. The series will focus on key attributes that are critical for a successful B2B system and demonstrate that proper advance planning will ensure the scalability and performance of XML-based systems as XML transaction levels increase.
This first article explores what happens to a transaction when it's represented in XML and how that representation impacts the performance and scalability of B2B e-commerce systems. In addition, I'll identify the key attributes of a B2B-XML transaction and the impact of this additional context on performance and scalability. Understanding this cause-and-effect relationship is the first step in assessing the impact of using XML.
The Nature of XML
Figure 1 illustrates a typical business transaction, a purchase order (P.O.) between Buyer 1 and Supplier A for 100 widgets. The P.O. transaction content is shown in a printed format, as well as a more compressed delimited format. Historically, both of these formats have communicated the transaction between business partners. But though they contain the information required by the business user, they fail to provide the supporting information required by the user's application to automatically process and act on the content of the transaction. Typically, users rekey this information into their business application in order to process the transaction.
Now let's take a look at this same transaction when represented in XML. One advantage of XML is that it provides a description of the content in the transaction that's separate from the actual transaction data. This document type definition (DTD) describes the data contained in the transaction. In some cases the DTD is included within the body of the XML transaction; in others it's a separate file that's only referenced from the XML transaction. Listing 1 shows the same P.O. transaction in XML. Note that this transaction has a unique DTD - <!DOCTYPE PURCHASEORDER SYSTEM> - that's not included within the XML representation of this P.O.
While XML defines the alphabet (encoding) and grammar of a language, it doesn't provide any context, which is needed if a B2B conversation is to have any meaning. Many parallel efforts, driven by a wide range of emerging standards bodies, aim to create dictionaries that will provide context for specific business communities. These efforts hold the promise of providing the context that's lacking in the XML specification itself. One standards body, the Open Applications Group, Inc. (OAGI) (www.openapplications.org), defines its dictionary in terms of business object documents (BODs).
Each BOD represents a specific function within the business process. When the data from a P.O. transaction is represented in the form of an OAGI BOD, there will always be an increase in the size of the transaction. Listing 2 demonstrates this increase when the data from Listing 1 is encoded in the OAGI Process P.O. BOD. These examples demonstrate that the OAGI version of the XML is more comprehensive in its content and context than any of the formats shown previously in Figure 1 and Listing 1. The P.O. BOD is flexible enough to handle variations in language, time zone, and units of measure. This additional specification allows more flexibility and enables the recipient of the transaction to act more precisely while executing the order. The OAGI BOD attributes include:
It's Not the Size of the XML That Matters - It's What's in It That Counts
If the source transaction is increased by a factor of 10, then on a linear basis one can predict the impact. If the current B2B capabilities support 100 transactions per supplier per day, then 10 suppliers require 600MB of capacity for the XML transactions alone over the course of a year. Compare this to the 40MB required when the transaction is in a delimited form, and the effect of XML utilization on bandwidth and storage capacity becomes apparent. Note that these figures don't take into account the overhead associated with the indexing, filtering, and segmentation of the transaction for the purpose of retrieval, nor do they account for the size of the DTD, which is accessed with the transaction at the time of processing.
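The linear estimate described above can be sketched as a small calculation. The article gives only annual totals, so the per-transaction sizes below (110 bytes delimited, 1,650 bytes in XML) and the 365-day year are assumptions, chosen only so that the totals land near the 40MB and 600MB figures quoted in the text.

```python
def annual_bytes(tx_per_supplier_per_day, suppliers, bytes_per_tx, days_per_year=365):
    """Linear estimate of one year's raw transaction volume, in bytes."""
    return tx_per_supplier_per_day * suppliers * bytes_per_tx * days_per_year


# Hypothetical average transaction sizes; the article does not state them.
DELIMITED_BYTES = 110
XML_BYTES = 1_650

# Volumes from the text: 100 transactions per supplier per day, 10 suppliers.
delimited_total = annual_bytes(100, 10, DELIMITED_BYTES)
xml_total = annual_bytes(100, 10, XML_BYTES)

print(f"delimited: {delimited_total / 1e6:.1f} MB per year")
print(f"XML:       {xml_total / 1e6:.1f} MB per year")
print(f"growth:    {xml_total / delimited_total:.0f}x")
```

Because the model is linear, doubling the number of suppliers or the daily volume doubles both totals. As the text notes, indexing overhead and the DTD itself would come on top of these raw figures.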
All Tags Are Equal... But Some Are More Equal Than Others
The application that receives an XML transaction must typically take some action based on a numeric calculation performed on the data contained in the transaction. Since numeric fields in XML are represented as text, the data must first be converted into a numeric representation before the application can perform its calculation. The performance implications of this XML implementation detail are more difficult to predict. In general, numeric calculations are more performance intensive than date calculations. Text calculations are the least intensive. To estimate the impact of this attribute, the logic applied to a given transaction must be broken down by the three types of discrete elements in the transaction: number, date, and text. The more calculations based on a number or date element, the greater the performance and scalability impact to the system.
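A minimal sketch of that conversion step, using Python's standard library. The element names (POLINE, QUANTITY, UNITPRICE, NEEDDATE) and values are invented for illustration and are not taken from the article's listings or from the OAGI BOD.

```python
import xml.etree.ElementTree as ET
from datetime import date
from decimal import Decimal

# Hypothetical P.O. line; element names and values are illustrative only.
PO_LINE = """
<POLINE>
  <ITEM>widget</ITEM>
  <QUANTITY>100</QUANTITY>
  <UNITPRICE>2.50</UNITPRICE>
  <NEEDDATE>2001-01-15</NEEDDATE>
</POLINE>
"""

line = ET.fromstring(PO_LINE)

# Every value arrives as text. Text elements can be used as they are...
item = line.findtext("ITEM")

# ...but number and date elements must be converted before any calculation.
quantity = int(line.findtext("QUANTITY"))
unit_price = Decimal(line.findtext("UNITPRICE"))
need_date = date.fromisoformat(line.findtext("NEEDDATE"))

extended_price = quantity * unit_price
print(item, extended_price, need_date)
```

Each conversion is extra work per element per transaction, which is why the mix of number, date, and text elements matters when estimating load.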
Validate Now or Validate Later... Either Way, You Must Pay
One way to improve performance is to embed the logic wherever possible in the DTD, as required elements or through qualifiers. While this reduces the runtime processing, it also makes validating the transaction against the DTD more processor intensive, since there are additional constraints to test the transaction against. This impacts performance and affects the scalability of the system. Flexibility presents some additional challenges or requires an extra step, which is eliminated as soon as the transaction is validated.
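To illustrate the "validate later" side of this trade-off, here is a rough sketch of a required-element check performed by the application at runtime. Python's standard-library ElementTree parser checks well-formedness but does not validate against a DTD, so the check is written by hand. The element names and the REQUIRED list are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule: elements the application insists on. If the DTD declared
# them as required, a validating parser would enforce this up front instead.
REQUIRED = ("BUYER", "SUPPLIER", "ITEM", "QUANTITY")


def missing_elements(xml_text, required=REQUIRED):
    """Return the required child elements that are absent from the transaction."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    return [name for name in required if root.find(name) is None]


complete = (
    "<PURCHASEORDER><BUYER>Buyer 1</BUYER><SUPPLIER>Supplier A</SUPPLIER>"
    "<ITEM>widget</ITEM><QUANTITY>100</QUANTITY></PURCHASEORDER>"
)
incomplete = "<PURCHASEORDER><BUYER>Buyer 1</BUYER><ITEM>widget</ITEM></PURCHASEORDER>"

print(missing_elements(complete))
print(missing_elements(incomplete))
```

Either way the check costs processor time: moving the rule into the DTD shifts the work to validation, while keeping it in the application shifts it to runtime logic like the function above.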
Planning for Performance and Scalability
For example, consider the choice between a system that stores XML transactions and one that resolves transactions into a specific database schema. Considering the performance implications of XML's key attributes, it's important to weigh the need to accommodate change in the transaction, and variations among supplier transactions, against the need to resolve the transactional content into its constituent data types and relationships for application performance.
If an XML system must accept a significant volume of transactions, and those transactions are consistent across suppliers, then the need for adaptability is secondary. In this case the system could be designed to utilize XML as an interoperability layer, and the transactions can be resolved into a database as they arrive.
Mapping directly into a database structure, however, isn't always appropriate. For example, if there are many different types of transactions across a range of time with many systems, it becomes extremely difficult, if not impossible, to know in advance exactly what data elements will need to be stored or the relationship between elements. Without this advance knowledge, the database schema can't be designed to maximize performance.
If the situation is more dynamic, however, the decision may be different. Consider the need to provide dynamic analytics on a multitude of transaction types, across a wide span of time, and with various systems. In this instance the required flexibility will almost certainly demand the data be stored in native XML.
Knowing that handling XML demands more processing capacity, one must design appropriate performance and scalability into the system. Knowing the influencing XML factors, such as the cost of numeric calculations or growth in size of the transaction, allows the system to be architected so it can grow as needed.
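The two storage choices can be contrasted in a small sketch using an in-memory SQLite database. The table names, column names, and the sample P.O. are all hypothetical, and a real system would need a far richer schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical transaction; element names are illustrative only.
PO_XML = (
    "<PURCHASEORDER><BUYER>Buyer 1</BUYER><SUPPLIER>Supplier A</SUPPLIER>"
    "<ITEM>widget</ITEM><QUANTITY>100</QUANTITY></PURCHASEORDER>"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE po_resolved (buyer TEXT, supplier TEXT, item TEXT, quantity INTEGER)"
)
conn.execute("CREATE TABLE po_native (doc TEXT)")


def store_resolved(conn, xml_text):
    """Resolve the transaction into typed columns once, on arrival."""
    root = ET.fromstring(xml_text)
    conn.execute(
        "INSERT INTO po_resolved VALUES (?, ?, ?, ?)",
        (
            root.findtext("BUYER"),
            root.findtext("SUPPLIER"),
            root.findtext("ITEM"),
            int(root.findtext("QUANTITY")),
        ),
    )


def store_native(conn, xml_text):
    """Keep the transaction as native XML; no schema knowledge is needed."""
    conn.execute("INSERT INTO po_native VALUES (?)", (xml_text,))


store_resolved(conn, PO_XML)
store_native(conn, PO_XML)

# Resolved: the database works directly on an already-typed column.
total_resolved = conn.execute("SELECT SUM(quantity) FROM po_resolved").fetchone()[0]

# Native: every query re-parses the document and converts text to a number.
total_native = sum(
    int(ET.fromstring(doc).findtext("QUANTITY"))
    for (doc,) in conn.execute("SELECT doc FROM po_native")
)
print(total_resolved, total_native)
```

The resolved table pays the parsing and conversion cost once but fixes the schema in advance. The native table accepts any transaction shape and defers that cost to every query.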
Parting Thoughts
To maximize the performance and scalability of an XML system, you must balance the requirements of the enterprise with the role of XML. If the requirements dictate an adaptable dynamic solution where the evolution of the solution is unclear, then maintaining the transactions within the system as native XML will pay dividends that far outweigh the performance impact of doing so. If it's clear that the enterprise solution can be bound to a single data model that will evolve slowly and in a predictable way, then performance and scalability can be allowed to dictate the system implementation, and the role of XML becomes that of an integration layer that maps nonconforming transactions into the relational data model.
Given the turbulence and infancy of today's B2B landscape, I recommend focusing on the opportunity XML offers while fully understanding the impact of that decision from the performance and scalability perspective. As the adoption of XML grows and as XML tools and applications become more prevalent, the performance and scalability discussion will focus on the specific implementation details.
Part 2 will focus on storage and retrieval issues associated with using XML. I'll discuss the scalability, performance, and context implications associated with storing XML in its native format versus resolving it to a database. If you'd like me to discuss some particular aspect of this topic, e-mail me at the address below.