Complete Data Integration Through XQuery
Vastly simplifying SOA implementations
By: Jonathan Robie
Dec. 14, 2006 10:00 AM
Most businesses have an urgent need for up-to-date, accurate information based on data from multiple data sources. It would be much easier if all your data were stored in one database so it can be queried as a whole, but this is rarely practical. In the real world, data integration is required. You need a simple, efficient way to query data found in various data sources.
For instance, you may want to generate one report with the overall status of a customer, or you may want to find all customers with outstanding tech support issues who are deciding whether to make a major purchase this quarter. If all of your data were in a single database, you would retrieve the information with a simple query. Because the data is in many different sources, you have to write a good deal of code to get the same result, and the code is quite different for each data source. This is time-consuming and error-prone, and it complicates security and auditing.
With XQuery, you query each data source as though it were XML, no matter how the underlying data is physically stored. XQuery is the World Wide Web Consortium (W3C) standard XML query language, designed for both XML processing and data integration. Using XQuery for data integration vastly simplifies SOA implementations, making your developers more productive and improving the performance of your systems. An XML Integrated Development Environment (IDE) that supports XQuery makes it much easier for you to visualize data sources, generate and test queries, and debug. The queries you develop can be exposed via a data access layer, which is accessed using SOAP or HTTP, so that they can be reused in different SOAP message formats or in other applications.
XQuery Simplifies Data Integration
One frequently used expression in XQuery, the FLWOR expression, is similar to SQL's SELECT-FROM-WHERE. Because XML structures are more complex than SQL tables, XQuery provides path expressions that can identify any item in an XML structure. To create structures in query results, it also provides constructors, using a syntax that looks like the XML to be constructed. A typical XQuery might use path expressions to locate data, FLWOR expressions to perform joins and combine data, and constructors to create the structures of the query result.
These tasks are much more tedious with conventional programming languages. For instance, achieving the same result with the Java DOM API requires parsing, navigating object structures, casting values from XML into Java data types, creating a result tree structure, and appending nodes to that result tree. In general, conventional programming languages require seven to 20 times more code than an equivalent XQuery. Not only are XML applications harder to write in conventional programming languages, but performance can also be much better in a good XQuery implementation, because XQuery is a declarative language that allows the implementation to do many useful kinds of query optimization.
The second way XQuery simplifies data integration is by eliminating the need to work with different APIs and data models for each data source. The XQuery language is defined in terms of XML structures, but since almost any data can be mapped into XML structures, an XQuery implementation can use XQuery to query just about anything. For instance, an XQuery implementation can provide support for relational data, implementing queries by generating efficient SQL for the database, but allowing a user to query the data as though it were XML.
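To make the Java DOM steps listed above concrete (parsing, navigating object structures, casting values, creating a result tree, appending nodes), here is a minimal sketch using only the JDK's built-in DOM API. The class name, the sample `orders` document, and the shape of the `report` element are all hypothetical, chosen only for illustration; they do not come from any particular system.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomReportSketch {

    // Hypothetical sample input; element and attribute names are illustrative only.
    static final String ORDERS_XML =
        "<orders>"
      + "<order customer=\"C1\"><total>120.50</total></order>"
      + "<order customer=\"C2\"><total>80.00</total></order>"
      + "<order customer=\"C1\"><total>30.25</total></order>"
      + "</orders>";

    /** Builds a report element holding the order count and order sum for one customer. */
    static Document buildReport(String xml, String customerId) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // Step 1: parse the source XML into an object tree.
        Document source = builder.parse(new InputSource(new StringReader(xml)));

        // Step 2: navigate the object structure by hand.
        NodeList orders = source.getElementsByTagName("order");
        int count = 0;
        double sum = 0.0;
        for (int i = 0; i < orders.getLength(); i++) {
            Element order = (Element) orders.item(i);
            if (!customerId.equals(order.getAttribute("customer"))) {
                continue;
            }
            Element total = (Element) order.getElementsByTagName("total").item(0);
            // Step 3: cast XML text into a Java data type.
            sum += Double.parseDouble(total.getTextContent());
            count++;
        }

        // Step 4: create the result tree; step 5: append nodes to it.
        Document result = builder.newDocument();
        Element report = result.createElement("report");
        report.setAttribute("customer", customerId);
        result.appendChild(report);

        Element orderCount = result.createElement("orderCount");
        orderCount.setTextContent(Integer.toString(count));
        report.appendChild(orderCount);

        Element sumElement = result.createElement("sum");
        sumElement.setTextContent(Double.toString(sum));
        report.appendChild(sumElement);
        return result;
    }

    public static void main(String[] args) throws Exception {
        Element report = buildReport(ORDERS_XML, "C1").getDocumentElement();
        System.out.println("orders for " + report.getAttribute("customer") + ": "
            + report.getElementsByTagName("orderCount").item(0).getTextContent());
    }
}
```

In XQuery, the same report would typically be a single expression: a path expression with a predicate on the customer, two aggregate function calls, and an element constructor for the result.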
By treating all data sources as XML, this kind of XQuery implementation lets a developer query relational data, Web Service calls, and other data sources together, with a small amount of declarative code, in one uniform data model, without mastering the idiosyncrasies of each system. Consider the customer example in the introduction. With an XQuery implementation that supports all of the underlying data sources, a developer can write a simple query to do a join among the different systems that represent different aspects of a customer. This dramatically simplifies software development in most business environments. The developer focuses on the information that's needed, not on the representation used in each system. Typically, the code savings in data integration environments are even greater than in pure XML environments.
The available data sources and the implementation strategy vary widely among XQuery implementations. For relational data, an implementation may translate an XQuery into SQL, then translate the SQL result sets to XML when returning results to the query engine. For flat file formats, an implementation can provide XML converters that convert data to XML on the fly when it's queried. Web Service calls may be supported using functions that can be called from within a query.
When choosing an XQuery implementation, make sure that it fits in your computing environment and can handle the data sources needed in your architecture. The XQuery implementations from most database vendors are designed to query only data stored in their database; most companies have more than one database, as well as data not found in any database. The XQuery implementations from application server vendors or XML integration server vendors can query a wider range of data sources, but require the adoption of their server, which may not fit in your architecture, or may increase the footprint of the system.
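As a rough illustration of the relational case described above, where SQL result sets are translated to XML before the query engine sees them, the sketch below maps tabular rows onto an XML tree. It is not how any particular XQuery product does this. The class name, the `row` element, and the sample rows are assumptions made for the example, and plain Java maps stand in for a JDBC result set so that the code runs without a database.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class RowsToXmlSketch {

    /** Maps each row to a row element with one child element per column. */
    static Document toXml(String tableName, List<Map<String, Object>> rows) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element table = doc.createElement(tableName);
        doc.appendChild(table);
        for (Map<String, Object> row : rows) {
            Element rowElement = doc.createElement("row");
            table.appendChild(rowElement);
            for (Map.Entry<String, Object> column : row.entrySet()) {
                // Column names become element names; values become text content.
                Element columnElement = doc.createElement(column.getKey());
                columnElement.setTextContent(String.valueOf(column.getValue()));
                rowElement.appendChild(columnElement);
            }
        }
        return doc;
    }

    /** Hypothetical rows standing in for the result of a SQL query. */
    static List<Map<String, Object>> sampleRows() {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("id", 1);
        first.put("name", "Acme");
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("id", 2);
        second.put("name", "Globex");
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) throws Exception {
        Document doc = toXml("customers", sampleRows());
        System.out.println(doc.getElementsByTagName("row").getLength() + " rows mapped");
    }
}
```

Once data has this shape, the same path expressions work on it regardless of where the rows came from, which is the uniformity the article describes. A real implementation would usually avoid materializing the XML and push as much work as possible into the generated SQL.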
If you're writing Web Services in a Java environment, make sure your implementation supports the XQuery API for Java (XQJ), which is the standard Java interface for XQuery - it lets your servlets use XQuery the same way that JDBC lets servlets use SQL. Also, the performance of XQuery implementations varies dramatically - make sure that you test performance for the data you work with, especially if you're using XQuery for relational data or very large XML files. Because XQuery is declarative and can be optimized, a good implementation will provide performance better than you normally achieve with hand-coded Java, JDBC, SQL, and an XML API. Using XQuery vastly simplifies data integration, offering loosely coupled access, and providing one way to query any data source supported by the query engine. And because an XQuery implementation can talk directly to the original data source, it can do optimizations that are no longer available once the data is extracted and converted to physical XML. As a result, what is easier for the developer also results in better performance.
XML Development Environments for Data Integration
When choosing an IDE for data integration with XQuery, consider related functionality that you may need. For instance, some IDEs also provide support for developing XML pipelines and publishing. Several IDEs can generate XQJ code to run an XQuery as part of a program. One XQuery IDE is implemented as an Eclipse plug-in, which is very convenient for Java developers who use Eclipse. Several IDEs also provide good support for writing and testing XSLT stylesheets, W3C XML Schemas and DTDs, and related XML development.
The Data Access Layer
In most companies, several data consumers need to access the same information. For instance, if one of your Web Services needs a description of a customer, this same description might also be useful for other Web Services, and also for dynamic Web sites, AJAX clients, publishing applications, or any other application that needs customer data. Frequently companies design for a single project, coding very similar interfaces for each data consumer, an obvious waste of programming effort. And if the data sources change, each of these interfaces has to be rewritten. In environments where security and auditability are important, much more code must be audited.
A data access layer lets many data consumers access data using the same well-defined interface. For each request, the data access layer calls a data service. Data services should represent the business model, hiding underlying systems and the data integration task from data consumers. For instance, you might write a data service that provides the data for a single customer. A data service can be parameterized - a parameter might identify the customer ID or the name of a particular view of the customer. Many data services do nothing more than query data from one or more data sources to produce XML. These data services can be written directly in XQuery, using external variables to allow queries to be parameterized.
In other data services, an XQuery may be part of a Java program that performs business logic or interacts with other systems, or it may be part of an XML pipeline. A small focused team can be responsible for writing the queries to implement data services, and for documenting available services, allowing data consumers to access these services using standard Web and XML interfaces.
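The idea of a parameterized data service can be sketched in Java as follows. Because XQJ is not part of the JDK, this sketch substitutes the JDK's built-in XPath API for an XQuery engine: the XPath variable `$id` plays the role that an XQuery external variable would play. The class name, the method name, and the sample customer document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class CustomerDataService {

    // Hypothetical stand-in for an integrated XML view over several data sources.
    static final String CUSTOMERS_XML =
        "<customers>"
      + "<customer id=\"C1\"><name>Acme</name></customer>"
      + "<customer id=\"C2\"><name>Globex</name></customer>"
      + "</customers>";

    /**
     * Data service for a single customer. The caller supplies only the
     * customer ID; the query text and the data source stay hidden.
     * Returns an empty string when no customer matches.
     */
    static String customerName(String customerId) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind $id at call time, much as a caller would bind an external variable.
        xpath.setXPathVariableResolver(
            variable -> "id".equals(variable.getLocalPart()) ? customerId : null);
        return xpath.evaluate(
            "/customers/customer[@id = $id]/name",
            new InputSource(new StringReader(CUSTOMERS_XML)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(customerName("C2"));
    }
}
```

Keeping the query behind one method means a Web Service, a dynamic Web site, and a publishing application can all call `customerName` without knowing where customer data lives, and a change to the data sources touches only this one service.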
Summary
Because businesses need up-to-date information that comes from a variety of data sources, but the proper tools and development methods have lagged behind, today's software systems are often needlessly complex and ad hoc. Modern data integration tools are the solution. Using XQuery, an XML IDE, and a data access layer simplifies development significantly, improves performance, increases code reusability, and makes systems more maintainable.