|
YOUR FEEDBACK
Did you read today's front page stories & breaking news?
SYS-CON.TV |
TODAY'S TOP SOA & WEBSERVICES LINKS Feature Java Feature — JDBC 4.0
XML, performance, and more
Nov. 29, 2006 11:00 PM
It's been over three years since the JDBC Expert Group held its first meeting to gather requirements, requests, and pipe dreams for the JDBC 4.0 specification. In that meeting, we discussed a wide variety of topics, including performance enhancements, clarifications on the existing JDBC 3.0 specification, and Ease of Development features. Unbelievably, everything but the kitchen sink ended up making it into the release. In this article, we'll look at several key features that made the enhancement list for JDBC 4.0, and we'll discuss why those features are important.
XML Support Today, applications must use either JDBC driver extensions or the Clob and Blob interfaces to transfer XML data to or from the database. Using JDBC driver extensions makes it hard to write a portable database application. Using the Clob and Blob interfaces limits the application to working with string representations of XML data and, in many cases, requires vendor-specific extensions to the SQL grammar to tell the database whether to return the data as a character or binary representation of the XML string data. Now, the new JDBC data type, SQLXML, is part of the JDBC 4.0 specification. Applications can use the getTypeInfo() method to determine if their database supports a native XML data type. For example, using getTypeInfo() against a Microsoft SQL Server 2000 instance does not return a result row corresponding to the SQLXML data type, indicating that there is no native XML data type available for that particular database. In contrast, using getTypeInfo() against a Microsoft SQL Server 2005 instance returns a result row, indicating that an XML data type is available. Additionally, it returns information indicating that the native type name is XML. From this information, applications can create tables that contain columns of the XML data type. To allow applications to populate data into XML columns and retrieve data from those columns, JDBC has been expanded to include native Java bindings for XML. In the initial public draft, the JDBC Expert Group defined bindings for Java strings and StAX. The thinking at the time was that this subset of XML representations could be used to easily construct other XML representations. However, feedback from the community made it clear that there are many applications today that rely on DOM and SAX and that any JDBC XML solution must provide support for those and other XML representations. The expert group went back and totally reworked the SQLXML interface definition. The definition of the SQLXML interface in the proposed final draft of the JDBC 4.0 specification now includes support for generating and retrieving a character or binary representation of XML data as a Java String, a character stream, or a binary stream. More importantly, the SQLXML interface includes methods for working with the Source and Result interfaces defined in the javax.xml.transform package. These methods provide flexibility for supporting any XML representation for which there is a javax.xml.transform Source or Result implementation. Additionally, supporting the Source and Result interfaces allows a JDBC driver to easily become an end point of XSLT transforms or XPath evaluations. At a minimum, the JDBC 4.0 specification requires JDBC drivers to support the Source and Result interfaces shown in Table 1. To create a Java construct that can be used to process XML data, an application can create a SQLXML object using the Connection.createSQLXML() method. The object that is created does not contain any data initially. Data is added to the object using one of the following methods:
Similarly, applications can retrieve XML data from a SQLXML object using the getString( ), getCharacterSteam( ), getBinaryStream( ), and getSource( ) methods. Listing 3 illustrates how an application can query a column of the SQLXML data type, create a SQLXML Java binding using the getSQLXML() method on the result set, and retrieve a StAXSource object that is used to process the XML data. SQL 2003 also includes extensions to the SELECT syntax that allow you to construct XML results from tabular columns. Listing 4 shows a simple example of how to create a SELECT statement that produces a result set containing two columns: a CustId column of type integer and a CustInfo column of type SQLXML. The SELECT statement uses the new SQL/XML extension XMLELEMENT to process multiple base columns into a single XML result column. JDBC 4.0 also has been expanded to support using database metadata methods to determine which SQL/XML constructs are supported on the connection. Applications can execute any supported SELECT statement with SQL/XML extensions to produce SQLXML result columns that can be processed using the new XML Java bindings.
Connection and Statement Pooling Enhancements The architecture of many Java application servers or Web servers dictates that database interaction occurs as a result of an incoming message, a user clicking a button in a Web browser, or through some other real-time event. In each of these occurrences, a database connection is usually established, one or more SQL statements are executed, results are processed, and the connection is closed. In this type of architecture, the response time for the application is limited by the response time of the connection attempt. That is, connecting to a database is one of the most performance-expensive operations that a JDBC application can do. A connection involves multiple network round-trips between a JDBC driver and a database server to perform the following actions:
Connection pooling works great until there is a problem you need to investigate. When the response time of your database queries takes minutes instead of milliseconds, your application server suddenly starts to run out of memory or CPU cycles and your database appears to be "hung." Another possibility is that when you try to monitor the status of your applications, you find that "some JDBC connection" is using all the CPU and that the facilities available to help you find the culprit are not very good. Once a JDBC connection is established, the tracking mechanism between the physical connection and the application's use of the logical connection is lost. The connection pool manager assigns physical connections in the pool to any application that meets authentication requirements; the pool manager does not keep any statistics on the application requesting a connection, and the connection itself is a black box to the application. In other words, if you are using a monitoring tool and see that a JDBC connection is "bogging down the system," it's not possible to track down which JDBC application is actually invoking the driver. As we stated earlier, connection pooling is usually provided by the application/Web server, so a connection request from an application is not sent to a driver, but is instead sent to the pooling component inside the server. For a JDBC driver to associate an application to a connection, it must be involved in the connection establishment process, which does not happen when using JDBC 3.0-compliant connection pool managers. JDBC 4.0 has added the setClientInfo() and getClientInfo() methods to the connection interface to solve many of the problems mentioned above. After connecting, an application can call setClientInfo() to associate client-specific information to the JDBC connection object, such as application name, site name, and department name for the JDBC connection. The setClientInfo() request is executed in the JDBC driver and not in the connection pool manager. The JDBC driver then passes along this information to the database server. In this way, monitoring tools can retrieve client information for specific database connections from either the JDBC driver or from the database server to help pinpoint where problems are occurring. Another problem we often see in today's popular JDBC connection pool implementations is that there is no good way for a connection pool manager to determine if a database connection has become unusable. Typically, if a pool manager detects that any single connection in the pool has become invalid, the pool manager terminates all connections in the pool regardless of whether they're usable. After the pool is flushed, the pool manager re-initiates the pool with new connections. Flushing the pool is a drastic process that results in the potential loss of business logic, poor performance, and, typically, irate users. Some pool managers erroneously use the Connection.isClosed() method to check the state of a database connection, but the intention of isClosed() is to check if a connection is open or closed, not to determine if the connection is still usable. A new method has been added, Connection.isValid(), to allow pool managers to specifically request from the driver if a connection is still usable. If a connection is invalid, the pool manager can discard only the marred connection rather than the contents of the entire pool. In addition to the connection pooling mechanism previously discussed, JDBC 3.0 introduced a statement pooling mechanism to cache prepared statements. The statement pool manager, which can be part of the JDBC driver or part of the application/Web server, keeps prepared SQL queries in a cache that can be reused by applications. In the same way that a connection pool manager keeps performance-expensive database connections in a pool for "loan," the statement pool manager keeps SQL prepared queries in a cache that can be loaned out when an application attempts to use a SQL query that matches one in the pool. JDBC statement pooling provides performance gains for JDBC applications that execute the same SQL statements multiple times in the life of the application. Most applications have a certain set of SQL statements that are executed multiple times and a few SQL statements that are executed only once or twice during the life of the application. Unfortunately, existing JDBC statement pooling implementations give no weight to a SQL statement that's executed 100 times versus one that's executed only twice. Either a statement goes into the pool, potentially displacing another statement from the pool, or there is no pool. JDBC 4.0 provides a more granular level of statement pooling by allowing applications to provide directives to the pool manager about whether a SQL statement should be pooled. The PreparedStatement interface has been expanded by the addition of two new methods: isPoolable() and setPoolable(). The isPoolable() method returns a Boolean flag that denotes whether the SQL statement identified on the PreparedStatement object should be pooled (by default, a statement is poolable when it's created). Applications specifically can request that a statement not be pooled by calling setPoolable(false). Using these constructs, application architects gain more control over the performance aspects of their JDBC applications. Queries that are reused are pooled and provide optimal performance, and queries that are used infrequently do not affect the pool.
National Character Set Support It turns out that most JDBC drivers do need to know when sending character data to a database if the type the database server is expecting is a Unicode type (or National Character). It also turns out that most JDBC drivers can't figure out what the database is expecting without expensive network round-trips to the database server. JDBC provides only one type of binding using setString(), setCharacterStream(), and setClob(). If a SQL parameter corresponds to an NCHAR type, the application binds the parameter using setString(). Similarly, if the parameter corresponds to a CHAR type, the application uses setString() also. To compensate for this deficiency, JDBC drivers typically adopted one of the following three strategies:
Other New Features One of the innovations removed late in the specification process were the Ease of Development features, including annotation support. Look for annotation support soon after the release of Java SE 6.0. Take a good look at the JDBC 4.0 specification at www.jcp.org/en/jsr/detail?id=221 for details about the features we've mentioned in this article as well as all the features we didn't have space to include. As you start to use JDBC 4.0, remember that some of the new features are targeted for JDBC pooling components usually available in an application server or Web server while others are targeted for JDBC drivers. Check out the components you're using to make sure that they support the parts of the specification you're interested in and start reaping the benefits of JDBC 4.0. YOUR FEEDBACK
XML JOURNAL LATEST STORIES . . .
SUBSCRIBE TO THE WORLD'S MOST POWERFUL NEWSLETTERS SUBSCRIBE TO OUR RSS FEEDS & GET YOUR SYS-CON NEWS LIVE!
|
SYS-CON FEATURED WHITEPAPERS MOST READ THIS WEEK BREAKING XML NEWS |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||