Open Source Database Special Feature: An Introduction to Berkeley DB XML
Basic concepts, the shell commands, and beyond

Eager vs Lazy Evaluation
You may have noticed that after each query, BDB XML prints out the number of entries and the query evaluation method:

436 objects returned for eager expression

Eager is the default query evaluation method. Evaluating a query eagerly means that BDB XML computes every result while the query executes and stores the results in a data structure, so they are all available immediately after execution. This is not the case when a query is evaluated lazily. In lazy evaluation, the database does not keep the results in a data structure. It knows how to get them (using pointers), but it does no work until the results are actually retrieved.

In both cases the results are exposed as a result set. To get all of the results we have to iterate through the result set, using the next operator. This is what happens internally when we use the "print" command in the dbxml shell: it iterates through the entire set and gets every element of the result set. Thus, when a query is evaluated eagerly, the result set is filled immediately after the query executes; when it is evaluated lazily, the result set is either empty or holds some of the results, but definitely not all of them.
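
The difference can be pictured outside of BDB XML. The following is a minimal sketch in plain Python, not the BDB XML API: the entry list, the predicate, and the function names are all made up for illustration. It records how many entries have been examined at each point, which is the practical difference between the two evaluation methods.

```python
from typing import Iterator, List

# Hypothetical stand-in for stored entries; not real BDB XML data.
ENTRIES = [f"entry-{i}" for i in range(10)]

visited: List[str] = []  # records which entries were actually examined


def matches(entry: str) -> bool:
    """Made-up predicate that logs every entry it inspects."""
    visited.append(entry)
    return entry.endswith(("2", "5", "8"))


def query_eager() -> List[str]:
    # Eager: every match is collected before the call returns.
    return [e for e in ENTRIES if matches(e)]


def query_lazy() -> Iterator[str]:
    # Lazy: nothing is examined until the caller asks for a result.
    return (e for e in ENTRIES if matches(e))


eager_results = query_eager()
eager_work = len(visited)          # every entry has been examined

visited.clear()
lazy_results = query_lazy()
lazy_work_before = len(visited)    # no entry has been examined yet
first = next(lazy_results)         # examines entries up to the first match
lazy_work_after = len(visited)
```

In this sketch the eager query has touched all ten entries by the time it returns, while the lazy query has touched none until next is called.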

It may sound as though lazy query evaluation is never useful, but this is not the case. If you do not need all of the objects returned by the query, using lazy evaluation makes more sense. You can see this with the following query:

dbxml> setLazy on
Lazy evaluation on

dbxml> query 'collection("xbench.dbxml")/dictionary/e[contains(. , "the hockey")]/hwg/hw'

Query - Starting query execution
Lazy expression 'collection("xbench.dbxml")/dictionary/e[contains(. , "the hockey")]/hwg/hw' completed

Note that the execution time for this query is negligible (the database prints no execution time information). That's because the actual results haven't been retrieved yet. BDB XML retrieves the results only when the "print" command is issued. We know that this query returns 436 objects. Instead of getting all of the results, let's get only the first eight. We can do this with the "print n 8" command.
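
Taking only the first few results from a lazy result set can be sketched in plain Python as well. This is an analogy, not BDB XML code: the generator, its placeholder output, and the fetch log are hypothetical, and only the count of 436 comes from the session above.

```python
from itertools import islice
from typing import Iterator, List

TOTAL_MATCHES = 436  # result count reported for this query in the article

fetch_log: List[int] = []  # one entry per result actually produced


def lazy_results() -> Iterator[str]:
    """Hypothetical lazy result set: each value is produced on demand."""
    for i in range(TOTAL_MATCHES):
        fetch_log.append(i)
        yield f"<hw>headword {i}</hw>"  # placeholder, not real query output


results = lazy_results()
first_eight = list(islice(results, 8))  # comparable in spirit to "print n 8"
```

Only eight of the 436 results are ever produced here; the remaining ones cost nothing unless the caller keeps iterating.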

XML Schema Validation
One of the cool new features of BDB XML is its ability to validate XML documents against an XML Schema. First we need to create a container with XML Schema validation enabled. Listing 8 shows the XML sample (10MB XML sample with XML Schema, see the first entry in the References section) that I am going to put into this container.

This document is assigned an XML Schema. The part that shows this assignment is:

<dictionary xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://www.cs.umb.edu/~smimarog/xmlsample/TCSD1.xsd">

This schema is located at www.cs.umb.edu/~smimarog/xmlsample/TCSD1.xsd.

XML Schemas and XML documents can be located on the same machine or on different machines. In this example, the XML Schema and the XML data are located on two different machines.
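
To check which schema, if any, a document points to before loading it, the root element's attribute can be read with a few lines of Python. This is a sketch using only the standard library; the helper name schema_location is hypothetical, and the sample string is the root element shown above, self-closed so that it parses on its own.

```python
import xml.etree.ElementTree as ET
from typing import Optional

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Root element copied from the article, self-closed so it is well-formed.
SAMPLE = (
    '<dictionary xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation='
    '"http://www.cs.umb.edu/~smimarog/xmlsample/TCSD1.xsd"/>'
)


def schema_location(xml_text: str) -> Optional[str]:
    """Return the schema URL the root element points to, or None."""
    root = ET.fromstring(xml_text)
    # ElementTree addresses namespaced attributes as "{namespace}name".
    return root.get(f"{{{XSI_NS}}}noNamespaceSchemaLocation")


location = schema_location(SAMPLE)
no_schema = schema_location("<dictionary/>")
```

The function only reads the attribute; it does not fetch the schema or validate anything.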

dbxml> createContainer validate_xbench.dbxml d validate
Creating document storage container, with validation

dbxml> openContainer validate_xbench.dbxml

dbxml> putDocument dict_10_valid C:\dictionary10_schema.xml f
Document added, name = dict_10_valid

A natural question is whether it's possible to add an XML document to this container without validating it. It is: a document that is not assigned a schema is added without being validated. Validation in Berkeley DB XML is fast; I have found that validating a document in BDB XML takes much less time than in some commercial XML editors. However, validating every document may be costly when documents are huge. Besides, sometimes XML documents are not assigned to any schema. Listing 9 shows the XML sample (10MB XML sample in the References section) that I am going to put into this container.

Within this container we now have two documents, named dict_10_valid and dict_10; the first document is validated, but the second is not. In some cases it's desirable to restrict queries to a specific document in the collection. We can achieve this by using the "doc" function.

dbxml> query 'doc("validate_xbench.dbxml/dict_10")//hwg'
733 objects returned for eager expression
'doc("validate_xbench.dbxml/dict_10")//hwg'

Specifying doc("validate_xbench.dbxml/dict_10") restricts the query to the dict_10 document only.
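
The difference between querying a whole collection and querying one document can be mimicked with an in-memory stand-in. The sketch below is an analogy only: the dictionary named container, the tiny documents in it, and both helper functions are hypothetical, and it uses Python's ElementTree rather than BDB XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory stand-in for a container: document name -> parsed XML.
container = {
    "dict_10_valid": ET.fromstring(
        "<dictionary><e><hwg><hw>alpha</hw></hwg></e></dictionary>"
    ),
    "dict_10": ET.fromstring(
        "<dictionary><e><hwg><hw>beta</hw></hwg></e>"
        "<e><hwg><hw>gamma</hw></hwg></e></dictionary>"
    ),
}


def query_collection(path: str):
    """Like collection(...): evaluate the path against every document."""
    return [node for root in container.values() for node in root.findall(path)]


def query_doc(name: str, path: str):
    """Like doc(...): evaluate the path against one named document only."""
    return container[name].findall(path)


all_hwg = query_collection(".//hwg")          # matches from both documents
dict_10_hwg = query_doc("dict_10", ".//hwg")  # matches from dict_10 only
```

With these made-up documents, the collection query sees three hwg elements and the single-document query sees two.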

Indexing
Indexing XML documents is very important for good query performance. In fact, choosing indexes is arguably the most important tuning task left to the user. BDB XML has limited automatic indexing features, so indexing is best done manually by the programmer. In this section I will introduce you to the basics of XML indexing in BDB XML. Here is the format of an index:

[unique]-{path type}-{node type}-{key type}-{syntax type}

Apart from the optional uniqueness flag, an index in BDB XML is composed of four parts:

  • Path Types
  • Node Types
  • Key Types
  • Syntax Types
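
To make the format concrete, here is a small Python sketch that splits an index string into the parts listed above. It is not part of BDB XML: the IndexSpec class and parse_index_spec function are hypothetical, and the two sample strings are illustrative values assumed for this example rather than taken from the article. The sketch does not check whether each part is a value that BDB XML accepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexSpec:
    unique: bool
    path_type: str
    node_type: str
    key_type: str
    syntax_type: str


def parse_index_spec(spec: str) -> IndexSpec:
    """Split '[unique]-{path}-{node}-{key}-{syntax}' into named parts."""
    parts = spec.split("-")
    unique = parts[0] == "unique"   # the leading flag is optional
    if unique:
        parts = parts[1:]
    if len(parts) != 4:
        raise ValueError(f"expected four index parts, got {len(parts)}: {spec!r}")
    return IndexSpec(unique, *parts)


# Sample strings are assumptions chosen for illustration.
plain = parse_index_spec("node-element-equality-string")
uniq = parse_index_spec("unique-node-attribute-equality-string")
```

Keeping the uniqueness flag separate from the four positional parts mirrors the bracketed, optional position it has in the format string.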

Uniqueness
Uniqueness indicates that the value being indexed is unique in the XML document. For example, in an employee data set, the employee number will be unique, as will the social security number.

About Selim Mimaroglu
Selim Mimaroglu is a PhD candidate in computer science at the University of Massachusetts in Boston. He holds an MS in computer science from that school and has a BS in electrical engineering.

YOUR FEEDBACK
SYS-CON Belgium News Desk wrote: Open Source Database Special Feature: An Introduction to Berkeley DB XML. In this article I am going to introduce you to the latest version of the Berkeley DB XML, version 2.2.8. Berkeley DB XML (BDB XML) is built on top of the well-known Berkeley Database (BDB). BDB XML is an open source, native XML database. Like its ancestor, BDB, it's an embedded database. It provides APIs for the Java, C++, Perl, Python, PHP, and Tcl languages. It supports the popular XML query languages XQuery and XPath 2.0. I will show you how to use BDB XML in two ways. This month I will introduce the BDB XML shell, and next month we will explore using BDB XML with Java. BDB XML has a lot of features, and I will try to cover the most important ones.