By Sandeepan Banerjee
June 28, 2005 12:00 PM EDT | Reads: 9,561
Two somewhat contrary-sounding drivers fuel the emerging renaissance in enterprise data management - virtualization and convergence. Virtualization is a framework for dividing up the resources of an organization into multiple execution environments through the application of one or more technologies such as hardware clustering, software partitioning, application modularization, emulation, and so on. Convergence, on the other hand, tries to bring diverse information assets - databases, mail stores, documents - under unified management. The coming Information Grid unites these opposing drivers.
The drive behind virtualization is lower cost. Today's emerging grid computing environments not only virtualize IT resources such as storage, bandwidth, and CPU cycles (supporting the ad hoc provisioning, on-demand deployment, and decentralized management of those resources), but also allow looser coupling between applications and modules, which are no longer assumed to be monolithic clients and servers. Loosely coupled applications will run different modules on different nodes of a virtualized IT fabric, invoke functionality from remote Web services, exchange self-describing marked-up data, and orchestrate the behavior of diverse process modules. XML technologies underpin loosely coupled grid-computing applications. Within the data center, the first-generation service-oriented architectures (SOAs) based on XML Web services are already in development.
Convergence, on the other hand, seeks to bring together the management of all of your data assets. Today, less than 10 percent of the world's information is managed, and most of what is found to be valuable to manage - capture, store, index, search, analyze, share, and repurpose - falls into the category of structured data, the traditional rows and columns. Being able to manage the remaining data is what convergence is all about. Here again, XML technologies underpin the renaissance. In XML we finally have a data model that is capable of addressing highly structured data (rows and columns), textual unstructured data (documents), and anything semi-structured in between (messages, template-based business data documents, or metadata). Document-intensive industries are already benefiting from standardizing their document formats on XML. Content-creation vendors are XML-enabling their tools to make it easier to capture information in content repositories. Vendors are also XML-enabling business intelligence tools, application servers, enterprise portals, and other infrastructure products to make it easier to share and repurpose XML-based information.
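As a rough illustration of that range, the sketch below reads one small XML document three ways: as a row of typed values, as a loose bag of metadata, and as free text. The document, its element names, and its values are all invented for this example and do not come from any standard or product; only Python's standard library is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every element and attribute name here is an
# assumption made for illustration.
DOC = """
<defectReport id="D-1001">
  <product sku="WX-200">
    <quantity>3</quantity>
  </product>
  <metadata>
    <source>e-mail</source>
    <received>2005-06-28</received>
  </metadata>
  <description>The hinge <emphasis>cracked</emphasis> after two weeks of normal use.</description>
</defectReport>
"""

root = ET.fromstring(DOC)

# Structured: fixed fields addressed like the columns of a row.
row = {
    "id": root.get("id"),
    "sku": root.find("product").get("sku"),
    "quantity": int(root.findtext("product/quantity")),
}

# Semi-structured: metadata whose fields may differ from report to report,
# so they are collected without assuming a fixed set of names.
metadata = {child.tag: child.text for child in root.find("metadata")}

# Unstructured: free text with inline markup, flattened for text search.
text = "".join(root.find("description").itertext())

print(row)
print(metadata)
print(text)
```

The same parsed tree serves all three readings, which is the sense in which a single data model can cover rows and columns, documents, and what lies in between.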
The real driver behind convergence is better business intelligence across all assets. When unstructured information becomes a managed resource, it can be integrated into more day-to-day organizational processes, such as search and compliance, which are really types of business intelligence. Users can search across information that was previously stored in silos, such as file systems, document repositories, Web sites, and e-mail. Collaborative processes can be automated. Compliance policies - privacy, information life-cycle management, and audit - can be implemented uniformly across all organizational assets.
XML's applicability to both virtualization and convergence allows the industry to make progress on both fronts without the need for multiple disruptive paradigm shifts. Moving toward a new data-management architecture based on XML-backed information repositories distributed across XML/SOA fabrics will be a key future step for organizations. This architecture, which combines virtualization and convergence, can be called the Information Grid.
The Information Grid and Its Components
Grid computing can virtualize any IT resource, including infrastructure, applications, and information. In the Information Grid, resources span all of the data in the organization, as well as all of the metadata required to make that data meaningful. This data may be structured, semi-structured, or unstructured; stored in any location, such as databases, local file systems, or e-mail servers; and created by any application. The vision for the Information Grid builds on technologies such as semantics, distributed query, and distributed data management. The goal is to enable organizations to view all of their assets in a smooth continuum, from the Internet to the intranet, with uniform, semantically rich access.
Application Grid vs. Information Grid
Within an Application Grid, individual modules run on different parts of the infrastructure, with sharing of application state and control enabled via Web services. Each module, however, may still be tightly coupled to its data source - database, file system, or e-mail server - and intelligence about the data has to be compiled into the application module. An Information Grid, in contrast, is self-describing: the application modules can discover what sources exist, what data they possess, what the life cycle of that data is, and how that data should be interpreted. The Information Grid builds on the Infrastructure and Application Grids.
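One way to picture self-description is a catalog in which each source declares what it holds and how long that data lives, so that a module asks the catalog rather than carrying that knowledge in its own code. The sketch below is a minimal illustration under that assumption; the catalog vocabulary (`source`, `holds`, `retentionDays`), the source names, and the `discover` function are hypothetical and not part of any real grid product.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog of source descriptors; the vocabulary is invented
# for illustration only.
CATALOG = """
<sources>
  <source name="support-mail" kind="e-mail server">
    <holds>defect-report</holds>
    <retentionDays>365</retentionDays>
  </source>
  <source name="crm" kind="database">
    <holds>defect-report</holds>
    <holds>customer</holds>
    <retentionDays>2555</retentionDays>
  </source>
  <source name="press-feed" kind="file system">
    <holds>news-story</holds>
    <retentionDays>90</retentionDays>
  </source>
</sources>
"""

def discover(catalog_xml, data_kind):
    """Return a descriptor for every source that declares it holds data_kind."""
    found = []
    for source in ET.fromstring(catalog_xml).iter("source"):
        kinds = [holds.text for holds in source.findall("holds")]
        if data_kind in kinds:
            found.append({
                "name": source.get("name"),
                "kind": source.get("kind"),
                "retention_days": int(source.findtext("retentionDays")),
            })
    return found

# The module names the kind of data it wants, not the systems that store it.
defect_sources = discover(CATALOG, "defect-report")
for descriptor in defect_sources:
    print(descriptor)
```

In this arrangement, bringing a new source into scope means adding a catalog entry; the module that calls `discover` does not change.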
Let's say a manufacturing organization is interested in tracking product defects. The defect reports come into the organization in a variety of ways - customer e-mail, news stories, phone calls to support centers, and so on. At a pure application level, the organization could build e-mail-analysis, RSS-feed-search, or CRM defect-tracking modules to be dispatched across the grid, with each module hardwired to analyze exactly one kind of data. However, if new kinds of defect reports appear unpredictably (suddenly Internet blogs become a major source of defect information), then modules that are hard-coded to a particular kind of data prove fragile, and the Application Grid falls short. An Information Grid, where the defect reports describe their own meaning and modules interact with the reports to understand their semantics, is more flexible. The following are the components of the Information Grid.
Infrastructure Provisioning and Failover
What are the major components of the Information Grid? At the most basic level, any grid involves the virtualization of resources. Infrastructure Grid resources include hardware resources such as storage, processors, memory, and networks, as well as the software designed to manage this hardware, such as databases, storage management, system management, application servers, and operating systems. Provisioning infrastructure resources involves pooling the resources together and allocating them to the appropriate consumers based on policies. For example, one policy might be to load-balance processing power across a farm of Web servers depending on the amount of processing demanded by each, thus treating the overall processing resource as a single pool and allocating that resource through supply and demand. In addition to the cost savings that accrue from better overall CPU utilization, spreading computing capacity among many different computers, or storage capacity across multiple disk groups, removes single points of failure.
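The load-balancing policy described above can be sketched in a few lines. This is only one possible policy, chosen for illustration: pooled capacity is split in proportion to demand when demand exceeds supply. The `allocate` function, the server names, and the core counts are hypothetical.

```python
def allocate(pool_capacity, demand):
    """Split pooled capacity across consumers in proportion to their demand.

    If total demand fits within the pool, each consumer gets what it asked for.
    Otherwise every request is scaled down by the same factor.
    """
    total = sum(demand.values())
    if total <= pool_capacity:
        return {name: float(amount) for name, amount in demand.items()}
    return {name: amount * pool_capacity / total
            for name, amount in demand.items()}

# Hypothetical demand, measured in CPU cores, from three Web servers.
demand = {"web-1": 30, "web-2": 10, "web-3": 0}

# Four 8-core machines treated as a single pool of 32 cores.
shares = allocate(4 * 8, demand)

# One machine fails: the pool shrinks to 24 cores, and the same policy
# re-divides what is left instead of taking any one consumer offline.
shares_after_failure = allocate(3 * 8, demand)

print(shares)
print(shares_after_failure)
```

Because the policy sees only a pool size and a set of demands, losing a machine changes the numbers but not the logic, which is the sense in which pooling removes a single point of failure.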
Copyright © 2005 SYS-CON Media, Inc. — All Rights Reserved.
More Stories By Sandeepan Banerjee
Sandeepan Banerjee is director of product management in Oracle's Server Technologies division. He is responsible for SQL, XML, and Text Search infrastructure, and especially their convergence into one platform for all data. Sandeepan has worked with database technologies for over 15 years, and the majority of them have been with Oracle.