XSLT on Wall Street

Since its inception, XML has gained a strong foothold on Wall Street, but the use of XSLT for financial applications has been selective. This article reviews a real-world case study involving XSLT (and other "new" technology tools) that led to impressive business results.

Our technical solution included the development of a complete rules engine in XSLT, XML parsing in Java using DOM/SAX, a persistent store of XML data within a relational database (Oracle), a customizable report writer (Cognos), and a complete runtime environment in Unix. A two-person team completed the project work over a period of 12 weeks - an incredible feat, and one that would never have been possible with "traditional" programming paradigms. XML, XSLT, and Java technologies are worthy of your attention; read further to see how they helped us to deliver an innovative solution for one of our important clients on Wall Street.

Project Overview
A large Wall Street financial services firm developed a powerful quantitative model for valuing convertible bonds for use by its traders, risk managers, and accounting department. The convertible bond asset class is a hybrid of fixed-income and equity instruments and presents a challenge for data manipulation. Convertible bonds can have hundreds of fields of reference data (e.g., coupon, maturity, conversion price, conversion rate, currency, par amount), with many embedded time elements (e.g., call schedule, put schedule).

Although many vendors supply convertible bond reference data (e.g., Bloomberg, Reuters), our client was unwilling to accept any single vendor's data feed as the sole basis for its model inputs. The firm needed to normalize the structure of the convertible bond data elements and account for differences by each vendor in feed formats and in nomenclature for describing convertible bond reference data (i.e., field names). By transforming the data elements from each vendor into a standardized database structure, the firm created a consistent framework to select inputs for its valuation engine.

Prior to our solution, the client was trying to manage the large volume of data feeds through a manual comparison process - an onerous task given the quantity of data. The comparison logic for the data had to be heuristic and learn to suppress previously tagged discrepancies so that daily reporting wouldn't include repeat offenders.

Our project was intended to fulfill several key business objectives:

  • Process data feeds from multiple vendors.
  • Normalize the incoming data feeds to a common schema using a complex set of rules.
  • Enable users to transform source data using complex and changing business rules - without involving subsequent IT programming resources.
  • Store normalized data into a persistent data store.
  • Enable users to create reports that compared data from multiple sources.
  • Automate the manual data entry process.

We developed many shared classes and utilities, leveraging the solution to meet additional client requests to transform related data, generate feeds, and provide automated ways to populate their internal databases.

This article provides an overview of the project and describes the mechanics of that effort, including business and functional requirements, the detailed technical solution, and its rationale. I've included some sample code to highlight key methods we employed to meet these critical business needs for our client.

Background on Convertible Bonds
Bonds are fixed-income securities that represent the debt of domestic and international governments, corporations, banks, institutions, or municipalities. When purchasing a fixed-income security, an investor is lending money to the issuer for a specified period of time. In return, the investor (i.e., lender) receives regular interest payments (i.e., the coupon) and later the bond face value on the maturity date. The fixed-income market offers a wide range of asset classes with varying degrees of risk. The credit ratings from independent rating groups like Moody's Investors Service and Standard & Poor's are intended to rank the relative credit worthiness of these securities.

Convertible bonds are hybrid instruments that offer a fixed-income component (i.e., a coupon) and an option to convert the bond into the issuer's underlying equity at a predetermined conversion price and ratio. Convertible bonds are influenced by interest rates, credit risk, underlying equity price, market volatility (due to the embedded equity option), and other market factors. Sometimes even New York City subway noise can affect convertible bond valuation! The convertible bond asset class is an exemplary illustration of Wall Street innovation, particularly as bonds are currently issued with many interesting features. Hedge funds and other investors have played an important role in influencing the valuation and hedging of convertible bonds.

Solution Overview
Briefly, our solution consisted of the following modules:

  • Job scheduling framework in Unix to invoke the XSLT rules engine upon delivery of vendor feeds
  • Utility classes to transform non-XML feeds to XML format
  • XSLT business rules definitions for transforming incoming data feeds into normalized schema
  • Java programs to parse data feeds and invoke XSLT business rules
  • Java programs to store transformed data feeds into Oracle
  • Oracle database to store normalized convertible bond data elements for each vendor
  • Cognos reporting framework for creating customized reports using the Oracle database
  • Utility classes used as extensions in XSLT transformations
System Design Considerations
The functional and reporting framework requirements drove the main system design considerations.

Functional Requirements

  • Rules engine should be extensible, accommodating input file changes automatically and consolidating data from multiple files per vendor.
  • Rules engine should be independent of nomenclature used by the feed provider.
  • Business rules grammar should be easy for the end user to learn, develop, test, and deploy into the production environment.
  • Persistent data store should enable extraction of data in XML or other formats.

Reporting Requirements

  • Define reports to compare convertible bond data from different sources.
  • Define reports to identify the history of changes in data fields from a single-source vendor.
  • Create a report to identify new or recently terminated/expired convertible bond issues.
  • Allow downloading of report data into various formats (Excel, XML, and text files).

Why XML? Given the flexibility expected from the rules engine, XML was the obvious choice. XML doesn't carry the baggage of fixed-format files. Because it handles only XML, the rules engine is ignorant of both source feed formats and the nomenclature used to define indicative data. (Also, we needed an excuse to get published in XML-Journal!)

Why XSLT? One of the core requirements of the rules engine was to allow the end user to maintain it. This required a grammar that was easy to learn yet flexible enough to handle data spread across numerous files with different nomenclature for attributes. XSLT was the hands-down choice; it's a declarative language (like SQL) and provides mechanisms for handling XML data spanning multiple files. XSLT processors also allow rules to call Java functions through extension mechanisms (a very powerful feature for handling more complex rules).

Why Java? We chose Java as the programming language for its platform independence and the availability of free XML parsers.

Why Oracle? We chose the Oracle database for its rich XML/XSLT parser API (see Figure 1) and its database-level tools for extracting XML data. The XSLT engine from Oracle has the flexibility and support for using extended Java functions in XSLT rules. Oracle's XML-SQL utility API provides easy ways to store and extract XML documents from a relational database structure. The "oraxsl" tool is especially useful for quick testing of business rules.

Why Cognos Impromptu? For the reporting tool, we selected Cognos Impromptu, which provides an interactive reporting framework with numerous built-in functions and the flexibility to save reports in multiple formats (e.g., CSV, HTML, PDF, XLS). The Cognos reporting tool also contains a job scheduler and allows a report to be saved as an SQL query, which can be used to extract the contents of the report as well-formed XML data.

Rules Engine Architecture
Figure 2 provides a high-level architecture and data flow diagram of the rules engine. The rules engine consists of (1) numerous Java modules to parse XML files using a combination of SAX/DOM parsers, (2) an XSLT transformation engine to apply business rules in XSLT to transform feeds into the normalized XML schema, and (3) an Oracle XML SQL utility to move transformed data into a persistent data store.

The rules engine provides the option of storing the output to a database, a file, or both. The engine's ability to store the output to a file is exploited in handling large feed files (explained later). In addition, the rules engine can be configured to process a subset of convertible bond instruments rather than all the available instruments on the feed file (an important feature in using memory-constrained DOM parsing).

Once the rules engine stored the normalized data into a relational structure, the business user analyzed the quality of the data and selected the "best" data source for use in the financial models. The XML generator utility provided additional tools for creating XML output from saved SQL queries and/or text files.

Key System Modules
Converting Text Files to XML
We developed a utility class to convert the data contained in delimited text files to XML format. The premise of this utility was simple: the first line in the text file contained the field names to be used as attribute names, while all the other lines contained the values for those fields. The utility built each record in the text file as a row element, with all data from that record as attributes of the row element. The names of the root node and the row elements of the XML file were parameters within a configuration file. The text file delimiter was also defined as a configuration file parameter, which meant that a file with any delimiter could be processed. We also created two helper classes: one that abstracted out the parser and the other that had utility functions for creating/manipulating XML messages.
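
The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual utility: the class name `DelimitedToXml`, the method signatures, and the sample field names are hypothetical, the configuration file is replaced by plain method parameters, and the field names in the header line are assumed to be valid XML names.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch: turns delimited text into row elements whose data are attributes. */
class DelimitedToXml {

    /** Escapes characters that may not appear raw in a double-quoted attribute value. */
    static String escape(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;");
    }

    /**
     * @param text      file content; the first line holds the field names
     * @param delimiter literal delimiter (a configuration parameter in the article)
     * @param rootName  name of the root element (configuration parameter)
     * @param rowName   name of each row element (configuration parameter)
     */
    static String convert(String text, String delimiter, String rootName, String rowName) {
        String[] lines = text.split("\\R");            // split on any line break
        String splitter = Pattern.quote(delimiter);    // treat the delimiter literally
        String[] fields = lines[0].split(splitter, -1);

        StringBuilder xml = new StringBuilder();
        xml.append('<').append(rootName).append(">\n");
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                continue;                              // skip blank lines
            }
            String[] values = lines[i].split(splitter, -1);
            xml.append("  <").append(rowName);
            // Assumption: header names are valid XML attribute names.
            for (int f = 0; f < fields.length && f < values.length; f++) {
                xml.append(' ').append(fields[f].trim())
                   .append("=\"").append(escape(values[f].trim())).append('"');
            }
            xml.append("/>\n");
        }
        xml.append("</").append(rootName).append(">\n");
        return xml.toString();
    }
}
```

For example, `DelimitedToXml.convert("id|coupon\nBOND1|4.5\n", "|", "bonds", "row")` yields a `bonds` root containing one `row` element with `id` and `coupon` attributes.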

Handling Large XML Files
The most important technical aspect of the rules engine is its ability to handle very large XML files (over 100MB). These large files typically contained data for more than 10,000 convertible bond instruments, with each instrument having over 200 attributes. Loading an entire file into memory for DOM parsing was not feasible, so we adopted a divide-and-conquer strategy. The XML splitter (based on the SAX parser) extracted a subset of instruments, built a document object from that subset, and handed the document to the XSLT transformation. To reduce the file size, we extracted only relevant field attributes from the input file using selector XSLT rules. (Figure 3 provides a schematic representation of this process.)
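
One way to picture the splitter is a SAX handler that streams through the feed and emits a small DOM document for every batch of instruments. The sketch below is an assumption-laden illustration rather than the original module: the class name `XmlSplitter`, the `batch` root element, and the attribute whitelist are hypothetical, and the attribute filtering is done in Java here, whereas the article did it with selector XSLT rules.

```java
import java.io.Reader;
import java.util.Set;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: streams a large feed with SAX and emits small DOM batches. */
class XmlSplitter extends DefaultHandler {
    private final String rowName;          // element that represents one instrument
    private final int batchSize;           // instruments per emitted document
    private final Set<String> keep;        // attributes worth carrying forward
    private final Consumer<Document> sink; // receives each finished batch
    private Document batch;
    private int rowsInBatch;

    XmlSplitter(String rowName, int batchSize, Set<String> keep, Consumer<Document> sink) {
        this.rowName = rowName;
        this.batchSize = batchSize;
        this.keep = keep;
        this.sink = sink;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (!qName.equals(rowName)) {
            return;                        // ignore everything except instrument rows
        }
        if (batch == null) {
            batch = newBatch();
        }
        Element row = batch.createElement(rowName);
        for (int i = 0; i < atts.getLength(); i++) {
            if (keep.contains(atts.getQName(i))) {
                row.setAttribute(atts.getQName(i), atts.getValue(i));
            }
        }
        batch.getDocumentElement().appendChild(row);
        if (++rowsInBatch == batchSize) {
            flush();
        }
    }

    @Override
    public void endDocument() {
        flush();                           // emit the final, possibly smaller batch
    }

    private void flush() {
        if (batch != null) {
            sink.accept(batch);
            batch = null;
            rowsInBatch = 0;
        }
    }

    private static Document newBatch() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("batch")); // hypothetical root name
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not create batch document", e);
        }
    }

    /** Parses the feed once; memory use is bounded by the batch size, not the file size. */
    static void split(Reader feed, String rowName, int batchSize,
                      Set<String> keep, Consumer<Document> sink) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(feed), new XmlSplitter(rowName, batchSize, keep, sink));
        } catch (Exception e) {
            throw new IllegalStateException("Could not split feed", e);
        }
    }
}
```

Only one batch is held in memory at a time, which is the point of combining SAX for reading with DOM for the per-batch transformation input.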

Transforming and Saving Results to a Persistent Store
The XML transformer served as the main module, coordinating the activities of the rules engine and creating an instance of the XML splitter to parse the XML file into smaller subsets. With each extracted subset, the transformer used the XSLT processor to apply the business rules. After transformation, the resulting XML data was saved into the database using the Oracle XML SQL utility API. (Figure 4 illustrates this process.) By carefully monitoring system performance, we determined an optimum batch size for each of the feeds. We created intermediary XML documents for each of the input files from the vendor and used XPath to access content from these imported documents in the main module.
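
The coordinating loop might look something like the sketch below, which uses the standard JAXP API rather than the Oracle XSLT processor and XML SQL utility named above. The class name `BatchTransformer` and the `sink` callback are hypothetical; the callback stands in for whatever persists the output (database, file, or both) and is deliberately left abstract.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch of the transformer loop: compile rules once, apply them per batch. */
class BatchTransformer {
    private final Templates rules; // compiled stylesheet, reusable across batches

    BatchTransformer(String stylesheet) {
        try {
            rules = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(stylesheet)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not compile stylesheet", e);
        }
    }

    /**
     * Applies the business rules to each batch and hands the result to the sink.
     * The sink is an assumption: it stands in for the persistence step.
     */
    void run(List<String> batches, Consumer<String> sink) {
        for (String batch : batches) {
            try {
                StringWriter out = new StringWriter();
                rules.newTransformer().transform(
                        new StreamSource(new StringReader(batch)), new StreamResult(out));
                sink.accept(out.toString());
            } catch (Exception e) {
                throw new IllegalStateException("Transformation failed", e);
            }
        }
    }
}
```

Compiling the stylesheet into a `Templates` object once avoids re-parsing the rules for every batch, which matters when a feed is split into many subsets.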

Business Rules - XSLT
After carefully reviewing the various data and working with the business users, we created an initial set of business rules for each data source. From the outset we organized the XSLT rules in layers to promote modularity and to enable the reuse of generic rules common to multiple sources of data. We created a template for each rule with guidelines for variable names, formatting, and rules logic.

The following are examples of the business rules:

  • Rule to translate the coupon frequency: Replace numeric data 1, 2, 4, and 12 with "A" for annual, "S" for semiannual, "Q" for quarterly, and "M" for monthly, respectively. There were many such "code/decode" rules in the engine (see Listing 1).
  • Rule to set the maturity date: Set the maturity date to "01/01/2001" for perpetual instruments (instruments for which marketSector="Pfds") when the maturity date on the incoming file is null or blank.
    Changing a field based on the content of another field was another common kind of rule definition (see Listing 2).
  • Rule for par: For legacy European currencies that were replaced by the euro, par is calculated by dividing the legacy par amount by the FX rate. An XML file holds the old European currency codes and their exchange rates to the euro. We had many such rules that referenced legacy data for rules processing (see Listing 3).
  • Rule to obtain the number of days between two dates: Here we use Java functions to obtain the number of days between two dates. We had many rules involving date manipulation (see Listing 4).
  • Rule to extract time series elements: Here we show how to extract schedule data from an imported file. The ability to manipulate time series data from schedules was an important feature of the XSLT rules definitions (see Listing 5).
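
To give a feel for what such rules look like, the sketch below embeds a small XSLT 1.0 stylesheet in a Java class and runs it with the standard JAXP API. It is not one of the article's listings: the class name `SampleRules`, the input attribute names `freq` and `maturity`, and the output element names are hypothetical (only `marketSector="Pfds"` and the code values come from the rules above). The `daysBetween` method is a modern illustration of the kind of Java function a date rule could call; it assumes MM/dd/yyyy dates, and wiring it in as an extension function is processor-specific and not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch of two business rules expressed in XSLT 1.0. */
class SampleRules {

    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/\"><bonds><xsl:apply-templates select=\"//row\"/></bonds></xsl:template>"
      + "<xsl:template match=\"row\"><bond>"
      // Code/decode rule: numeric coupon frequency becomes a letter code.
      + "<xsl:attribute name=\"couponFrequency\"><xsl:choose>"
      + "<xsl:when test=\"@freq = 1\">A</xsl:when>"
      + "<xsl:when test=\"@freq = 2\">S</xsl:when>"
      + "<xsl:when test=\"@freq = 4\">Q</xsl:when>"
      + "<xsl:when test=\"@freq = 12\">M</xsl:when>"
      + "</xsl:choose></xsl:attribute>"
      // Dependent-field rule: blank maturity on a perpetual gets a fixed date.
      + "<xsl:attribute name=\"maturity\"><xsl:choose>"
      + "<xsl:when test=\"@marketSector = 'Pfds' and normalize-space(@maturity) = ''\">01/01/2001</xsl:when>"
      + "<xsl:otherwise><xsl:value-of select=\"@maturity\"/></xsl:otherwise>"
      + "</xsl:choose></xsl:attribute>"
      + "</bond></xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the sample rules to a feed document and returns the normalized XML. */
    static String normalize(String feedXml) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)))
                    .transform(new StreamSource(new StringReader(feedXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Rule evaluation failed", e);
        }
    }

    /** Number of days from start to end; both dates assumed to be MM/dd/yyyy. */
    static long daysBetween(String start, String end) {
        DateTimeFormatter format = DateTimeFormatter.ofPattern("MM/dd/uuuu");
        return ChronoUnit.DAYS.between(LocalDate.parse(start, format), LocalDate.parse(end, format));
    }
}
```

Because each rule is a self-contained `xsl:choose`, a business user can add or change a code mapping without touching the Java that runs the stylesheet.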

Reporting Tool
With active participation from the business user, we defined report templates, allowing the end user to create reports with minimal effort. In addition to the above report templates, we provided the capability to suppress known differences/exceptions in the daily comparison of data elements. Using the report/job scheduler, the user could save daily production reports in XLS/PDF formats and subsequently transfer the files to a shared network drive for distribution to other users.
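
The suppression of known differences was handled inside the reporting tool, but the underlying idea can be illustrated in a few lines of Java. Everything in this sketch is hypothetical, including the class name `DiscrepancyFilter` and the "instrument/field" key convention: it compares two vendors' values field by field and skips any difference a user has already tagged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: reports vendor differences, skipping ones already tagged as known. */
class DiscrepancyFilter {

    /**
     * @param vendorA    values keyed by "instrument/field" (assumed key convention)
     * @param vendorB    values from the second vendor, same key convention
     * @param suppressed keys a user has already reviewed and tagged
     * @return keys whose values differ and have not been suppressed, in sorted order
     */
    static List<String> compare(Map<String, String> vendorA,
                                Map<String, String> vendorB,
                                Set<String> suppressed) {
        List<String> differences = new ArrayList<>();
        for (String key : new TreeSet<>(vendorA.keySet())) {
            if (!vendorB.containsKey(key) || suppressed.contains(key)) {
                continue; // nothing to compare against, or already a known difference
            }
            if (!Objects.equals(vendorA.get(key), vendorB.get(key))) {
                differences.add(key);
            }
        }
        return differences;
    }
}
```

Tagging a key once keeps it out of every subsequent daily comparison, so the report shows only new discrepancies.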

Lessons Learned
The system has been well received by our client and continues to enjoy heavy use in their daily processing of convertible bond data. We achieved user satisfaction by empowering the business user with the ability to write and control the rules engine. From a technical perspective, this project helped remind us of several key points:

  • XSLT is a powerful framework for manipulating XML data.
  • XSLT can be written in simple, easy-to-understand declarative rules.
  • XSLT stylesheets can be used effectively to manage rules definitions.
  • Using a combination of SAX and DOM parsing with large files is effective and efficient.
  • The paradigm of modular design extends well to the organization of business rules.

About Sam Natarajan
The author, Mr. Sam Natarajan, is the Founder and CEO of Harvest Technology Corporation (www.HarvestTechnology.com), a leading software solutions provider for financial services clients. Mr. Natarajan has 20 years of experience developing technology solutions in the financial services industry. He holds a Master’s Degree in Computer Science from the Stevens Institute of Technology and an MBA from New York University. Mr. Natarajan worked for over 12 years at Merrill Lynch building web applications, global databases, financial models, and trading and risk management systems. In addition, his time at Merrill Lynch included four years as a derivatives trader in the Fixed Income and Equity divisions. His principal duties as CEO of Harvest Technology Corporation include the oversight and management of the company's development team of Java programmers, XML developers, data modelers, and project managers.
