By Karl Schwamb, Kenneth Hughes, Hans Tallis
September 6, 2002
The publishing industry has long used markup languages to produce publications for a general readership. Now XML greatly facilitates the design of publishing systems that control not only formatting but also content selection, according to the needs of the individual reader.
Customers of financial services firms don't need to settle for advice written for a general readership; they can receive plans so personalized and uniquely suited to them that they'll feel as if a team of experts conferred about their situations and wrote specific, 200-page recommendations for planning their financial futures.
Through the application of XML standards and technologies to the financial planning process, the system described in this article automatically produces highly personalized financial plans. The article emphasizes the role XML plays in such a financial planning system, and presents techniques that are applicable to any document personalization system that must operate and evolve in a production setting.
Specifically, we discuss the use of:
- XMLC in the Web presentation layer for managing the data that drives personalization
- XSLT for assembling personalized documents
- DocBook for representing personalized documents
- Meta-XSLT for generating user-readable documentation of the personalization process
We receive simple personalized documents nearly every time we sort the day's mail - utility company invoices and brokerage account reports are familiar examples. Basic document generation packages that support the production of these documents provide certain simple features:
- Inclusion of parameterized numeric and string data
- Limited use of custom graphics, such as an asset-allocation pie chart or savings-goal thermometer
- Limited conditional text logic, typically at the "paragraph" level of granularity
- Performance supporting upwards of 10 million reports per month
What follows describes an XML-based document production system that extends the functional and technical capabilities of those simple packages. The features include:
- Flexible structural control, including document organization with hyperlinks
- Production-quality print graphic format (scalable vector format) and efficient Web graphic format (raster)
- Full layout control capabilities (e.g., headers/footers, tables, flowing text)
- Multiple output formats (PDF, RTF, XHTML, ASCII text)
- High-visibility and direct-manipulation control of document content; clients review and directly edit the document generation templates
- Text-inclusion logic of arbitrary complexity: any Boolean expression can control conditional text, which may be as fine grained as the subword level (e.g., verb tense endings) or as coarse grained as entire chapters
- Platform portability through Java implementation
- Standards-based protocols and languages
These features allow us to meet the demanding needs of high-volume financial plan generation. Comprehensive financial plans that cover areas such as investment, retirement, and estate planning, while simultaneously addressing tax implications, insurance sufficiency, and risk tolerance, can be over 200 pages. The input information for such plans numbers in the hundreds of data items and spans areas such as income, expenses, family educational goals, and retirement lifestyle.
Subtle changes in data values can have ripple effects throughout the plan, affecting numerical results, item cardinalities, and even the suitability of entire chapters. The directed alteration of input data in "what-if" scenarios requires rapid turnaround. Geographical distribution is necessary for user and customer convenience, input data collection, review of generated reports, and support of international users. Once the financial data has been analyzed and alternatives have been considered, high-quality financial plan documents are printed and delivered to customers.
To produce a high-quality, branded presentation for the Web site supporting the effort, we used a separate Web design firm to develop the look-and-feel of the Web pages. The site was designed in several iterations, requiring several updates for the programming team. This cooperative effort required careful separation of the work of the design team versus the programming team.
The group of certified financial planners and legal counsel advising the effort is even more critical to its success. This team must carefully monitor the wording of the financial reports to ensure they are correct and complete over the many combinations that can be generated by the system. It's important for such a team to have good visibility into the document generation process.
Architectural Overview
Early in the project we decided to use a Web-based interface to the application. This addresses the need to (1) handle a geographically dispersed and mobile user community, (2) support a variety of user platforms, and (3) comply with open standards important to IT departments. To address the need for multiple user roles (such as field users, managers, administrators, reporters), the Web interface supports many configurations. The requirement that a separate organization develop the look-and-feel of the application meant the presentation specification and its programmatic implementation should be as loosely coupled as possible.
These requirements were satisfied quite well using the XMLC tool, which produced a much cleaner separation between presentation and implementation than server page technologies. Figure 1 illustrates how page designers produced validated XHTML that Java developers could use in the application server without manual modification. (This process is described in more detail in the next section.)
Once the customer data is captured, it's analyzed and used to produce a customized financial plan. The customer data is stored in a conventional relational database - it's extracted by custom Java code that performs numerous calculations to analyze the customer's situation and produce tailored recommendations. The analysis results are represented in a custom XML document that's completely data-centric. XSLT stylesheets are employed in a "pull" style to conditionally include text fragments and graphics. The target representation of the result is in DocBook, a rich XML application for representing books and articles. This approach to generating a personalized financial plan in a generic format is shown in Figure 2. (The process is elaborated in a later section.)
Once a financial plan has been generated in the intermediate DocBook form, the standard DocBook XML stylesheets are used to produce output in XHTML (for the Web) or RTF (for postprocess manual customization). PDF output, for high-quality print production, is also produced with the assistance of custom XSL-FO stylesheets. The availability of source code for these standard stylesheets is an important risk-reduction factor: they can be customized to produce the desired output styling, if necessary. Because they are included with DocBook, they also greatly reduced the amount of custom code needed to produce high-quality output. Figure 3 illustrates the use of the standard DocBook stylesheets. (Further information on this approach is presented later.)
When and how text fragments are included in a financial plan is a highly sensitive issue. Hence it's important that CPAs and legal advisors have visibility into the process. This is achieved by automatically producing documentation from the stylesheets used to generate personalized financial plans. A meta-XSLT stylesheet is used to translate the financial plan generation stylesheets into readable documentation. This automatic process ensures that the documentation always reflects the current state of financial plan generation (see Figure 4). Once the documentation in DocBook is produced, the same process illustrated in Figure 3 is used to present the documentation to the content and legal experts of the business sponsor. (This process is described in more detail later.)
The processes illustrated in the figures are carefully coordinated to reduce user wait time. The online Web interaction that manages customer data is designed to operate on current information in the database. A queue of user requests feeds separate processes that implement Figures 2 and 3. When the user requests have been completed, generated documents can be retrieved for viewing or sent for hard-copy print production. This allows for rapid Web interaction through a pool of machines configured for high user interaction. A separate pool of machines consumes and satisfies requests for document production. Since the document production task is batch-oriented, those machines utilize a different configuration. The process illustrated in Figure 4 is an offline process performed once per build for the entire application.
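The decoupling described above, in which interactive pages enqueue requests and a separate pool produces documents, can be sketched in a few lines. This is only an in-process illustration under stated assumptions: the production system uses separate pools of machines, and the class name, the `produceAll` method, the stop marker, and the `plan-<id>.pdf` naming are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class PlanRequestQueueSketch {
    // Hypothetical marker telling the worker that no more requests will arrive.
    private static final String STOP = "__STOP__";

    /** Enqueues one request per customer and returns the documents a worker produced. */
    static List<String> produceAll(List<String> customerIds) {
        BlockingQueue<String> requests = new LinkedBlockingQueue<>();
        List<String> finished = new ArrayList<>();

        // Stand-in for the batch-oriented document production pool.
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String id = requests.take();
                    if (STOP.equals(id)) {
                        break;
                    }
                    // A real worker would run the plan generation and styling steps here.
                    finished.add("plan-" + id + ".pdf");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        try {
            // Stand-in for the interactive Web tier: enqueue and return immediately.
            for (String id : customerIds) {
                requests.put(id);
            }
            requests.put(STOP);
            worker.join(); // only the sketch waits; a Web tier would not block
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for plans", e);
        }
        return finished;
    }

    public static void main(String[] args) {
        System.out.println(produceAll(List.of("1001", "1002")));
    }
}
```

The point of the queue is that the interactive tier never waits on a long-running transformation; it only records the request.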
Managing Personal Financial Data
To manage customers' personal financial data, financial
analysts across the country need convenient access to centralized
financial records. All of the usual advantages of Web-based systems
(user familiarity with the interface, universal availability of
browsers, ubiquity of network infrastructure, etc.) applied to this
system. One particularly salient characteristic is the amount of
information that must be managed for each customer. A team of
specialists in user interface and graphic design sketched nearly 100
form-based pages for the management of customer data. Coordination of
the initial development and subsequent maintenance of these pages
between graphic design and programming teams was greatly facilitated
by the development of a unified framework for page management based
on XMLC.
XMLC compiles XHTML into Java code that can be manipulated programmatically at page presentation time to include dynamic information. Graphic designers can create page layouts using their favorite editing systems, preview their work in standard Web browsers, and validate the XHTML encoding using an XML validation tool. Programmers can independently use XMLC to compile the completed pages into a Java-based representation that can be manipulated at page presentation time via a standard DOM-based API.
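To give a feel for DOM-based page population, the sketch below parses a designer's mockup at run time and replaces the placeholder text of an element selected by its ID. It deliberately uses only the standard JAXP and DOM APIs bundled with the JDK rather than XMLC's generated classes, so it illustrates the idea and not XMLC itself; the class name, the `populate` method, and the `CustomerName` ID are assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomPagePopulationSketch {

    /** Replaces the text of every element whose id attribute equals the given id. */
    static String populate(String xhtml, String id, String value) {
        try {
            Document page = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));

            // Without a DTD, "id" is an ordinary attribute, so scan the elements
            // instead of relying on Document.getElementById.
            NodeList elements = page.getElementsByTagName("*");
            for (int i = 0; i < elements.getLength(); i++) {
                Element element = (Element) elements.item(i);
                if (id.equals(element.getAttribute("id"))) {
                    element.setTextContent(value);
                }
            }

            // Serialize the updated DOM back to markup.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.METHOD, "xml");
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(page), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not populate page", e);
        }
    }

    public static void main(String[] args) {
        String mockup = "<html><body><p>Welcome, "
                + "<span id=\"CustomerName\">Jane Placeholder</span></p></body></html>";
        System.out.println(populate(mockup, "CustomerName", "Pat Smith"));
    }
}
```

Because the placeholder text lives in the mockup itself, the designer's page still previews sensibly in a browser before any code touches it.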
XMLC's separation of page-layout from back-end development provided substantial benefit to the project. Additionally, because there are so many pages to control and XMLC provided complete presentation-time control over each individual page, further gains were achieved by identifying common page population and database updating code that could be factored out across the ensemble of pages. This idea was taken to the extreme of writing an interpreter, called PageManager, that handles all page population and database updating.
PageManager uses the XMLC-generated DOMs of each page to examine the IDs placed on elements destined to receive dynamic content. Each ID is built from a simple grammar that is expressive enough to represent the customer's financial data in terms of scalars, lists, and constraints. PageManager interprets the IDs, uses reflection against Java-based domain objects (backed by an Oracle or DB2 database) to retrieve dynamic data, updates the DOM representation of the page, and writes out the XHTML representation at page presentation time. For example, PageManager updates the DOM representation to populate forms, label buttons and column headings, or add rows to tables. It also writes "name" attributes using a special grammar that allows PageManager to update the Java-based domain objects when the page is returned after user editing.
While a full description of PageManager's ID and name grammars is beyond the scope of this article, an example will serve to highlight the basic operation. The following XHTML tag is a typical prototype row adorned with a PageManager ID that will cause the row to be repeated for each current customer asset with a CategoryUID equal to INVESTMENT_ASSET:
<tr id=":Profile:Asset.CategoryUID-INVESTMENT_ASSET">
Within such a prototype row might be an <input type="text"/> tag with an ID that directs PageManager to populate the control with the name of the institution related to the asset:
id=":Profile:Asset.CategoryUID-INVESTMENT_ASSET.InstitutionName"
PageManager would automatically assign a "name" attribute to each such input so that, on form submission, it can readily identify which domain object and property each value belongs to and update the database appropriately:
name="Asset.UID-12345.InstitutionName"
Similar ID and name attributes guided the population of other controls, the updating of changed fields to the database, and the navigation to detail pages associated with data in the dynamically generated rows.
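The interplay of ID interpretation and reflection can be illustrated with a small sketch. The full grammar is not given here, so the parsing below is a guess inferred only from the three examples above (a colon-separated prefix, then `Type.Property-Value.Property`); the `Asset` class, its getters, and the `resolve` method are hypothetical stand-ins for the real domain objects and for PageManager itself.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class PageManagerIdSketch {

    /** Hypothetical domain object; the real classes are database-backed. */
    public static final class Asset {
        private final String categoryUID;
        private final String institutionName;

        public Asset(String categoryUID, String institutionName) {
            this.categoryUID = categoryUID;
            this.institutionName = institutionName;
        }

        public String getCategoryUID() { return categoryUID; }
        public String getInstitutionName() { return institutionName; }
    }

    /** Reads a bean-style property reflectively, e.g. "InstitutionName" via getInstitutionName(). */
    static Object read(Object bean, String property) {
        try {
            Method getter = bean.getClass().getMethod("get" + property);
            return getter.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No readable property " + property, e);
        }
    }

    /**
     * Interprets an ID such as
     * ":Profile:Asset.CategoryUID-INVESTMENT_ASSET.InstitutionName" against candidate
     * objects: keep those satisfying the constraint, then read the trailing property.
     */
    static List<String> resolve(String id, List<?> candidates) {
        String path = id.substring(id.lastIndexOf(':') + 1);
        String[] parts = path.split("\\.");           // type, constraint, property
        String[] constraint = parts[1].split("-", 2); // property name, required value
        List<String> values = new ArrayList<>();
        for (Object candidate : candidates) {
            if (constraint[1].equals(String.valueOf(read(candidate, constraint[0])))) {
                values.add(String.valueOf(read(candidate, parts[2])));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<Asset> assets = List.of(
                new Asset("INVESTMENT_ASSET", "First National"),
                new Asset("REAL_ESTATE", "County Title"));
        System.out.println(resolve(
                ":Profile:Asset.CategoryUID-INVESTMENT_ASSET.InstitutionName", assets));
    }
}
```

Since the "name" attribute shown above has the same `Type.Property-Value.Property` shape, a similar split could plausibly locate the object to update on form submission, with a reflective setter in place of the getter.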
Controlling Plan Contents via XSLT
Financial plan personalization is the centerpiece of the
system. The Java-based (and Oracle- or DB2-backed) domain objects
mentioned in the "Managing Personal Financial Data" section represent
all of the financial data relevant to a particular customer. These
domain objects are used to perform various calculations and are then
serialized to XML, which in turn serves as the input to an XSLT-based
document generation process that produces the personalized financial
plan in DocBook format. The DocBook-based plan is then processed to
produce customer-readable financial plans in XHTML, PDF, and RTF (as
described in the following section).
A pull-based XSLT processing model is used to produce the plans. XSLT templates conditionally include DocBook elements and financial planning content according to XPath conditionals against the XML representation of the Java domain objects and calculation results. Sections, paragraphs, sentences, words, subwords, and punctuation are included or excluded according to the financial planning knowledge applied against the customer's financial data and goals.
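A minimal sketch of the pull style follows: a single template lays out the DocBook text and pulls values from the data document, with one conditional at the subword level (a plural ending) and one at the sentence level. The input vocabulary (`profile`, `child`, `shortfall`) and the wording are invented for illustration, and the example runs the stylesheet with the XSLT 1.0 processor bundled with the JDK, which is not necessarily the processor used in production.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class PullStylePlanSketch {

    // Hypothetical pull-style stylesheet: the output text drives the structure,
    // and XPath tests decide which fragments are included.
    static final String PLAN_STYLESHEET =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
          + "<xsl:template match='/'>"
          + "<para>This section covers education funding for your child"
          // Subword-level condition: add the plural ending unless there is exactly one child.
          + "<xsl:if test='count(/profile/child) != 1'>ren</xsl:if>"
          + "."
          // Sentence-level condition: mention the shortfall only when there is one.
          + "<xsl:if test='/profile/shortfall &gt; 0'>"
          + " Your current savings fall short of the goal by $"
          + "<xsl:value-of select='/profile/shortfall'/>."
          + "</xsl:if>"
          + "</para>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Applies the plan stylesheet to a data-centric profile document. */
    static String render(String profileXml) {
        try {
            Transformer plan = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(PLAN_STYLESHEET)));
            StringWriter out = new StringWriter();
            plan.transform(new StreamSource(new StringReader(profileXml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Plan generation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(
                "<profile><child/><child/><shortfall>12000</shortfall></profile>"));
        System.out.println(render(
                "<profile><child/><shortfall>0</shortfall></profile>"));
    }
}
```

With two children and a shortfall, the paragraph reads "children" and includes the shortfall sentence; with one child and no shortfall, both fragments drop out.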
Using DocBook to Represent Plans
Financial plans are represented in DocBook because of the
maturity and flexibility of the tools for processing that format. We
were able to produce immediate results using standard DSSSL
stylesheets for XHTML, RTF, and PDF. The XHTML produced was of
satisfactory quality; RTF required some DSSSL customization to get
acceptable results; PDF output was very rough, especially in the area
of table formatting. We chose the DSSSL approach initially because
the XHTML and RTF results were good, and the PDF results appeared to
be close. The XSL stylesheets were relatively immature, as were the
XSL-FO processors that were needed to produce PDF output.
However, a change in sponsor requirements led to a deemphasis of the need for RTF and a focus on the need to customize the PDF output style. The complexity of the DSSSL/OpenJade pipeline to PDF (DocBook->TeX->PS->PDF) frustrated our efforts to get clean PDF output. For example, finding how to fix footnote-formatting problems within tables along that pipeline proved to be daunting. When faced with the additional need to provide extensive stylistic customizations, we had little confidence in our ability to cajole the DSSSL-TeX-PDF pipeline into successfully producing the requested effects.
The purely XSL alternative to the DSSSL approach was still very immature. However, by working closely with the vendor of one of the leading XSL-FO processors (XEP, by RenderX), we eliminated the critical obstacles and proceeded to customize the default DocBook stylesheets in XSL to achieve the effects requested, such as:
- Scalable, vector-based EPS graphics for charts, graphs
- Tables involving cell spanning, proportional column widths, column alignment
- Customized styling for section headings and footnotes
- Graphics (logos and other stylistic designs) in headers, footers, margins
- Table of contents (TOC) customizations, table and figure TOCs removed, first chapter moved before TOC, etc.
The elegance of the XSL method of styling output is noteworthy. Mapping from an XML document to other XML documents (XHTML or XSL-FO) to produce output in the required styles using XSLT templates is a natural and effective way to specify style. DocBook's default XSL stylesheets provided a handy default that could be customized effectively by overriding selected XSLT templates.
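Customization by override relies on XSLT import precedence: a thin stylesheet imports the defaults and redefines only the templates that should differ. The sketch below shows the mechanism with a toy three-template "default" stylesheet standing in for the real DocBook XSL distribution; the stylesheet contents, the `base.xsl` name, and the `plan-heading` class are all made up for the example, and the imported stylesheet is served from memory through a standard `URIResolver`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TemplateOverrideSketch {

    // Toy stand-in for the default stylesheets.
    static final String BASE =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:template match='section'><div><xsl:apply-templates/></div></xsl:template>"
          + "<xsl:template match='title'><h2><xsl:apply-templates/></h2></xsl:template>"
          + "<xsl:template match='para'><p><xsl:apply-templates/></p></xsl:template>"
          + "</xsl:stylesheet>";

    // Customization layer: import the defaults, then override a single template.
    static final String CUSTOM =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:import href='base.xsl'/>"
          + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
          + "<xsl:template match='title'>"
          + "<h2 class='plan-heading'><xsl:apply-templates/></h2>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Styles a DocBook-like fragment with the customized stylesheet. */
    static String style(String docbookXml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Serve the imported stylesheet from memory instead of the file system.
            factory.setURIResolver((href, base) -> "base.xsl".equals(href)
                    ? new StreamSource(new StringReader(BASE), href)
                    : null);
            Transformer styler =
                    factory.newTransformer(new StreamSource(new StringReader(CUSTOM)));
            StringWriter out = new StringWriter();
            styler.transform(new StreamSource(new StringReader(docbookXml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Styling failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(style("<section><title>Estate Planning</title>"
                + "<para>Review your will.</para></section>"));
    }
}
```

Only the title rule changes; the section and paragraph rules continue to come from the imported defaults, which is what keeps a customization layer small.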
Documenting Plan Content Control via Meta-XSLT
The pull-based XSLT processing model usually has the
advantage of being relatively easy to read because the statements of
conditional processing are embedded within a context of the targeted
document. Even so, nontechnical people are quickly distracted, or
worse, confused, when confronted with strange-looking XPath and other
XSLT constructs, especially when embedded within DocBook tagging,
which itself is unfamiliar to them. Furthermore, the high degree of
personalization described in the previous section makes for such a
high control/content ratio that only the most die-hard XSLT jocks
like looking at the massively conditional templates.
Our solution to the problem of how to produce a readable specification of the conditional processing embodied in the XSLT templates was, perhaps surprisingly, to write more XSLT templates. The second set of templates forms a transformation from the original, difficult-to-read set of templates in XSLT to a rendition in DocBook that uses English-based descriptions of the conditional processing. This easily read specification in DocBook is then processed by the same stylesheets used to customize individual financial plans, so the business analysts and legal advisors can review the conditional text processing in context and in the exact style of the final product.
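The core trick is that a stylesheet is itself an XML document, so another stylesheet can take it as input. The sketch below is a much-reduced illustration, not the templates used in the project: a push-style stylesheet matches every `xsl:if` in a toy plan stylesheet and emits a DocBook paragraph pairing the condition with the text it controls. Unlike the system described here, it echoes the raw XPath instead of translating it into English, and all stylesheet contents and names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class MetaXsltSketch {

    // Toy plan-generation stylesheet; here it is the INPUT document.
    static final String PLAN_STYLESHEET =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:template match='/'>"
          + "<para>Review your will."
          + "<xsl:if test='/profile/spouse'> Name your spouse as beneficiary.</xsl:if>"
          + "</para>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Meta-stylesheet: push-style rules that match XSLT elements in the input.
    static final String META_STYLESHEET =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
          + "<xsl:template match='/'>"
          + "<section><xsl:apply-templates select='//xsl:if'/></section>"
          + "</xsl:template>"
          + "<xsl:template match='xsl:if'>"
          + "<para>When <xsl:value-of select='@test'/>, include: "
          + "<xsl:value-of select='normalize-space(.)'/></para>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Produces DocBook documentation describing the conditions in a stylesheet. */
    static String document(String stylesheetXml) {
        try {
            Transformer meta = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(META_STYLESHEET)));
            StringWriter out = new StringWriter();
            meta.transform(new StreamSource(new StringReader(stylesheetXml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Documentation generation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(document(PLAN_STYLESHEET));
    }
}
```

Because the documentation is derived from the stylesheets on every build, it cannot drift from the rules that actually generate the plans.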
For more details on how a small number of push-based XSLT templates produces documentation on a very large set of pull-based XSLT templates in terms that a business analyst could easily read, see our "Document XSLT Automatically" at www.sys-con.com/xml/article.cfm?id=419, a companion article in the June 2002 issue (Vol. 3, issue 6) of XML-Journal.
Discussion and Conclusions
The approach described above has produced the most advanced
financial planning document pipeline, in large part because of the
leverage of XML technology. While the project requirements were
numerous and demanded a great deal of flexibility, XML was up to the
task.
Despite the success of the above approach, there are a few areas for improvement. It would have been helpful to have a single XML format for defining analysis calculations that could be used both to present these calculations to the project sponsors and to generate implementation code. These calculations are described in standard industry publications.
While MathML holds promise in that it can describe the presentation of calculations and be incorporated into DocBook, we know of no way to generate implementation code from this MathML markup. The addition of such a capability would, of course, provide visibility into a large requirements area that currently requires a great deal of manual effort.
Graphics were a challenge because each output format required a different graphic format. Since the output format was not known until the end of the process, the document generation pipeline had to produce every graphic in multiple formats. It would be much better if SVG were natively supported in browsers, Microsoft Word, and PDF.
Another area for improvement is the representation and presentation of business rules. Some key areas of financial analysis are best represented by rules that can be executed within a rule-based system. While several rule standards have been proposed, such as SRML, RuleML, and BRML, none have the status of an approved standard, and no path exists for documenting the rules.
Taking a wider look at the financial planning landscape, it's clear that once a plan is produced, customers prefer to have their plans monitored as they make updates to their financial situation. Many of the XML standards that already exist for trading and exchanging financial transaction data, such as IFX, could be used to update customer data over time and to perform plan monitoring. The monitoring activity can be used to make automatic portfolio adjustments, notify customers of deviations from plan goals, and cross-sell and up-sell financial instruments.
References
- XMLC: http://xmlc.enhydra.org
- DocBook: www.docbook.org
- MathML: www.w3.org/TR/REC-MathML
- DocBook MathML module: www.oasis-open.org/docbook/xml/mathml/index.shtml
- SVG: www.w3.org/Graphics/SVG/Overview.htm8
- SRML: http://xml.coverpages.org/srml.html
- RuleML: www.dfki.uni-kl.de/ruleml
- BRML: http://xml.coverpages.org/brml.html
- IFX: www.ifxforum.org
Copyright © 2002 SYS-CON Media, Inc. — All Rights Reserved.
About Karl Schwamb
Karl B. Schwamb (kbs@colonnadesoftware.com) is President of Colonnade Software (http://www.colonnadesoftware.com/), a consulting firm specializing in distributed systems. He has led systems design and development efforts for several Fortune 500 companies, primarily in the area of Financial Services. Many of these systems employ cutting-edge technology such as XML, Java, middleware, and intelligent systems in environments that demand high availability, high throughput, and security.
About Kenneth Hughes
Kenneth J. Hughes (kjh@entel.com) is President of Entelechy Corporation (http://www.entel.com), a consulting firm that specializes in XML. He received a BS in Electrical and Computer Engineering and Mathematics in 1985 and an MS in Electrical and Computer Engineering in 1988 from Carnegie Mellon. He has provided strategic guidance, architectural design, and hands-on development for organizations seeking to apply XML to both traditional publishing and Internet-based systems.
About Hans Tallis
Hans Tallis is a founder of ExpLore Reasoning Systems (www.ers.com), an automated financial planning consultancy.
He also designs highly available expert systems for the mortgage, airline, and network industries.