Generate PDF Files with XML, XSL-FO, and FOP

This article will give you enough information to use the major features of XSL Formatting Objects (XSL-FO) in conjunction with Apache's FOP API for rendering documents in Adobe's Portable Document Format (PDF).

The W3C's specification for Extensible Stylesheet Language comes in two parts:

  • XSLT: A language for transforming XML documents
  • XSL-FO: An XML vocabulary for specifying formatting semantics

FOP (Formatting Objects Processor), which is part of Apache's XML project, is the world's first print formatter driven by XSL formatting objects. It's a Java application that reads an XSL-FO file and renders the output in PDF format. Other formats supported are XML, SVG, PS, PCL, Print, AWT, MIF, and TXT. To dig deeper, you may want to visit http://xml.apache.org/fop.

This tutorial uses Sun's JAXP API for the XSLT transformation and Apache's FOP API for rendering PDF output. We'll use a Journal Subscription form that lets the user enter details such as name, payment mode, and bank details to subscribe to a journal. The form is a simple JSP page. Once the form is submitted, the request is forwarded to a servlet that captures the form details and constructs an XML string. An XSL-FO stylesheet is applied to this dynamically created XML string through the JAXP API. The intermediate ".fo" file produced by the transformation is then used as input by the org.apache.fop API to render the PDF output.
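As a rough illustration of the servlet step that turns form fields into an XML string, here is a minimal sketch. The class name, the `<subscription>` root element, and the field names are hypothetical (the article's own servlet source is not reproduced here), and the field keys are assumed to be valid XML element names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: builds an XML string from form fields (all names here are hypothetical). */
final class SubscriptionXmlBuilder {

    /** Escapes the five characters that are special in XML text and attribute values. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Wraps each field in an element named after its key, inside a <subscription> root. */
    static String toXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>\n<subscription>\n");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            xml.append("  <").append(field.getKey()).append('>')
               .append(escape(field.getValue()))
               .append("</").append(field.getKey()).append(">\n");
        }
        return xml.append("</subscription>\n").toString();
    }

    public static void main(String[] args) {
        // In a servlet these values would come from request.getParameter(...).
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "A. Reader");
        fields.put("paymentMode", "Cheque");
        fields.put("bankName", "Smith & Sons");
        System.out.print(toXml(fields));
    }
}
```

Escaping the submitted values matters because a stray `&` or `<` in user input would otherwise make the XML string ill-formed and cause the later transformation to fail.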

The following steps are used to create our subscription form in PDF format:
1.   Create an XSL-FO stylesheet.
2.   Transform the XML with the XSL-FO stylesheet using the JAXP API to produce an intermediate ".fo" file.
3.   Use org.apache.fop API to convert the ".fo" file to PDF.

Creating the XSL-FO Stylesheet

Listing 1 is the outline of a simple XSL-FO stylesheet.

  • <fo:root> is the root element.
  • <fo:layout-master-set> encapsulates the simple-page-master.
  • <fo:simple-page-master> defines the page height, width, top and bottom margins, and left and right margins. Its master-name attribute is referenced by the page sequence that uses the page characteristics defined here.
  • <fo:page-sequence> defines a sequence of pages and refers to the simple-page-master through its master-reference attribute, so the pages in the sequence take on the characteristics defined in that simple-page-master. The value of master-reference must match the master-name of the simple-page-master. To control the order in which several simple-page-masters are used, define an <fo:page-sequence-master> in the layout-master-set and reference it instead.
  • <fo:flow> defines the content that flows into the xsl-region-body region of the page.
  • <fo:block> holds the actual content that appears in the PDF document. Each fo:block starts on a new line. An fo:block can contain <fo:inline> elements, and you may also place tables within an fo:block.
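The element outline described above can be sketched as a small XSL-FO document. This is not the article's Listing 1; the master name, page dimensions, and text are illustrative assumptions. To keep the example runnable without FOP on the classpath, the document is held in a Java string and only checked for well-formedness and for a matching master-name/master-reference pair.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Sketch only: a minimal XSL-FO skeleton (values are illustrative, not from Listing 1). */
final class FoSkeleton {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    static final String SKELETON = String.join("\n",
        "<fo:root xmlns:fo=\"" + FO_NS + "\">",
        "  <fo:layout-master-set>",
        "    <fo:simple-page-master master-name=\"subscription-page\"",
        "        page-height=\"29.7cm\" page-width=\"21cm\"",
        "        margin-top=\"2cm\" margin-bottom=\"2cm\"",
        "        margin-left=\"2.5cm\" margin-right=\"2.5cm\">",
        "      <fo:region-body/>",
        "    </fo:simple-page-master>",
        "  </fo:layout-master-set>",
        "  <fo:page-sequence master-reference=\"subscription-page\">",
        "    <fo:flow flow-name=\"xsl-region-body\">",
        "      <fo:block font-size=\"14pt\">Journal Subscription</fo:block>",
        "      <fo:block>Name: <fo:inline font-weight=\"bold\">A. Reader</fo:inline></fo:block>",
        "    </fo:flow>",
        "  </fo:page-sequence>",
        "</fo:root>");

    /** Parses the string as namespace-aware XML; fails if it is not well-formed. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Not well-formed XML", e);
        }
    }

    /** True when the first page-sequence refers to the first simple-page-master by name. */
    static boolean pageMasterResolves(Document fo) {
        Element master = (Element) fo.getElementsByTagNameNS(FO_NS, "simple-page-master").item(0);
        Element sequence = (Element) fo.getElementsByTagNameNS(FO_NS, "page-sequence").item(0);
        return master.getAttribute("master-name").equals(sequence.getAttribute("master-reference"));
    }

    public static void main(String[] args) {
        System.out.println("page master resolves: " + pageMasterResolves(parse(SKELETON)));
    }
}
```

The check in `pageMasterResolves` mirrors the rule stated in the list: the page sequence can only pick up a page layout whose master name it references exactly.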

    To place the contents in each column of a table, insert the fo:block element in the fo:table-cell element as shown in Listing 2.

  • <fo:table>: Equivalent to an HTML <TABLE> tag
  • <fo:table-column>: Defines one column of the table; use one element per column (here, one)
  • <fo:table-row>: Equivalent to an HTML <TR> tag
  • <fo:table-cell>: Equivalent to an HTML <TD> tag

    You may nest tables within tables, as in HTML. To place another table in the preceding example, insert a complete <fo:table></fo:table> element inside the <fo:block></fo:block> of a table cell. The Subscription.xsl in our case study uses nested tables, as seen in Listing 3.

    A stylesheet now needs to be applied to each element in the XML String document that is constructed dynamically in the servlet. For the full source code of Subscription.xsl see the note at the end of this tutorial.

    Using the org.apache.fop API
    As mentioned earlier, the XML document is generated dynamically from the values entered in the Subscription.jsp form. Subscription.xsl is applied to this XML document, and the transformation is carried out with the JAXP API.
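The transformation step can be sketched with the standard `javax.xml.transform` classes. This is not the article's Listing 4: the inline stylesheet is a deliberately tiny stand-in for Subscription.xsl, and the class name and the `<subscription>`/`<name>` elements are assumptions made for illustration. The result is returned as a string here; to produce the intermediate ".fo" file instead, a `StreamResult` wrapping a `java.io.File` could be used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch only: applies an XSLT stylesheet to an XML string and returns the XSL-FO result. */
final class XmlToFo {

    // A tiny stand-in for Subscription.xsl (hypothetical content).
    static final String STYLESHEET = String.join("\n",
        "<xsl:stylesheet version=\"1.0\"",
        "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"",
        "    xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">",
        "  <xsl:template match=\"/subscription\">",
        "    <fo:root>",
        "      <fo:layout-master-set>",
        "        <fo:simple-page-master master-name=\"page\"",
        "            page-height=\"29.7cm\" page-width=\"21cm\">",
        "          <fo:region-body/>",
        "        </fo:simple-page-master>",
        "      </fo:layout-master-set>",
        "      <fo:page-sequence master-reference=\"page\">",
        "        <fo:flow flow-name=\"xsl-region-body\">",
        "          <fo:block>Name: <xsl:value-of select=\"name\"/></fo:block>",
        "        </fo:flow>",
        "      </fo:page-sequence>",
        "    </fo:root>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    /** Runs the XSLT transformation; checked JAXP errors are rethrown unchecked. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter fo = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(fo));
            return fo.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<subscription><name>A. Reader</name></subscription>";
        System.out.println(transform(xml, STYLESHEET));
    }
}
```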

    Import the following org.apache.fop classes in XSLTOPDFServlet:

    import org.apache.fop.messaging.MessageHandler;
    import org.apache.fop.apps.Driver;
    import org.apache.fop.apps.*;
    import org.apache.log.*;

    Using the JAXP API
    The "Subscription.fo" file created in the previous step (foFile in Listing 4) is used to create the PDF document with the FOP API, as shown in Listing 5. The complete source code and Subscription.pdf can be downloaded from www.sys-con.com/xml/source.cfm. You can download the org.apache.fop API from http://xml.apache.org/fop.

    I tested this application using BEA WebLogic Server 6.1. For instructions on how to set up and run the example, please refer to the README file included in the zip.

    If you don't have access to a Web server, download the FOP API and run the standalone Java program that comes as part of the download. For instructions, please refer to the README file included in the zip.

  • More Stories By Suresh Selvaraj

    Suresh Selvaraj is a Sun Certified Java Programmer. He currently works as an IT Analyst and Java Developer for Tata Consultancy Services in India.

    Most Recent Comments
    Eric Rose 07/14/08 09:20:30 AM EDT

    In this article you mention the "apache.org.fop.apps.Driver" class; this class doesn't appear in the fop.jar file in the 0.94 release of FOP from Apache. What version of FOP was this example written for? Is there a new methodology which has replaced the use of the Driver class?