Defining Mainframe Transaction's Signature with an XML Schema; How To Convert Cobol Metadata
Converting Cobol metadata into an XML Schema using regular expressions processing

In a nutshell, an XML Schema is built from primitive data types and from derived data types, which are defined in terms of primitive types or other derived types. The primitive data types can be any of the standard formats; for our application we will use just string and integer.

Simple datatypes are declared with the <simpleType> element. A declaration has a name and a base type, and it can contain a valid constraining facet. Complex datatypes are declared with the <complexType> element and are defined by extension or restriction of other datatypes.

Instead of referring to a datatype defined in another part of the same schema, derived data types can also nest datatype definitions one inside the other, as in:


<element name="COURSES"><complexType><sequence>
  <element name="COURSE-ID"><complexType><sequence>
    <element name="COURSE-TYPE"><simpleType>
      <restriction base="string"><length value="04"/></restriction>
    </simpleType></element>
    <element name="SERV-LENGTH"><simpleType>
      <restriction base="integer"><totalDigits value="05"/></restriction>
    </simpleType></element>
  </sequence></complexType></element>
</sequence></complexType></element>
Although this kind of nested definition is less clear than one that uses references, it is useful for automating the generation of the XML Schema from the copybook, as we will see soon. For a full description of the XML Schema obtained from the Cobol copybook, see Listing 3.

Regular Expressions 101
In order to convert from Cobol to XML Schema, we need to recognize certain patterns. For example, we can build a rule saying that each group item in Cobol corresponds to a complexType in the schema, or that each elementary item containing a PIC clause corresponds to a simpleType. A useful tool for recognizing patterns in a text file is the regular expression.

Regular expressions, also called regex, are used in several UNIX utilities and languages (Perl, awk, etc.). A regex lets us locate a specific pattern or a particular sequence of characters in a string. The pattern is defined using a rather powerful syntax.

Regular expressions are built around special characters that are matched against the actual string. These special characters let us create a template against which each portion of the compared text is matched and processed.

For example, the regular expression ^.PIC * matches a string that starts with exactly one character (^.), followed by the string "PIC", followed by zero or more blanks. It matches APIC and BPIC__, but not CCPIC (two characters before PIC) or PIC (no character before PIC). As this example shows, special characters play an essential role in regex definitions. Table 1 introduces the most common special characters used in regex.

Although this is a very basic list of special characters, it will suffice for our project. For more extensive information about regular expressions, see the reference section.
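The behaviour of this pattern can be checked with a few lines of Java. This is only a minimal sketch: it uses the standard java.util.regex package rather than the library used later in this article, and the class and method names (PicPatternDemo, matches) are made up for illustration.

```java
import java.util.regex.Pattern;

class PicPatternDemo {
    // One arbitrary character at the start of the string, then "PIC",
    // then zero or more blanks.
    private static final Pattern PIC = Pattern.compile("^.PIC *");

    // Returns true when the pattern is found in the text. Because of the
    // leading ^ the match can only begin at the first character.
    public static boolean matches(String text) {
        return PIC.matcher(text).find();
    }

    public static void main(String[] args) {
        String[] samples = {"APIC", "BPIC  ", "CCPIC", "PIC"};
        for (String sample : samples) {
            System.out.println("[" + sample + "] -> " + matches(sample));
        }
    }
}
```

The sketch uses find() instead of a whole-string match, so any text that merely begins with the pattern is accepted, whatever follows it.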

The Project
In order to convert a copybook into an XML Schema, I defined some conversion rules. To limit the scope of this project I will leave out some Cobol artefacts, such as arrays, and focus on the basic structure of the Cobol metadata. As homework, you can try to extend the code afterwards to include these structures.

As said before, Cobol organizes the metadata in levels. To produce an XML Schema representation, I will convert any level that does not include a PIC clause (that is, any level that doesn't define a basic field) into a complexType. Since one level usually includes other levels nested inside it, I will nest the complexType definitions to mimic the Cobol definition, using the syntax seen in the XML Schema section.

The corollary of this rule is that any definition including a PIC clause will be considered a simpleType. We will use the length as a restriction in the definition of the field.
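The two rules can be sketched in Java as follows. This is an illustration under stated assumptions, not the program described in this article: it uses java.util.regex, it only understands clauses of the form PIC X(n) and PIC 9(n), and the names ItemRules, isGroup and toSimpleType are hypothetical. The mapping of X to a string with a length facet and of 9 to an integer with a totalDigits facet follows the schema fragment shown earlier.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ItemRules {
    // Level number, data name, optional "PIC X(n)" or "PIC 9(n)", closing period.
    private static final Pattern ITEM = Pattern.compile(
        "^\\s*(\\d{2})\\s+([A-Za-z0-9-]+)(?:\\s+PIC\\s+([X9])\\((\\d+)\\))?\\s*\\.\\s*$");

    // Rule 1: a level without a PIC clause is a group item (complexType).
    public static boolean isGroup(String line) {
        Matcher m = ITEM.matcher(line);
        return m.matches() && m.group(3) == null;
    }

    // Rule 2: a level with a PIC clause becomes a simpleType whose
    // restriction carries the declared length.
    public static String toSimpleType(String line) {
        Matcher m = ITEM.matcher(line);
        if (!m.matches() || m.group(3) == null) {
            throw new IllegalArgumentException("Not an elementary item: " + line);
        }
        boolean alphanumeric = m.group(3).equals("X");
        String base = alphanumeric ? "string" : "integer";
        String facet = alphanumeric ? "length" : "totalDigits";
        return "<element name=\"" + m.group(2) + "\"><simpleType>"
            + "<restriction base=\"" + base + "\">"
            + "<" + facet + " value=\"" + m.group(4) + "\"/>"
            + "</restriction></simpleType></element>";
    }

    public static void main(String[] args) {
        System.out.println(isGroup("02 COURSE-ID."));
        System.out.println(toSimpleType("03 COURSE-TYPE PIC X(3)."));
    }
}
```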

The Cobol example seen in the first paragraph:


01 COURSES.
   02 COURSE-ID.
      03 COURSE-TYPE   PIC X(3).
      03 COURSE-NUMBER PIC 9(5).
   02 COURSE-NAME      PIC X(20).
can then be translated as a complexType called COURSES, composed of one complexType, COURSE-ID, and one simpleType, COURSE-NAME. COURSE-ID is composed, in turn, of two simpleType fields: COURSE-TYPE and COURSE-NUMBER.

So with these two simple rules I can try to produce the schema. Now I will explain the tool we will use to achieve this objective.

The Program
In order to automate the conversion to XML Schema, I coded a Java program that uses regular expressions to do the job. The Java program reads the file containing the copybook, matches it record by record against a pattern defined by a regex, and then writes a schema definition to another file. Since the definitions are usually nested, we need to keep track of the levels that are open in order to produce the closing tags (</complexType>, </element>, etc.).

My program uses a set of classes included in Jakarta ORO (mainly under org.apache.oro.text). These classes give us the basic functionality to search based on regular expressions:
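One possible way to keep track of the open levels is a stack of level numbers: a group item pushes its level, and any later item whose level number is not greater than the one on top of the stack closes that group first. The sketch below illustrates the idea only. It is not the program described in this article, it uses java.util.regex, it handles just PIC X(n) and PIC 9(n), and the names LevelTracker and convert are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LevelTracker {
    // Level number, data name, optional "PIC X(n)" or "PIC 9(n)", closing period.
    private static final Pattern ITEM = Pattern.compile(
        "^\\s*(\\d{2})\\s+([A-Za-z0-9-]+)(?:\\s+PIC\\s+([X9])\\((\\d+)\\))?\\s*\\.\\s*$");
    private static final String CLOSE_GROUP = "</sequence></complexType></element>";

    public static String convert(List<String> copybook) {
        StringBuilder out = new StringBuilder();
        Deque<Integer> openLevels = new ArrayDeque<>(); // levels of the open groups
        for (String line : copybook) {
            Matcher m = ITEM.matcher(line);
            if (!m.matches()) {
                continue; // this sketch skips records it does not understand
            }
            int level = Integer.parseInt(m.group(1));
            // An item at the same or a lower level number ends the open group.
            while (!openLevels.isEmpty() && openLevels.peek() >= level) {
                openLevels.pop();
                out.append(CLOSE_GROUP).append('\n');
            }
            if (m.group(3) == null) {
                // Group item: open a complexType and remember its level.
                out.append("<element name=\"").append(m.group(2))
                   .append("\"><complexType><sequence>\n");
                openLevels.push(level);
            } else {
                // Elementary item: emit a complete simpleType.
                boolean alphanumeric = m.group(3).equals("X");
                out.append("<element name=\"").append(m.group(2))
                   .append("\"><simpleType><restriction base=\"")
                   .append(alphanumeric ? "string" : "integer").append("\"><")
                   .append(alphanumeric ? "length" : "totalDigits")
                   .append(" value=\"").append(m.group(4))
                   .append("\"/></restriction></simpleType></element>\n");
            }
        }
        // Close whatever is still open at the end of the copybook.
        while (!openLevels.isEmpty()) {
            openLevels.pop();
            out.append(CLOSE_GROUP).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(convert(Arrays.asList(
            "01 COURSES.",
            "02 COURSE-ID.",
            "03 COURSE-TYPE PIC X(3).",
            "03 COURSE-NUMBER PIC 9(5).",
            "02 COURSE-NAME PIC X(20).")));
    }
}
```

Run against the COURSES copybook, the sketch closes COURSE-ID when it reaches the 02 COURSE-NAME record and closes COURSES at the end of the input.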

import org.apache.oro.text.awk.*;
import org.apache.oro.text.regex.MalformedPatternException;
import org.apache.oro.text.regex.Pattern;

The regex functionality is provided by three classes: Pattern, AwkMatcher, and AwkCompiler. AwkCompiler compiles a regex, as in:

Pattern pattern = compiler.compile("(\\sPIC)|(\\sVALUE)|(^ *$)|(\\sCOPY\\s)");

The compiled pattern can be used afterwards to match against a string (contained here in an irecord variable) using an AwkMatcher object:

matcher.contains(irecord,pattern)
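For readers who do not have the Jakarta classes at hand, the same compile-then-match flow can be tried with the standard java.util.regex package. The sketch below is a stand-in, not the author's code: it reuses the regex shown above, treats find() as the counterpart of contains(), and the names RecordPatternDemo and contains are hypothetical.

```java
import java.util.regex.Pattern;

class RecordPatternDemo {
    // Same alternatives as above: a PIC clause, a VALUE clause,
    // a blank record, or a COPY statement.
    private static final Pattern PATTERN =
        Pattern.compile("(\\sPIC)|(\\sVALUE)|(^ *$)|(\\sCOPY\\s)");

    // True when any of the alternatives occurs somewhere in the record.
    public static boolean contains(String irecord) {
        return PATTERN.matcher(irecord).find();
    }

    public static void main(String[] args) {
        String[] records = {
            "03 COURSE-TYPE PIC X(3).",
            "02 COURSE-ID.",
            "   "
        };
        for (String irecord : records) {
            System.out.println("[" + irecord + "] -> " + contains(irecord));
        }
    }
}
```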

About Edgardo Burin
Edgardo Burin works for ING Canada as a solution architect in integration projects using webMethods. He works in different projects integrating mainframe transactions, MQ services, and Oracle databases using webMethods. He has more than 10 years of experience managing infrastructure. His areas of expertise are in Oracle databases, integration, and service-oriented architecture.

YOUR FEEDBACK
Carlos Magno wrote: Thank you, fantastic, i was search about Mainframe and xml, but i need know about how to write cobol screen on the jsp language by running browser.
Peter Prager wrote: COBOL copybooks and other COBOL and C structures can be easily converted with *XML Thunder* from Canam Software. The logic to consume or create XML in COBOL or C languages can also be generated.
Caroline Williams wrote: The last page of this article mentions "Listing 1". There is no link to "Listing 1". I would like to see the java program. Thanks for the excellent article.
Binoy Bastin wrote: You only talked about java .Is it possible to do the same thing using .net ,also how do you writeback xml to mainframe in this way
Edgardo Burin wrote: Defining Mainframe Transaction's Signature with an XML Schema; How To Convert Cobol Metadata. Integrating mainframe applications into an SOA often carries the burden of dealing with metadata in the form of Cobol Copybooks. This metadata converted to an XML Schema format can be useful for a range of applications (from validation to creation of services). This article explains how to automate the conversion from Copybooks to XML Schema using regular expression logic.