Building XML Middleware Using OmniMark

XML is rapidly becoming the way applications communicate. Because it isn't one language but a means to create many languages, it can play many roles in application integration and data exchange. In this article I'll explore some of the varied roles that XML can play by developing a simple middleware application that serves up data from a database.

The application (see Listing 1) uses XML in three ways:

  1. Requests from clients are encoded in XML.
  2. Data sent back to a client is encoded in XML.
  3. The database itself contains data marked up in XML.

The application is written in OmniMark - a 12-year-old language with a long history in the SGML community - which became a free language in May 1999. It's based on the streaming programming model pioneered by OmniMark and subsequently adopted by SGML and XML translation languages like DSSSL and XSL. Streaming is also the principle at work in the SAX parser interface.

OmniMark is alone, though, in having developed the streaming model into a full, generalized programming language. The conventional approach to writing a program is to design a data structure, populate it, manipulate it and finally serialize it to create output; that is, it's a memory-centric approach. The streaming approach involves acting on the data directly as it streams, without creating data structures - it's an I/O-centric approach.

An OmniMark program is structured as a collection of rules, with different rule types used for different purposes:

  • The process rule fires when the application is started and contains the main processing logic.
  • Element rules fire when OmniMark's integrated XML parser finds an element.
  • Find rules are used to process textual data using a pattern-matching language similar in capabilities to regular expressions.

OmniMark provides an abstraction layer between program logic and data sources/destinations. All OmniMark programs operate on the program's current input and all output goes to the current output. You can attach any source or destination to current input and current output, locally or across a network. The program operates the same way on any data stream, independent of its source and destination.

Server Programming
A middleware application is a server program. Its main function is to receive requests from clients and respond appropriately. What makes the server middleware is that it requests information from yet another server to satisfy its clients' requests.

The two essential performance characteristics that every server program must have are:

  1. It must survive errors: If something goes wrong in processing a request, the server can't shut down. It must continue running so as to service the next request.
  2. It must return to a stable state after each request is completed: Whether or not the request was serviced successfully, the server should return to a steady state.

The basic anatomy of a server is simple:

  • The startup routine establishes the service and any resources it needs.
  • The request service loop receives requests and responds to them.
  • The shutdown routine cleans up any open resources and shuts down the server.

How to Establish a Service
OmniMark uses OMX (OmniMark extension) components to connect to external data sources. To establish a TCP service, I use the tcpServiceOpen function found in the TCP/IP library. The function returns an OMX variable that's a handle to a TCPService OMX component that manages the TCP service. The function is called in the initializer of the "service" variable:

local tcpService service
initial {tcpServiceOpen at port-number}

How to Receive Requests
Once the service is established, the program must wait for a connection attempt from a client. This is accomplished with the TCPServiceAcceptConnection function, which takes the OMX variable representing the service and waits for a connection. When a connection is made, it returns another OMX variable, which is a handle to a TCPConnection OMX component that will manage the connection:
local tcpConnection connection
initial {TCPServiceAcceptConnection service}

Establishing the Request and Response Streams
Now that I have a TCP/IP connection, I need a way to talk to it. To output data to a TCP/IP connection, I must attach an OmniMark stream to the connection; all data written to that stream will then go to the connection. I do this with the "TCPConnectionGetOutput" function, which takes the connection OMX variable as a parameter:

local stream reply
open reply as TCPConnectionGetOutput connection
protocol IOProtocolMultiPacket

The clause "protocol IOProtocolMultiPacket" is a second parameter to the TCPConnectionGetOutput function. It establishes the protocol to use for writing the data. Because a TCP/IP connection is a two-way communication channel, the sender can't signal the end of its data by closing the channel; an I/O protocol is required to establish when a message ends.

OmniMark provides support for most common protocols through the IOProtocol library. This program uses the MultiPacket protocol, which breaks up the message into packets and sends each packet prefixed by a network-long value specifying its size. The message ends with a zero-length packet.
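
The framing just described can be sketched in a few lines. This is a Python illustration, not OmniMark, and it rests on two assumptions that the article doesn't spell out: that a "network-long" is a 4-byte unsigned integer in big-endian byte order, and that the packet size is a free choice of the sender. The function names are hypothetical.

```python
import io
import struct


def write_multipacket(stream, message: bytes, packet_size: int = 1024) -> None:
    """Write message as size-prefixed packets, then a zero-length packet."""
    for start in range(0, len(message), packet_size):
        chunk = message[start:start + packet_size]
        # Assumed prefix format: 4-byte unsigned integer, network byte order.
        stream.write(struct.pack("!I", len(chunk)) + chunk)
    stream.write(struct.pack("!I", 0))  # zero-length packet ends the message


def read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes, or raise EOFError if the stream ends early."""
    data = b""
    while len(data) < count:
        piece = stream.read(count - len(data))
        if not piece:
            raise EOFError("stream ended in the middle of a packet")
        data += piece
    return data


def read_multipacket(stream) -> bytes:
    """Read packets until the zero-length terminator and return the message."""
    parts = []
    while True:
        (size,) = struct.unpack("!I", read_exact(stream, 4))
        if size == 0:
            return b"".join(parts)
        parts.append(read_exact(stream, size))
```

Because the end of the message is marked in-band, the reader knows when a request is complete while the connection stays open for the reply.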

Opening the "reply" stream with the TCP/IP connection attached creates a vehicle for sending output to the TCP/IP connection. To actually send output to the connection, I have to make "reply" the current output stream. I do this using the statement "using output as":

using output as reply

The "using output as" statement is a prefix to the "do" block and establishes the current output for all the code that executes within the block, including any functions or rules called within the block. I can output data to the TCP/IP connection with a simple "output" statement anywhere within the output scope established by this statement.

To receive data from the TCP/IP connection, I use the "tcpConnectionGetSource" function. It takes the connection OMX and protocol parameters - just like "tcpConnectionGetOutput" - and returns an OmniMark source that can be used by any of the OmniMark keywords that accept sources - in this case the "scan" keyword:

scan tcpConnectionGetSource connection
protocol IOProtocolMultiPacket

How to Survive Errors and Stay Running
The request service loop consists of a repeat loop. In OmniMark all repeat loops begin with "repeat" and end with "again." The key to surviving an error in the course of handling a request is to catch the error in time to repeat the loop. I accomplish this by placing a single statement at the end of the request service loop:

catch #program-error

OmniMark's catch and throw keywords provide robust structured exception handling that you can use for both flow control and error handling. A throw isn't a GOTO. A throw starts a systematic process in which program scopes are closed one by one, starting with the scope in which the throw occurs and ending with the one in which the catch occurs. Garbage collection is automatic.

"#program-error" is a built-in catch name that I use here as the line of last defense. Any program error, any uncaught throw, any error in an external system that is not caught and handled by another catch will be caught here. The catch will result in the current iteration of the loop being shut down and tidied up. Control will then return to the top of the loop, starting a new iteration.

How to Return to a Stable State After Each Request
The prescription for returning the server to a stable state is simple: keep all your variables local to the request service loop. That way, whether the loop ends normally or with an error, the variables will be destroyed, the garbage cleaned up and the next iteration will start with a clean slate.

At one place in this program I violate this rule. I use a global variable for the database connection ("db"), trading off a little robustness for the performance advantage of maintaining a permanent connection to the database. You can use catch and throw to detect problems with the database connection and recover from them, but that's outside the scope of this article.
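
The two rules from the last two sections, catching errors at the bottom of the loop and keeping state local to each iteration, can be sketched as follows. This is a Python analogue rather than OmniMark: the "serve" and "handle" names are hypothetical, a list of strings stands in for incoming connections, and a broad "except Exception" plays the role that "catch #program-error" plays in the article.

```python
from typing import Callable, Iterable, List


def serve(requests: Iterable[str], handle: Callable[[str], str]) -> List[str]:
    """Service every request; a failure in one request never stops the loop."""
    replies: List[str] = []
    for raw in requests:  # stands in for accepting one connection per iteration
        try:
            # Everything created here is local to this iteration, so it is
            # discarded whether the request succeeds or fails.
            reply = handle(raw)
            replies.append(reply)
        except Exception as exc:  # last line of defense for this request
            print(f"request failed, continuing: {exc!r}")
    return replies
```

If "handle" raises for one request, that iteration is abandoned and the loop moves on to the next request with nothing left over from the failed one.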

XML Processing
OmniMark's XML parser is fully integrated into the language. It doesn't need or use a DOM or SAX interface.

Once parsing is initiated, the parser fires markup rules for the various markup structures it encounters. The most common of the markup rules is the element rule. While SAX has separate events for the start and end of an element, OmniMark fires a single element rule for each element. Each element rule uses the parse continuation operator ("%c") to initiate parsing of the element's content. Element rules are thus fired hierarchically. Each rule is suspended while the element's content is parsed and resumes once the parsing of the content is complete.
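
To make the "one rule per element, suspended while its content is processed" idea concrete, here is a rough Python analogue. It is only a sketch: unlike OmniMark it builds a tree with the standard xml.etree.ElementTree module instead of streaming, and the "element" decorator, the "content" helper (standing in for "%c") and the bracketed output format are all hypothetical.

```python
import xml.etree.ElementTree as ET

RULES = {}


def element(name):
    """Register a function as the rule for elements with this name."""
    def register(func):
        RULES[name] = func
        return func
    return register


def content(node):
    """Rough stand-in for "%c": process the node's text and children in order."""
    out = [node.text or ""]
    for child in node:
        out.append(RULES[child.tag](child))  # the child's rule runs to completion
        out.append(child.tail or "")
    return "".join(out)


@element("request")
def request_rule(node):
    # Nothing to do at this level; just continue into the content.
    return content(node)


@element("product")
def product_rule(node):
    # Output placed around the point where the content falls.
    return "[product " + content(node) + "]"


def parse(xml_text):
    root = ET.fromstring(xml_text)
    return RULES[root.tag](root)
```

Calling parse('<request><product>42</product></request>') runs the "request" rule, which pauses inside content() while the "product" rule runs, then resumes.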

Parsing is initiated by the "do xml-parse" statement:

do xml-parse instance
with xml-dtds {"request"}
scan tcpConnectionGetSource connection
protocol IOProtocolMultiPacket
output "%c"

The "with" clause specifies the DTD to use. In this program the request DTD is precompiled in the start-up section.

The "scan" clause specifies the source from which to read the XML document. In this case it's the source returned by the tcpConnectionGetSource function that attaches the OmniMark source to the TCP/IP connection. What this means is that the XML document will be streamed directly from the TCP/IP connection into the XML parser.

"do xml-parse" is a block statement. Within that block the parse state has been established, but parsing isn't actually in progress. Parsing is started by the parse-continuation operator ("%c"), which is roughly equivalent in function to the XSL statement apply-templates. "%c" is a string escape sequence that allows you to easily express where you want the content of an element to fall in the output stream.

Parsing the Information Requests
My XML language for requests is simple. The root element is request and the request element can contain a single element representing any one of the request types. Each request-type element ("product-by-type" and so on) has the data content appropriate to the information being requested, usually a database key value. In effect, this DTD describes a simplified database query language that is specific to the particular database I'm accessing. The middleware program acts as an interpreter for that language, converting it into standard SQL queries and returning the results of those queries with XML encoding.

The parser is started in response to the "%c" in the "do xml-parse" block. When it finds the request element, it fires the element rule for "request" and pauses. The "request" element rule has no work to do, so it simply restarts the parser with "%c".

element "request"
output "%c"

Suppose the request is for a list of selected products. The request element will contain an element, "selected-products", whose data content will be a comma-separated list of product IDs. The element rule for "selected-products" begins like this:

element "selected-products"
local stream query initial
{ "SELECT ProductID, "
|| "ProductName, "
|| "ProductPrice "
|| "FROM Product "
|| "WHERE "
|| "ProductID IN (%c)" }

The "%c", which every element rule must contain, is tacked onto the end of the initializer for the variable "query". The data content of the "selected-products" element is streamed into the variable "query" and becomes part of the SQL statement that will be used to query the database. In effect, the list of product IDs has been streamed directly from the TCP/IP connection into the SQL statement.
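
Because the element content ends up inside the SQL text, it is worth seeing one way to build the same query when the requests can't be fully trusted. The sketch below is a variation, not the article's method: it is Python with the standard sqlite3 module rather than OmniMark with ODBC, it assumes product IDs are integers, and it adds an ORDER BY so the result order is predictable. The table and column names come from the query above; the function name is hypothetical.

```python
import sqlite3


def selected_products(db: sqlite3.Connection, id_list: str):
    """Look up products for a comma-separated list of IDs using bound parameters."""
    # int() rejects anything that is not a plain integer (assumed ID format).
    ids = [int(part) for part in id_list.split(",") if part.strip()]
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT ProductID, ProductName, ProductPrice "
        "FROM Product "
        "WHERE ProductID IN (" + placeholders + ") "
        "ORDER BY ProductID"
    )
    # The IDs travel as parameters; only the placeholders are in the SQL text.
    return db.execute(sql, ids).fetchall()
```

Only the number of placeholders depends on the request; the values themselves never become part of the statement.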

Database Access
OmniMark uses OMX components to communicate with all external data sources, so communication with a database uses a database OMX provided by the omdb library. The connection to the database is established when the variable "db" is initialized as part of the start-up routine:

global dbDatabase db initial {dbOpenODBC dsn}

The dbOpenODBC function takes a parameter that is the data source name (DSN) used by the ODBC driver manager to identify a database. It returns a database OMX variable.

Querying the Database
Once I have a database connection and a completed SQL query, I query the database with the dbQuery function:

dbQuery db sql query record rs

The dbQuery function takes three parameters: the OMX variable for the database, the SQL query - heralded by the word "sql" - and an OmniMark shelf named "rs" - heralded by the word "record."

A shelf is an OmniMark data structure. It's an associative array, meaning that items can be addressed either by position or by a textual key value. "dbField" is an OMX variable type for an OMX component representing a database field. The "dbQuery" function will populate the "rs" shelf with "dbField" OMX variables representing the fields of the current record. The names of the fields will become the keys of the shelf.

After executing the query, the program checks to see if any records were returned; if not, it throws "record-not-found":

throw record-not-found unless dbRecordExists rs

This throw is caught by the statement:

catch record-not-found
output '<response status="notfound"/>'

Throwing out of an element rule terminates the current parse. The code in the catch block then outputs the XML message:

<response status="notfound"/>

Since the "do xml-parse" block is within the output scope created by the statement "using output as reply", this output goes straight to the TCP/IP connection and to the client.

Figure 1 illustrates how data streams through the program. In this figure the top line shows the streaming of the request data from the TCP/IP port to the XML parser and into the SQL query. The query itself is passed as a function call to the database OMX, not streamed. The bottom line shows the streaming of the response data from the database to the find rules that escape markup characters in the text, the interpolation of the XML tagging by the program, and the streaming of the result to the TCP/IP port.

Building an XML Encoded Response
If the query does return records, the program outputs the "response" start tag with the status "ok" and then loops over the records and outputs the data surrounded by the appropriate XML tags. At the end of the loop it outputs the "response" end tag. All this happens in the output scope in which the parse was initiated, so the output all goes to the TCP/IP connection:

output '<response status="ok">%n'
repeat exit unless dbRecordExists current-record
output '<product>%n<id>'
|| dbFieldValue current-record{"ProductID"}
|| '</id>%n<name>'
submit dbFieldValue current-record{"ProductName"}
output '</name>%n</product>%n'
dbRecordMove current-record
output '</response>'

Sometimes it takes more than one SQL query to collect the information needed to construct a response. This is the case for the product-by-line and product-by-type requests, both of which return a description of the product type or product line followed by a list of products. Because there are two separate queries, either of which can fail, I buffer the output until both queries have succeeded.

To do this, I attach the stream "response-buffer" to a buffer and make it the current output scope for the duration of the two queries:

local stream response-buffer
open response-buffer as buffer
using output as response-buffer
... done
close response-buffer
output response-buffer

After the block governed by "using output as response-buffer," I close the stream and output it. The original output scope was restored when the block ended, so the output once again goes to the TCP/IP connection.
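
The same buffer-then-send pattern can be sketched in Python, with io.StringIO standing in for the OmniMark buffer. The function names are hypothetical, the two callables stand in for the two queries, and LookupError is an assumed stand-in for the "record-not-found" throw.

```python
import io


def build_response(describe, list_products) -> str:
    """Collect the whole response in memory; nothing is sent if a query fails."""
    buffer = io.StringIO()
    buffer.write('<response status="ok">')
    buffer.write(describe())        # first query; may raise LookupError
    buffer.write(list_products())   # second query; may raise LookupError
    buffer.write('</response>')
    return buffer.getvalue()


def respond(out, describe, list_products) -> None:
    """Send either the complete response or a clean not-found message."""
    try:
        out.write(build_response(describe, list_products))
    except LookupError:
        # The partial buffer is simply dropped; the client never sees it.
        out.write('<response status="notfound"/>')
```

Without the buffer, a failure in the second query would leave a half-written "ok" response already on the wire.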

Dealing with the Markup in the Database
The database contains XML markup in the description field. The markup is simple; a "description" element can contain paragraph ("p") elements, which can contain text or "prodref" elements. A "prodref" is a reference to another product in the database.

I deal with this markup by simple inclusion. One of the virtues of XML is that because of its linear nature and nested structure, the root element of one XML document can become an element in another document simply by dropping it in place. Because I control both the database and the server, I don't have to worry about namespace conflicts. In effect, the "description" language used in the database is just a subset of the "response" language used by the server.

Escaping the Markup Characters in the Data
The other database fields contain plain text data. Whenever you create XML from text data, you must escape the markup characters "<", ">" and "&" to prevent their being mistaken for markup by the parser receiving the XML. Naturally, OmniMark handles this in a streaming manner.

Rather than outputting the values of the fields directly, the program "submits" them. Submitted data is processed by find rules, which apply pattern-matching techniques to data streams. (The "dbFieldValue" function returns an OmniMark source, not a string, so the data is being streamed, not copied here.)

OmniMark supports a full pattern-matching language. However, this program requires only literal text matching. Here's the find rule for escaping the "<" character:

find "<"
output "&lt;"

This find rule looks for "<" in the streaming data. When it finds it, it removes the matched character from the stream and outputs the escape sequence "&lt;" in its place. All the find rules are active at once, so all the replacements are done in a single pass. There's no need to worry about the "&" inserted by this find rule being seen and replaced by the find rule for "&": the rule's output goes to the current output and isn't scanned again. All data not matched by a find rule will stream to the current output, which is still the output scope established for the parse: the TCP/IP connection.

Notice that the "description" field, since it's already in XML, isn't submitted for escaping.
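
The single-pass behavior is the important part, and it is easy to get wrong with chained replacements. As a Python illustration (not OmniMark; the names are hypothetical), a regular expression that matches any one of the three characters and looks up its replacement gives the same effect as having all the find rules active at once:

```python
import re

ESCAPES = {"<": "&lt;", ">": "&gt;", "&": "&amp;"}
MARKUP_CHARS = re.compile(r"[<>&]")


def escape_markup(text: str) -> str:
    """Escape <, > and & in one pass; inserted text is never rescanned."""
    return MARKUP_CHARS.sub(lambda match: ESCAPES[match.group(0)], text)
```

By contrast, replacing "<" first and "&" second in two separate passes would turn "<" into "&amp;lt;", because the second pass sees the "&" that the first pass inserted.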

Shutting Down the Server
To shut down the server, the program must exit the request service loop. The command to shut down takes the form of a request with a "die" element. The data content of the "die" element is a poison-pill string. The string received must match the program's poison pill or the request is ignored. If the poison pill matches, the "die" element rule throws to the "shut-down" catch, which is outside the request service loop. The program then runs to the end of the process rule and ends. Because OmniMark provides automatic garbage collection, no explicit cleanup of global resources (the TCP/IP service and the database connection) is required.
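
The shutdown path can be sketched the same way. In this Python analogue the "ShutDown" exception stands in for the throw to the "shut-down" catch, requests are simplified to (element name, content) pairs, and the function simply counts the requests it serviced; all of these are illustrative assumptions.

```python
class ShutDown(Exception):
    """Raised to leave the request loop; caught outside it."""


def run(requests, poison_pill: str) -> int:
    """Service requests until a "die" request carries the right poison pill."""
    served = 0
    try:
        for name, content in requests:
            if name == "die":
                if content == poison_pill:
                    raise ShutDown()  # unwinds out of the request loop
                continue              # wrong pill: ignore the request
            served += 1               # stands in for servicing the request
    except ShutDown:
        pass  # fall through to the end of the program
    return served
```

Because the catch sits outside the loop, the matching "die" request ends the loop, while a "die" request with the wrong string is skipped like any other ignored request.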

A Stub Client
I've provided a stub client (see Listing 2) so you can test the server. It makes a single product request. You can adapt it to test the other functions. The client translates the XML it receives into very basic HTML.

Sending a Request
To send a request to a server, a client must first open a connection to the server's port on the host machine. I do this using the "TCPConnectionOpen" function, which returns a "TCPConnection" OMX variable:

set connection to TCPConnectionOpen
on product-server-host
at product-server-port

Once I have a connection, I can send a request. Since the request is a single value, I use "set" rather than "using output as" and "output". As in the server, the attachment to the TCP/IP connection is provided by the "TCPConnectionGetOutput" function:

set TCPConnectionGetOutput connection
protocol IOProtocolMultiPacket
to '<request><product>'
|| product-id
|| '</product></request>'

Processing the Response
The response is streamed directly into the parser, just as in the server program. The only difference in the client is that the DTD (see Listing 3) is fed to the parser as text rather than being precompiled.

do xml-parse document
scan file dtd-file-name
|| TCPConnectionGetSource connection
protocol IOProtocolMultiPacket
output "%c"

The response is fed to the parser, which fires element rules as before. The element rules output simple HTML tagging in place of the XML tagging in the response.

The greatest virtue of XML is that it exploits the simplicity and universality of the common linear text stream. Streams are easy to create and easy to transmit; adding XML makes streams easy to interpret. This allows XML to serve many purposes in communication between applications. It allows you to build applications that are simple, elegant and easy to maintain.

Because it combines broad-based connectivity, a streaming programming model and an integrated parser, OmniMark is a good language for building the next generation of XML-enabled Internet applications.

XML Resource
OmniMark Technologies: www.omnimark.com
