How OpenKM's Technical Debt Decreased by 49% Through Code Refactoring

Initial Technical Debt of the project reduced from 84 to 42 days of remediation

A Technical Debt figure is worth nothing if no pragmatic action is taken in the code to control and reduce it. To illustrate Scertify's capability to automatically correct code defects that increase this unintended debt, we performed code refactoring on OpenKM, a Free/Libre document management system. The initial Technical Debt of the project has been reduced by 49.2%, from 84 days to 42 days. Here at Tocea, we call it the Debt Write-Off.

For this first Debt Write-Off, we have decided to perform the refactoring of OpenKM (6.2.1-DEV).

According to Wikipedia, OpenKM is a Free/Libre document management system that provides a web interface for managing arbitrary files. OpenKM is a great tool, but an audit of the code revealed some technical debt problems. That was a good opportunity to use Scertify and to be useful to an open-source community. The application consists mainly of 200K lines of Java code. There is also JavaScript, JSP, CSS and more, but we focus here on the Java code.

Technical debt before refactoring

Scertify Refactoring Assessment allows us to estimate the technical debt of the application. As you can see in screenshot #1, it is estimated at 84 days. This is the time needed to manually correct each error. This number only includes the time needed to make the change in the code; it does not include things like finding the file or understanding the problem.

Of these 84 days, 60 represent errors that can be automatically refactored, thus taking nearly zero effort to correct.

We can take a closer look at the possibilities of automation (screenshot #2). Not all rules are currently implemented in Scertify, but we are working on it. For this project, we chose 7 rules that seemed particularly interesting.

Rules used for the refactoring

Here's a presentation of the rules used to perform the refactoring of OpenKM.

  • AvoidPrintStackTrace

    This rule reports a violation when it finds code that catches an exception and prints its stack trace to the standard error output. A logging framework should be used instead, in order to improve the application's maintainability.
    The refactoring replaces the call to printStackTrace with a call to a logging framework. The rule can also declare the logger in the class and add the required imports. The rule can be configured to use the user's favorite framework.

    Here's the configuration used for OpenKM:

    • The logger call to use: "log.error({0}.getMessage(), {0})", where {0} is replaced by the exception.
    • Do not refactor calls of printStackTrace to other outputs (a file, a stream...)
    • Make the logger declaration when it's needed (i.e. log is not already declared in the class).
    • The logger declaration to use: "private static Logger log = LoggerFactory.getLogger({0}.class);"
    • The required imports: "org.slf4j.LoggerFactory,org.slf4j.Logger"

    Original code:

    catch (FileNotFoundException e) { e.printStackTrace(); }

    Refactored code:

    catch (FileNotFoundException e) { log.error(e.getMessage(), e); }
  • AddEmptyStringToConvert

    Using the concatenation of an empty string to convert a primitive type to a String is a bad practice. First of all, it makes the code less readable. It is also less efficient in most cases (the only case where the string concatenation is slightly better is when the primitive is final).

    Original code:

    UserActivity.log(session.getUserID(), "DELETE_PROCESS_DEFINITION", "" + processDefinitionId, null, null);

    Refactored code:

    UserActivity.log(session.getUserID(), "DELETE_PROCESS_DEFINITION", String.valueOf(processDefinitionId), null, null);
  • InefficientConstructorCall

    Calling the constructor of a wrapper type, like Integer, to convert a primitive type is a bad practice. It is less efficient than calling the static method valueOf.

    Original code:

    users.put(usersRead[i].getString(), new Integer(Permission.READ));

    Refactored code:

    users.put(usersRead[i].getString(), Integer.valueOf(Permission.READ));
  • IfElseStmtsMustUseBraces

    This rule finds if statements that don't use braces. The refactoring adds required braces.

  • PositionLiteralsFirstInComparisonsRefactor

    This rule checks that literals are in the first position in comparisons. The following code is a violation:

    Original code:

    if (action.equals("ruleList"))

    Refactored code:

    if ("ruleList".equals(action))

    The refactoring swaps the literal and the variable. This ensures that the comparison cannot throw a NullPointerException when the variable is null.

  • MethodArgumentCouldBeFinal

    This rule flags method arguments that could be declared final and are not. The final keyword is useful information for future readers of the code.

  • LocalVariableCouldBeFinal

    The purpose is the same as the previous rule, except that it applies to local variables rather than arguments. These two rules are not critical, but since they have a huge number of violations, it is useful to get rid of them quickly with automatic refactoring.
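To make the behaviour behind three of these rules concrete, here is a minimal Java sketch. It is not taken from OpenKM or from Scertify; the class and method names are hypothetical and only illustrate what the rules protect against. Parameters are declared final, in the spirit of MethodArgumentCouldBeFinal.

```java
class RefactoringRulesDemo {

    // AddEmptyStringToConvert: the flagged form and its replacement
    // produce the same text.
    static String viaConcatenation(final long id) {
        return "" + id;
    }

    static String viaValueOf(final long id) {
        return String.valueOf(id);
    }

    // InefficientConstructorCall: Integer.valueOf is specified to cache
    // values from -128 to 127, so repeated calls return the same object
    // instead of allocating a new one each time.
    static boolean valueOfReusesInstance(final int smallValue) {
        final Integer first = Integer.valueOf(smallValue);
        final Integer second = Integer.valueOf(smallValue);
        return first == second; // identity comparison on purpose
    }

    // PositionLiteralsFirstInComparisons: literal-first is null-safe.
    static boolean literalFirst(final String action) {
        return "ruleList".equals(action);
    }

    // The original form: throws if action is null.
    static boolean variableFirst(final String action) {
        return action.equals("ruleList");
    }

    static boolean variableFirstThrowsOnNull() {
        try {
            variableFirst(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(viaConcatenation(42L).equals(viaValueOf(42L)));
        System.out.println(valueOfReusesInstance(100));
        System.out.println(literalFirst(null));
        System.out.println(variableFirstThrowsOnNull());
    }
}
```

Running main prints true, true, false, true: both conversions agree, valueOf reuses a cached instance for small values, the literal-first comparison simply returns false for null, and the variable-first comparison throws.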

We are now ready to perform the refactoring with Scertify.

The refactoring process

The refactoring process consists of two steps:

  1. Configure an XML rule repository: The first step is crucial. As we have seen in the previous section, some rules need to be configured to be useful. However, it shouldn't take more than half an hour.
  2. Run Scertify to perform the refactoring: The second step is just a command line invocation, where you specify the project to refactor and the rule repository to use.
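As an illustration of what the rule configuration expresses, the {0} placeholders in the AvoidPrintStackTrace settings behave like standard Java message-format arguments. The sketch below expands them with java.text.MessageFormat. This is an assumption made for illustration only: the article does not say how Scertify performs the substitution, and the class name LoggerTemplateDemo and the sample class name MyServlet are hypothetical.

```java
import java.text.MessageFormat;

class LoggerTemplateDemo {

    // Templates copied from the OpenKM configuration of AvoidPrintStackTrace.
    static final String CALL_TEMPLATE = "log.error({0}.getMessage(), {0})";
    static final String DECLARATION_TEMPLATE =
            "private static Logger log = LoggerFactory.getLogger({0}.class);";

    // Replaces every {0} in the template with the given name.
    // (Assumed mechanism; Scertify's own implementation may differ.)
    static String expand(final String template, final String name) {
        return MessageFormat.format(template, name);
    }

    public static void main(String[] args) {
        // The exception variable name goes into the logger call.
        System.out.println(expand(CALL_TEMPLATE, "e"));
        // A hypothetical class name goes into the logger declaration.
        System.out.println(expand(DECLARATION_TEMPLATE, "MyServlet"));
    }
}
```

With the exception variable e, the first template expands to log.error(e.getMessage(), e), which matches the refactored code shown for the rule.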

For this project of 200K lines of code, the refactoring took 2 minutes. You can check the process on a smaller project in this video tutorial.

Technical debt after refactoring

Screenshot #3 is the analysis of the refactored project by Scertify Refactoring Assessment. As you can see, 24 days of technical debt have been erased.

Screenshots #4 and #5 show the difference in violations in Sonar between the original and the refactored project.

Here's the number of violations that have been corrected for each rule (*) :

  • AddEmptyStringToConvert: 232
  • AvoidPrintStackTrace: 70
  • InefficientConstructorCall: 43
  • IfElseStmtsMustUseBraces: 411
  • PositionLiteralsFirstInComparisons: 358
  • MethodArgumentCouldBeFinal: 8848
  • LocalVariableCouldBeFinal: 7622
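To put these counts in proportion, the short Java sketch below adds them up and computes the share contributed by the two "could be final" rules. The class name ViolationTotals is hypothetical; the numbers are the ones listed above.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class ViolationTotals {

    // Corrected-violation counts copied from the list above.
    static Map<String, Integer> correctedViolations() {
        final Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("AddEmptyStringToConvert", 232);
        counts.put("AvoidPrintStackTrace", 70);
        counts.put("InefficientConstructorCall", 43);
        counts.put("IfElseStmtsMustUseBraces", 411);
        counts.put("PositionLiteralsFirstInComparisons", 358);
        counts.put("MethodArgumentCouldBeFinal", 8848);
        counts.put("LocalVariableCouldBeFinal", 7622);
        return counts;
    }

    // Sum of all corrected violations.
    static int total(final Map<String, Integer> counts) {
        int sum = 0;
        for (final int count : counts.values()) {
            sum += count;
        }
        return sum;
    }

    // Violations fixed by the two "could be final" rules.
    static int finalKeywordFixes(final Map<String, Integer> counts) {
        return counts.get("MethodArgumentCouldBeFinal")
                + counts.get("LocalVariableCouldBeFinal");
    }

    public static void main(String[] args) {
        final Map<String, Integer> counts = correctedViolations();
        final int sum = total(counts);
        final int finalFixes = finalKeywordFixes(counts);
        System.out.println("Total corrected: " + sum);
        System.out.println("Final-keyword rules: " + finalFixes);
        System.out.println(String.format(Locale.ROOT, "Share: %.1f%%", 100.0 * finalFixes / sum));
    }
}
```

The listed counts add up to 17,584 corrections, of which 16,470 come from the two final-keyword rules. The remaining 1,114 are spread over the other five rules.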

To sum up, with Scertify we've been able to quickly correct a huge number of errors. Some of them are not critical (like MethodArgumentCouldBeFinal), but we've also been able to refactor more involved errors like AvoidPrintStackTrace and AddEmptyStringToConvert.

Download the source files

(*) If you do the math, you'll see that more errors have been corrected than are listed here. This is due to side effects of the refactoring (correcting one rule can remove violations of other rules) and also because we manually corrected a few things in the code.

Submit your project for a Debt Write-Off

If you're interested in submitting your project for the next Debt Write-Off, or just want to give some feedback, please contact us on Twitter: @Scertify

More Stories By Michael Muller

Michael Muller, a Marketing Manager at Tocea, has 10+ years of experience as a Marketing and Communication Manager. He specializes in technology and innovative companies. He is executive editor at http://dsisionnel.com, a French IT magazine and the creator of http://d8p.it, a cool URL shortener. Dad of two kids.
