Containers Expo Blog: Blog Feed Post

The Many Places You Can Go with SSD Caching Software

SSDs are a disruptive technology in that they are clearly changing the enterprise storage market

Disruptive technology is a term used to describe an idea or invention that disrupts an existing market, often completely displacing an earlier technology. Sometimes disruption is great (e.g., digital cameras and cell phones), and sometimes it is not so great (e.g., laser video disks). SSDs are a disruptive technology in that they are clearly changing the enterprise storage market. The jury is still out on just how disruptive SSDs will be; as is the case with many disruptive technologies, it is going to take some time to figure out exactly what to do with them. But one clear leading use of SSDs is as cache, which takes advantage of their very high speeds while minimizing the cost and hassle of implementation.

One of the key design decisions during the early development of VeloBit HyperCache was figuring out the optimal location of an SSD-based cache. By “location,” I am referring to the placement of the SSD within a computer’s storage architecture. To understand VeloBit’s place in the storage stack, it is first necessary to understand the basic layout of a typical computer storage system.

Figure 1 shows a simplified IT system architecture model where the application software sits atop the operating system (OS) and corresponding file system. The file system interfaces to HDDs and SSDs using a block device driver. The block device driver performs the actual IO with the HDDs and SSDs. Hard disks in enterprise environments typically sit behind storage controllers, which contain their own (typically volatile) cache, along with an array of disks. SSDs can also sit behind these storage controllers, although the VeloBit model allows you to directly attach an SSD to your server, without the need to purchase expensive specialty SSD arrays.

Additionally, in typical deployments, file systems sit atop a volume virtualization layer. In Linux, this is typically LVM, the Logical Volume Manager. This technology gives system administrators the flexibility of storage virtualization (such as RAID, snapshots, thin provisioning, and disk expansion) without incurring the added cost of SAN-side virtualization platforms. Finally, it’s possible to deploy SSD caching as a part of this virtualization layer, as will be discussed later in this article.

Figure 1: Simplified IT System Architecture

Back to deploying SSD as cache: there are two components to any SSD caching system: (1) the SSD itself, and (2) SSD caching software. The SSD can be installed either in the server directly or inside the storage array. SSD caching software is more flexible, and can be installed in several different locations:

  1. In the storage controller
  2. Between the volume virtualization layer and the block device layer
  3. As a part of the volume virtualization layer itself
  4. On top of the file system

These locations are marked on the diagram in Figure 1 with the corresponding circled numbers.

Let’s talk about the pros and cons of each location.

SSD Caching Software In The Storage Controller
Location 1 in Figure 1 shows the SSD caching software residing in the storage controller.  These controllers contain dedicated processors to manage all IO operations to the storage array, and algorithms can be implemented to determine what data should reside on the slower, higher-capacity disks, and what data should be sent to SSD for high-speed access. This solution:
•    Is easy to install, assuming you’re already purchasing a storage array
•    Provides very high performance
•    Can be incredibly expensive, since it comes along with an enterprise storage system
•    Is usually hardware dependent and typically vendor specific (results in vendor lock-in)

Examples of this solution would be SSD cards from High Point and LSI.
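
To make the idea of a controller deciding "what data should be sent to SSD" more concrete, here is a minimal Python sketch of one possible placement policy: count accesses per block and promote a block to SSD once it becomes hot. The class name, the threshold, and the eviction rule are all assumptions made for illustration; real storage controllers use their own, typically proprietary, algorithms.

```python
from collections import Counter


class TieringPolicy:
    """Hypothetical hot/cold placement policy (illustration only)."""

    def __init__(self, ssd_capacity_blocks, promote_threshold=3):
        self.ssd_capacity_blocks = ssd_capacity_blocks
        self.promote_threshold = promote_threshold  # assumed tuning knob
        self.access_counts = Counter()
        self.on_ssd = set()

    def record_access(self, block):
        """Count an access; return the tier the block lives on afterwards."""
        self.access_counts[block] += 1
        if (block not in self.on_ssd
                and self.access_counts[block] >= self.promote_threshold):
            self._promote(block)
        return "ssd" if block in self.on_ssd else "hdd"

    def _promote(self, block):
        if len(self.on_ssd) >= self.ssd_capacity_blocks:
            # SSD is full: find the least-accessed block currently on SSD.
            coldest = min(self.on_ssd, key=lambda b: self.access_counts[b])
            if self.access_counts[coldest] >= self.access_counts[block]:
                return  # newcomer is not hotter than anything on SSD
            self.on_ssd.remove(coldest)
        self.on_ssd.add(block)
```

With a one-block SSD and a threshold of two, a block moves to SSD on its second access and is displaced only when another block has been accessed more often.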

SSD Caching Software Above the Block Layer
A device driver is typically software developed to control specific hardware at a very low level. In this case, I am grouping the SSD in with SSD caching software and calling the whole thing a device driver because the combination of the SSD and SSD caching software acts as a transparent device driver for an application accessing primary storage. This is shown as the dashed line in location 2 in Figure 1. The SSD/caching software combination is:
•    Easy to install – this implementation is completely transparent to both the lower-level storage and all file systems and applications.
•    High performance
•    Very hard to develop – being such a low-level driver requires intimate knowledge of the operating system and block device drivers.
•    Hardware independent – since there’s no direct interaction with storage (storage access is abstracted by the block device driver), this solution works with any primary storage and SSD.
•    Application independent – by inserting itself just above the block device driver, this type of SSD cache requires no file system or application changes.

VeloBit HyperCache SSD caching software is an example of this solution.
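
The transparency described in this section can be sketched in a few lines of Python: a cache that exposes the same read/write interface as the storage beneath it, so callers above it need no changes. This is only an illustration and not how VeloBit HyperCache is implemented; the class names, the LRU eviction, the write-through behavior, and the 512-byte block size are all assumptions.

```python
from collections import OrderedDict


class DictBackingStore:
    """Stand-in for slow primary storage (hypothetical, for illustration)."""

    def __init__(self):
        self.blocks = {}
        self.reads = 0  # counts how often the slow device is actually read

    def read_block(self, lba):
        self.reads += 1
        return self.blocks.get(lba, b"\x00" * 512)

    def write_block(self, lba, data):
        self.blocks[lba] = data


class BlockCache:
    """LRU, write-through cache with the same interface as the store below."""

    def __init__(self, backing, capacity_blocks):
        self.backing = backing
        self.capacity = capacity_blocks
        self.cache = OrderedDict()  # plays the role of the SSD
        self.hits = 0
        self.misses = 0

    def read_block(self, lba):
        if lba in self.cache:
            self.cache.move_to_end(lba)  # mark as most recently used
            self.hits += 1
            return self.cache[lba]
        self.misses += 1
        data = self.backing.read_block(lba)
        self._insert(lba, data)
        return data

    def write_block(self, lba, data):
        self.backing.write_block(lba, data)  # write-through to primary storage
        self._insert(lba, data)

    def _insert(self, lba, data):
        self.cache[lba] = data
        self.cache.move_to_end(lba)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used
```

Because BlockCache and DictBackingStore share the read_block/write_block interface, code written against the store can be handed the cache instead, which is the property that makes this location application independent.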

SSD Cache In the Volume Virtualization Layer
The use of SSD caching software at location 3 in Figure 1 means the SSD caching software works inside the volume virtualization layer. This requires changes in the existing hard drive and SSD map configuration. SSD caching software at this location has the following characteristics:
•    Moderate performance – performance can be limited by the virtualization layer itself.
•    Easy to develop – this caching method uses already-existing tools, inside the virtualization layer.
•    Very difficult to install – since the cache is built as a new virtual volume, your entire volume management needs to change to use the cache.
•    Hardware independent

FlashCache from Facebook is an example of this solution.
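
The installation difficulty noted above comes from the cache being a new virtual volume: everything that used the original volume must be repointed at the cached one. The Python sketch below models that remapping with plain dictionaries. It does not use LVM or FlashCache commands, and the names Volume, add_cache_volume, and the "_cached" suffix are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    name: str
    devices: tuple  # names of the underlying devices or volumes


def add_cache_volume(mounts, volumes, origin_name, ssd_device):
    """Return new (mounts, volumes) with a cached volume layered over origin.

    mounts maps mount point -> volume name; volumes maps name -> Volume.
    """
    origin = volumes[origin_name]
    cached = Volume(name=origin_name + "_cached",
                    devices=(origin.name, ssd_device))
    new_volumes = dict(volumes)
    new_volumes[cached.name] = cached
    # Every consumer of the origin volume has to switch to the cached volume.
    new_mounts = {
        mount_point: (cached.name if vol == origin_name else vol)
        for mount_point, vol in mounts.items()
    }
    return new_mounts, new_volumes
```

In this model the cache is not picked up automatically: each mount that referenced the original volume changes, which mirrors the point that the volume management configuration has to be reworked.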

SSD Caching Software On Top Of The File System
The use of SSD caching software at location 4 in Figure 1 means the SSD caching software works at a higher level – the file system level. Because the cache sits above, or inside, the file system, a user can specify precisely which files (and by association, applications) to cache. SSD caching software at this location has the following characteristics:
•    Low performance
•    Very easy to develop
•    Very easy to install
•    Hardware independent
•    Environment specific – to achieve maximum possible performance, many databases bypass the file system entirely; file system-based caching won’t work in these environments. Additionally, while most Windows installations use NTFS, many file systems exist for Linux, and it isn’t practical to support all available platforms.

CacheWorks from Nevex is an example of this solution.
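
The per-file selectivity described in this section could look roughly like the following Python sketch, in which the user supplies path patterns and only matching files are kept in the cache. This is an assumption-laden illustration, not how CacheWorks works; FileCache, its read_from_disk callback, and the example patterns are hypothetical.

```python
from fnmatch import fnmatch


class FileCache:
    """Caches whole files whose paths match user-chosen patterns."""

    def __init__(self, read_from_disk, patterns):
        self.read_from_disk = read_from_disk  # callable: path -> bytes
        self.patterns = list(patterns)        # e.g. ["/db/*.dat"]
        self.cached = {}                      # plays the role of the SSD

    def should_cache(self, path):
        return any(fnmatch(path, pattern) for pattern in self.patterns)

    def read(self, path):
        if path in self.cached:
            return self.cached[path]
        data = self.read_from_disk(path)
        if self.should_cache(path):
            self.cached[path] = data
        return data
```

Files outside the chosen patterns always go to the slower storage, so the administrator controls exactly which applications benefit from the SSD.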

The table below summarizes the various installations of SSD caching software discussed above. If vendor lock-in is not a concern, running SSD caching software in the storage controller offers the best combination of features and performance. However, vendor lock-in is expensive and limits product choices. Using SSD caching software in conjunction with the SSD as a device driver for the HDD offers all the benefits of installing in the storage controller without the problem of vendor lock-in.

Table: SSD Caching Software Locations

More Stories By Peter Velikin

Peter Velikin has 12 years of experience creating new markets and commercializing products in multiple high tech industries. Prior to VeloBit, he was VP Marketing at Zmags, a SaaS-based digital content platform for e-commerce and mobile devices, where he managed all aspects of marketing, product management, and business development. Prior to that, Peter was Director of Product and Market Strategy at PTC, responsible for PTC’s publishing, content management, and services solutions. Prior to PTC, Peter was at EMC Corporation, where he held roles in product management, business development, and engineering program management.

Peter has an MS in Electrical Engineering from Boston University and an MBA from Harvard Business School.
