To Heck with 'Big Data,' 'Little Data' Is the Problem Most Face

'Big Data' gets the press, but 'little data' is the big problem

"Big data" gets all the press - but for the vast majority of people who work with data, it's the proliferation of "little data" that impacts us the most. What do I mean by little data?  I'm referring to the proliferation of various SaaS and Cloud-based applications, on-premises applications, databases, spreadsheets, log files, data files and so forth. Many organizations are plagued with multiple instances of the same applications or multiple applications from different vendors that do essentially the same thing.    These are the applications and data that run today's enterprise - and they're a mess.

Hardly a week goes by without some major vendor issuing a press release about unlocking the value in the mountains of structured and unstructured data that companies love to accumulate. For most of us, though, it's not getting all that value out of the petabytes that causes us heartburn - it's getting answers out of the megabytes or gigabytes that are distributed across handfuls, dozens or even hundreds of unintegrated systems, applications and data sources.

As I mentioned recently on ebizQ's Integration Edge, the average enterprise uses at least 397 Cloud/SaaS applications in addition to all of the on-premises applications in play. Add to that the various data stores (for example, SQL databases), and it's not unrealistic to say that a typical enterprise has around 1,000 different data-related systems of one sort or another. Apart from the concerns for security, compliance and backup/recovery, one obvious question should stand out: how can I "get value" out of all that data - data locked up in all those different locations and different formats?

Traditionally these types of problems were solved with DBAs, programmers and business analysts (with liberal amounts of "money" and "time" tossed in).  This approach works.  It's time-tested.  It's also expensive, and not particularly scalable or flexible.  Sometimes it can take years to actually get a working solution.

Not every organization has the luxury of taking the "traditional approach" to solving the little data problem.  With the pressure to deliver results faster, better and less expensively, some businesses have found new and innovative ways to get the value they need from this sea of disparate data.

For most of us, a few systems - perhaps a dozen - constitute this disparate data mess. But what would it be like to have to make sense of thousands of different data sources with different data semantics? And to know that "if things go well", there might be twice as many in a year or two? I recently met up with such a person - and it's quite a compelling story.

This past week, I was fortunate enough to have coffee and "talk about data issues" for a few hours with Jason Haskins - a Data Architect at Alchemy Systems, a rapidly growing international company that delivers innovative technologies and services for the global food industry that increase productivity, ensure regulatory compliance, foster safe working environments, and produce quality products.  In short, they help ensure the safety and quality of our food supply chain.

I've met quite a few data architects in my day, but Jason actually is an architect - he has two architecture degrees, including graduate work at Columbia University and a Master's from the University of Texas at Austin. Talking data architecture with an architect (in the A.I.A. sense of the word) - that's a new one for me. What made it compelling was the way that Jason drew parallels between the challenges of architecture (especially workplace-design architecture) and data - and how his training as a design architect taught him to examine highly complex systems at the systemic level, with usability, flexibility and scalability always at the top of the stack.

Jason's rather unusual background proved valuable when he inherited a wildly disparate and rapidly growing data infrastructure featuring the dreaded "A times B times C times D" problem - 500 clients with 2,000 installations.  Further complicating the situation - many of these installations require customized Alchemy Systems solutions and data models to support multiple product lines, multiple market segments and geographies that can span multiple countries and many regulatory environments.

Because of this rapid growth, Alchemy Systems found itself in a situation where it was simply unable to get value from all of this disparate data.  This was not a "big data" problem - it was a deluge of little data.

Jason's architectural training led him to propose a "meta" layer above the Alchemy Systems applications and data - something that would not interfere with existing data models, would meet the needs of Alchemy's customer success managers and would provide the kind of flexibility and scalability needed to support Alchemy's rapid growth - essentially a software bridge between the customer, Alchemy's development team and the customer-success group at Alchemy. One of Jason's key design requirements was to provide a level of abstraction across the different data sets with no loss of resolution - a rather challenging goal, as "abstraction" and "resolution" are often at odds with each other.
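To make the "abstraction without loss of resolution" idea concrete, here is a minimal Python sketch. It is not how Alchemy's layer or Flux works - the article doesn't describe either at that level. The class names, source names and field names (`MetaLayer`, `plant_a`, `courses_done` and so on) are all hypothetical, invented for illustration. The sketch assumes each installation exposes records as simple dictionaries with its own field names; the layer maps them onto shared names while keeping every original record untouched alongside the unified view.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass(frozen=True)
class UnifiedRecord:
    """One source record seen through the shared vocabulary."""
    source: str
    fields: Dict[str, Any]  # abstraction: canonical field names
    raw: Dict[str, Any]     # resolution: the original record, unmodified


class MetaLayer:
    """Hypothetical mapping layer that sits above existing data models."""

    def __init__(self) -> None:
        # source name -> {canonical field name: source-specific field name}
        self._mappings: Dict[str, Dict[str, str]] = {}

    def register(self, source: str, mapping: Dict[str, str]) -> None:
        """Declare how one source's fields map to the canonical names."""
        self._mappings[source] = dict(mapping)

    def unify(self, source: str, record: Dict[str, Any]) -> UnifiedRecord:
        """Build the canonical view without touching the source record."""
        mapping = self._mappings[source]
        fields = {canon: record.get(src) for canon, src in mapping.items()}
        return UnifiedRecord(source=source, fields=fields, raw=dict(record))


def completion_rate_by_client(records: Iterable[UnifiedRecord]) -> Dict[str, float]:
    """Example dashboard metric computed only from canonical fields."""
    totals: Dict[str, tuple] = {}
    for rec in records:
        client = rec.fields["client"]
        done, assigned = totals.get(client, (0, 0))
        totals[client] = (done + rec.fields["completed"],
                          assigned + rec.fields["assigned"])
    return {c: done / assigned for c, (done, assigned) in totals.items() if assigned}


# Two made-up installations that describe the same thing differently.
layer = MetaLayer()
layer.register("plant_a", {"client": "customer_name",
                           "completed": "courses_done",
                           "assigned": "courses_total"})
layer.register("plant_b", {"client": "acct",
                           "completed": "finished",
                           "assigned": "required"})

unified = [
    layer.unify("plant_a", {"customer_name": "Acme Foods", "courses_done": 8,
                            "courses_total": 10, "shift": "night"}),
    layer.unify("plant_b", {"acct": "Acme Foods", "finished": 1,
                            "required": 10, "region": "EU"}),
]
rates = completion_rate_by_client(unified)
```

The source-specific details (`shift`, `region`) never appear in the canonical view, yet they remain available through `raw` for anyone who needs to drill down. Adding a new installation means registering one more mapping rather than changing any existing data model.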

There are commercially available products that Jason might have turned to in order to solve the data mapping problem that he faced. But Jason's belief was that "usability" extended far beyond just a data mapping layer - he wanted to deliver an integrated solution that united the customer data and also provided data visualization capabilities to the customer managers at Alchemy. And it was his belief that Alchemy would be best served by a product that handled both the data management and the data visualization in a single product stack.

What Jason found was that there are some products out there that integrate and unite data very well.  And there are other products out there that do data visualization very well.  But finding a single product that did both of those well turned out to be a challenge.

At South by Southwest Interactive, after attending a session on data visualization and integration, Jason got involved in a discussion with Gaute Solaas - an Austin-based technologist/CEO whose company IQumulus was developing a Cloud-based data management technology called Flux.

When I spoke with Gaute, he reflected on his interactions with Jason: "The more I spoke with Jason about the problems he was facing, the more I realized that our new product needed to solve the data visualization problem as well as the data management problem in a single Cloud-based product that provides a business intelligence solution for large quantities and varieties of small data. So we worked with Alchemy Systems on the product requirements and quickly delivered the enhancements they needed for a pilot project."

Jason drew up plans for an Account Management Dashboard pilot project using Flux that would allow customer managers to view various important success indicators and statistics for their clients, and he was able to deliver the project in 30 days, an impressive feat. I asked Jason a top-of-mind question: was he nervous about using a new software product? He replied, "I found something flexible and it fit into the paradigm I was working with. Gaute and his company share my belief that the meaningful integration and proper utilization of numerous smaller and distributed data sets is a problem currently not adequately addressed by existing products."

Gaute added some additional perspective, "the real challenge for most organizations is not only managing the various distributed data sources, but also enabling the productive presentation of those data sets to the relevant stakeholders across the enterprise.  Finding ways to do cost-effective data aggregation and presentation in an ever-changing environment is a very challenging thing - it's what we set out to do with the Flux platform".

When asked about the future of data at Alchemy, Jason declined to mention any specifics - but hinted at interesting things to come - "we've created a neutral and adaptable layer with Flux.  It's designed for flexibility and scalability - we can take it as far as we want to".

More Stories By Hollis Tibbetts

Hollis Tibbetts, or @SoftwareHollis as his 50,000+ followers know him on Twitter, is listed on various “top 100 expert lists” for a variety of topics, ranging from Cloud to Technology Marketing. By day, Hollis is Evangelist & Software Technology Director at Dell Software. By night and on weekends he is a commentator, speaker and all-round communicator about Software, Data and Cloud in their myriad aspects. You can also reach Hollis on LinkedIn – linkedin.com/in/SoftwareHollis. His latest online venture is OnlineBackupNews - a free reference site to help organizations protect their data, applications and systems from threats. Every year, IT downtime costs $26.5 billion in lost revenue. Even with such high costs, 56% of enterprises in North America and 30% in Europe don’t have a good disaster recovery plan. Online Backup News aims to make sure you have the news and tips needed to keep your IT costs down and your information safe by providing best practices, technology insights, strategies, real-world examples and various tips and techniques from a variety of industry experts.

Hollis is a regularly featured blogger at ebizQ, a venue focused on enterprise technologies, with over 100,000 subscribers. He is also an author on Social Media Today "The World's Best Thinkers on Social Media", and maintains a blog focused on protecting data: Online Backup News.
He tweets actively as @SoftwareHollis

Additional information is available at HollisTibbetts.com

All opinions expressed in the author's articles are his own personal opinions vs. those of his employer.
