Aggregated Data Dilemma

Valuable performance and behavioral nuances can be buried in the aggregated data

Okay, I am weird (tell me something that I don’t know, say most of my friends). For Christmas I wanted a Nike Apple Watch to go with my existing Fitbit and Garmin fitness trackers (I look sort of like a cyborg in the photo below…which is always cool).

While I was intrigued by the ability to do all sorts of cool things on the Apple Watch (like take a phone call and talk into my wrist watch like Dick Tracy), the thing that most intrigued me was the ability to buy third-party apps that could yield detailed exercise and health data.  I was hoping that this detailed exercise and health data could help me understand what effect particular behaviors or activities (or lack of particular behaviors and activities) were having on my overall health.

Why is this important to me?  You can thank articles like “Unexpected Heart Attack Triggers” for my health and exercise anxiety.  The article highlighted several things that can trigger a heart attack including:

  • Lack of sleep (definitely an issue, especially when I’m traveling so much)
  • Migraine Headaches (how can you work in technology and not have headaches)
  • Cold Weather (need to find more clients in warmer weather)
  • Big, Heavy Meals (with the exception of Chipotle, right?)
  • Getting Out of Bed in the Morning (see, I knew that was a big danger!!)
  • Alcohol (just like to drink a beer now and then)
  • Coffee (I drink Chai Tea Lattes, that’s technically not coffee, and I know that I shouldn’t admit that I drink Chai Tea Lattes)

So many items on that list could trigger a heart attack, and I enjoy many of the things on that list (like sleeping and eating and the occasional beer). Consequently, I thought I’d put my data science experience to work to monitor my exercise and diet behaviors and predict potential health outcomes.

Personal Fitness Analytics
I tested the downloadable data from each of the three devices. The Fitbit offered the easiest way to download my fitness data (and I have TONS of useful fitness and diet tracking suggestions if anyone at Fitbit, Garmin or Apple ever reads this blog!!). The problem with the fitness data is that I can only get daily-level data (see Table 1).

Table 1:  Daily Fitness Tracking Data

I can add more external data to the aggregated fitness data (e.g., days of the week, days when I travel, how much I travel on those travel days) to come up with some simple plots.

For example, Figure 2 shows a visual correlation between the calories that I burn per step and the days that I travel. My assumption is that I burn more calories per step when I am doing something that requires more exertion (like running or climbing steps), so it makes sense that I burn fewer calories per step on days when I am traveling, because I have fewer opportunities for highly exertive activities.

Figure 2:  How Many Calories I Burn Per Step When Traveling
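
As a rough sketch of the comparison behind Figure 2, the snippet below computes calories burned per step for each day and averages it separately for travel and non-travel days. The record layout, field names and numbers are all made up for illustration; they are not taken from the Fitbit export shown in Table 1.

```python
from statistics import mean

# Hypothetical daily records (illustrative values only, not real tracker data).
daily = [
    {"date": "day 1", "steps": 10000, "calories": 3000, "travel": False},
    {"date": "day 2", "steps": 8000,  "calories": 2800, "travel": False},
    {"date": "day 3", "steps": 12000, "calories": 2400, "travel": True},
    {"date": "day 4", "steps": 10000, "calories": 2500, "travel": True},
]


def calories_per_step(day):
    """Calories burned per step for a single daily record."""
    return day["calories"] / day["steps"]


def average_by_travel(records):
    """Mean calories per step, keyed by the travel flag (True/False)."""
    groups = {True: [], False: []}
    for day in records:
        if day["steps"] > 0:  # skip days with no recorded steps
            groups[day["travel"]].append(calories_per_step(day))
    return {flag: mean(values) for flag, values in groups.items() if values}


averages = average_by_travel(daily)
print("travel days:    ", averages[True])
print("non-travel days:", averages[False])
```

With these invented numbers the travel-day average comes out lower, which is the pattern described above. With a real export the direction and size of the gap would have to be checked against the data.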

While this information is “interesting,” unfortunately, data at the aggregated daily level is not actionable.  If I had more detailed or granular fitness data, I’d like to chart what happens to my heart rate (and related stress levels):

  • During an airplane flight
  • When racing through an airport to catch a connecting flight
  • When waking up very early in the morning while traveling
  • Immediately after eating a large meal
  • While I’m doing my taxes (I hate doing my taxes)

The problem is that the data provided by my fitness band is aggregated to a level that is not actionable. If I had my fitness data at 5- or 10-minute intervals, then I could more easily spot unusual health outcomes and determine (and eventually predict?) what behaviors (e.g., flying in an airplane, eating large meals, heavy exercise exertion, waking up extremely early) might be causing health concerns.

Power of Granular Data
Big Data and data science are all about granular data because valuable performance and behavioral nuances can be buried in the aggregated data.  For example, the chart in Figure 3 shows how additional performance nuances are being uncovered as we transition from a 5-minute to a 1-minute and finally to a 5-second interval in the capture of the performance data.

Figure 3:  Performance Nuances Uncovered in Granular Data

As the data gets more granular, the behavioral and performance nuances buried in the data start to surface. Data at the 5-minute and 1-minute intervals in Figure 3 tells you very little; aggregated data is the antithesis of data science. Data at the 5-second interval highlights some potential performance concerns, and in this example that is where the data starts to become actionable.
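
The effect is easy to reproduce with a toy example. The sketch below builds a synthetic heart-rate trace sampled every 5 seconds, with a 30-second spike, and then averages it into 1-minute and 5-minute buckets. The trace and the helper `downsample` are assumptions for illustration and have nothing to do with the data behind Figure 3.

```python
from statistics import mean


def downsample(samples, factor):
    """Average each consecutive group of `factor` samples into one value."""
    return [mean(samples[i:i + factor]) for i in range(0, len(samples), factor)]


# Synthetic trace: one reading every 5 seconds for 10 minutes (120 readings),
# resting at 70 bpm with a 30-second spike to 150 bpm starting at minute 5.
five_sec = [70] * 120
for i in range(60, 66):
    five_sec[i] = 150

one_min = downsample(five_sec, 12)   # 12 readings x 5 s = 1 minute
five_min = downsample(five_sec, 60)  # 60 readings x 5 s = 5 minutes

print("peak at 5-second interval:", max(five_sec))
print("peak at 1-minute interval:", max(one_min))
print("peak at 5-minute interval:", max(five_min))
```

In this made-up trace, the spike to 150 bpm is averaged down to 110 at the 1-minute level and to 78 at the 5-minute level, where it is hard to tell apart from normal variation.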

For example, I might notice my heart rate settling at too sedentary a level whenever I sit too long on a cross-country flight, or my stress level jumping whenever I get another “flight delayed” message while trying to catch a connecting flight. I might then learn to do some in-seat exercises and walk around during those long flights, or to practice controlled breathing and some simple yoga when enduring yet another flight delay (SFO airport does have a yoga room, and now I know why).

Preparing for an IoT World of Granular Data
Understanding the challenges of capturing and analyzing real-time granular machine and device-generated data will become even more critical as we move into the Internet of Things (IoT), where hundreds of sensors are kicking off tens, hundreds or even thousands of data points per minute. This will force two specific challenges upon those of us coming from the more traditional human-generated big data world:

  • Real-time data capture and compression
  • Real-time analytics at the edge
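
To make these two challenges a little more concrete, here is a minimal sketch of what a device might do with each reading as it arrives. It combines a simple deadband filter (only forward a reading if it differs enough from the last one sent) with a rolling-average check that flags sudden jumps. The class name `EdgeMonitor`, its parameters and the thresholds are all hypothetical; this is one simple approach among many, not a description of how any particular fitness tracker works.

```python
from collections import deque


class EdgeMonitor:
    """Toy on-device filter: deadband compression plus a rolling-average alert."""

    def __init__(self, window=12, jump=20, deadband=2):
        self.recent = deque(maxlen=window)  # last `window` readings
        self.jump = jump                    # alert if reading deviates this much
        self.deadband = deadband            # ignore changes this small
        self.last_sent = None

    def process(self, value):
        """Handle one reading; return (send, alert) as booleans."""
        # Baseline is the mean of recent readings (or the reading itself at start).
        baseline = sum(self.recent) / len(self.recent) if self.recent else value
        alert = abs(value - baseline) > self.jump
        # Forward the first reading, any alert, or any change beyond the deadband.
        send = (
            self.last_sent is None
            or alert
            or abs(value - self.last_sent) > self.deadband
        )
        if send:
            self.last_sent = value
        self.recent.append(value)
        return send, alert


# Made-up stream of heart-rate readings with one sudden jump.
stream = [70, 71, 70, 69, 70, 120, 70]
monitor = EdgeMonitor()
results = [monitor.process(v) for v in stream]

sent = [v for v, (send, _) in zip(stream, results) if send]
alerts = [v for v, (_, alert) in zip(stream, results) if alert]
print("forwarded:", sent)
print("alerts:   ", alerts)
```

The design choice here is to decide at the device which readings are worth transmitting, so that the granular detail is kept where something is happening and dropped where nothing changes.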

For my fitness focus, I might need to expand my Personal Fitness Analysis to capture and analyze more of this detailed data in (near) real-time so that I can become aware of behaviors that are hurting or improving my health and fitness.  Ultimately, my goal is to change my behaviors, but I need to understand (and quantify?) what behaviors lead to desirable health and fitness outcomes (e.g., improved blood pressure, lower weight, less stress).

The post Aggregated Data Dilemma appeared first on InFocus Blog | Dell EMC Services.

More Stories By William Schmarzo

Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business”, is responsible for setting the strategy and defining the Big Data service line offerings and capabilities for the EMC Global Services organization. As part of Bill’s CTO charter, he is responsible for working with organizations to help them identify where and how to start their big data journeys. He has written several white papers, is an avid blogger and is a frequent speaker on the use of Big Data and advanced analytics to power organizations’ key business initiatives. He also teaches the “Big Data MBA” at the University of San Francisco School of Management.

Bill has nearly three decades of experience in data warehousing, BI and analytics. Bill authored EMC’s Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements, and co-authored with Ralph Kimball a series of articles on analytic applications. Bill has served on The Data Warehouse Institute’s faculty as the head of the analytic applications curriculum.

Previously, Bill was the Vice President of Advertiser Analytics at Yahoo and the Vice President of Analytic Applications at Business Objects.
