“The world is one big data problem.” – Andrew McAfee, associate director of the Center for Digital Business at MIT Sloan
Though Data as a topic has always been close to my heart, it was a subject I rarely dealt with, given my preoccupation with applications, middleware, cloud computing & DevOps. However, I grabbed the chance to teach a Hadoop course in 2012 and it changed the way I looked at data – not merely as an enabler but as the true oil of business. Fast forward to 2016, and I have almost completed an amazing and enriching year at Hortonworks. It is a good time to reflect on how Big Data is transforming businesses across the Fortune 500 landscape. Thus, I present what is not merely the ‘Art of the Possible’ but ‘Business Reality’ – distilled insights from a year of working with real world customers, companies that are pioneering Big Data in commercial applications to drive shareholder value & customer loyalty.
Illustration – The Megatrends helping enterprises derive massive value from Big Data
Presented below are the six megatrends that will continue to drive Big Data into enterprise business & IT architectures for the foreseeable future.
- The Internet of Anything (IoAT) – The rise of the machines has been well documented, but enterprises have only just begun waking up to the possibilities in 2016. The paradigm of harnessing IoT data by leveraging Big Data techniques has begun to gain industry-wide adoption & cachet. For example, in the manufacturing industry, data is being gathered from a wide variety of sensors distributed geographically across factory locations running 24×7. Predictive maintenance strategies that pull together sensor data & prognostics are critical to efficiency and to optimizing the business. In other verticals like healthcare & insurance, massive data volumes are now being reliably generated from diverse sources of telemetry such as patient monitoring devices as well as human-manned endpoints at hospitals. In transportation, these devices include cars in the consumer space, trucks & other field vehicles, and geolocation devices. Other sources include field machinery in oil exploration & server logs across IT infrastructure. In the personal consumer space, they include personal fitness devices like Fitbit and home & office energy management sensors. All of this constitutes the trend that Gartner terms the Digital Mesh. The Mesh is built by coupling this machine data with the ever-growing social media feeds, web clicks, server logs etc. The Digital Mesh leads to an interconnected information deluge which encompasses classical IoT endpoints along with audio, video & social data streams. This creates huge security challenges as well as business opportunities for forward-looking enterprises (including governments). Applications that leverage Big Data to ingest, connect & combine these disparate feeds into one holistic picture of an entity – whether individual or institution – are clearly beginning to differentiate themselves. IoAT is becoming a huge part of digital transformation initiatives, with more use cases emerging in 2017 across industry verticals.
- The Emergence of Unified Architectures – The onset of Digital Architectures in enterprise businesses implies the ability to drive continuous online interactions with global consumers, customers, clients or patients. The goal is not just to provide engaging visualization but also to personalize the services clients care about across multiple modes of interaction. Mobile applications first began forcing enterprises to support multiple channels of interaction with their consumers. For example, Banking now requires an ability to engage consumers in a seamless experience across an average of four to five channels – Mobile, eBanking, Call Center, Kiosk etc. Healthcare is a close second, where caregivers expect patient, medication & disease data at their fingertips with a few finger swipes on an iPad app. What Big Data brings to the equation, beyond its strength in data ingest & processing, is a unified architecture. For instance, MapReduce is the original framework for writing applications that process large amounts of structured and unstructured data stored in the Hadoop Distributed File System (HDFS). Apache Hadoop YARN opened Hadoop to other data processing engines (e.g. Apache Spark/Storm) that can now run alongside existing MapReduce jobs to process data in many different ways at the same time. The result is that ANY kind of application processing can be run inside a Hadoop runtime – batch, realtime, interactive or streaming.
- Consumer 360 – As noted above, mobile applications first forced enterprises to support multiple channels of interaction with their consumers, and Banking now has to deliver a seamless experience across four to five channels. The healthcare industry stores patient data across multiple silos – ADT (Admit Discharge Transfer) systems, medication systems, CRM systems etc. Data Lakes provide an ability to visualize all of a patient’s data in one place, thus improving outcomes. The Digital Mesh (covered above) only exacerbates this semantic gap in user experiences as information consumers navigate applications and consume services across a mesh that is multi-channel and that demands a 360 degree customer view across all these engagement points. Applications developed in 2016 and beyond must take a 360 degree approach to ensuring a continuous client experience, from a Data Visualization standpoint, across the spectrum of endpoints and the platforms that span them. Every serious business needs to provide a unified view of a customer across tens of product lines and geographies.
- Machine Learning, Data Science & Predictive Analytics – Most business problems are data challenges, and an approach centered around data analysis helps extract meaningful insights from data, thus helping the business. Many enterprises can now acquire, store and process large volumes of data using a low cost approach that leverages Big Data and Cloud Computing. At the same time, the rapid maturation of scalable processing techniques allows us to extract richer insights from data. What we commonly refer to as Data Science – a combination of econometrics, machine learning, statistics, visualization, and computer science – extracts valuable business insights hiding in data and builds operational systems to deliver that value. Data Science has evolved a new branch called “Deep Neural Nets” (DNN). DNNs are what make it possible for smart machines and agents to learn from data flows, and for the products that use them to become even more automated & powerful. Deep Machine Learning involves the art of discovering data insights in a human-like pattern. The web scale world (led by Google and Facebook) has been vocal about its use of Advanced Data Science techniques and the move of Data Science into Advanced Machine Learning. Data Science is an umbrella concept that refers to the process of extracting business patterns from large volumes of structured, semi-structured and unstructured data. It is emerging as the key ingredient in enabling a predictive approach to the business. Data Science & its applications across a range of industries are covered in the blogpost http://www.vamsitalkstech.com/?p=1846
- Visualization – The multi-channel expectations described above, set by mobile applications and by industries such as Banking, apply to visualization as well. The average enterprise user is also familiar with BYOD in the age of self service. The Digital Mesh only exacerbates this gap in user experiences as information consumers navigate applications and consume services across a mesh that is multi-channel and that provides Customer 360 across all these engagement points. While information management technology has grown at a blistering pace, the human ability to process and comprehend numerical data has not. Applications being developed in 2016 are beginning to adopt intelligent visualization approaches that are easy to use, highly interactive and enable the user to manipulate corporate & business data using their fingertips – much like an iPad app. Tools such as intelligent dashboards, scorecards, mashups etc. are helping change a visualization paradigm that was based on histograms, pie charts and tons of numbers. Big Data improvements in data lineage & quality are greatly helping the visualization space.
- DevOps – Big Data powered by Hadoop has now evolved into a true application architecture ecosystem, as mentioned above. The 30+ components included in an enterprise grade platform like the Hortonworks Data Platform (HDP) include APIs (Application Programming Interfaces) to satisfy every kind of data need that an application could have – streaming, realtime, interactive or batch. Couple that with improvements in predictive analytics, and enterprise developers leveraging Big Data in 2016 have been building scalable applications with data as a first-class citizen. Organizations using DevOps are already reaping the rewards, as they are able to streamline, improve and create business processes that reflect customer demand and positively affect customer satisfaction. Examples abound in the Webscale world (Netflix, Pinterest, and Etsy), but we now have established Fortune 1000 companies in verticals like financial services, healthcare, retail and manufacturing who are benefiting from Big Data & DevOps. Thus, 2016 will be the year when Big Data techniques are no longer the preserve of classical Information Management teams but move under the umbrella of application development, which encompasses the DevOps and Continuous Integration & Delivery (CI-CD) spheres.
One of the chief goals of DevOps is to close the long-standing gap between the engineers who develop and test IT capability and the organizations that are responsible for deploying and maintaining IT operations. Using traditional app dev methodologies, it can take months to design, test and deploy software. No business today has that much time, especially in the age of IT consumerization and end users accustomed to smartphone apps that are updated daily. The focus now is on rapidly developing business applications to stay ahead of competitors that can better harness the Big Data business capabilities. The microservices architecture approach advocated by DevOps, which builds autonomous, cooperative yet loosely coupled applications as a collection of business-focused services, is a natural fit for the Digital Mesh. The most important addition to, and consideration for, microservices-based architectures in 2016 is Analytics Everywhere.
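To make two of the ideas above more concrete, here is a minimal sketch of the map, shuffle and reduce steps behind the MapReduce model, applied to building a Consumer 360 style view from siloed channel records. It is plain Python rather than the Hadoop API, and the record fields, customer IDs and function names are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical records from three channel silos; all field names are illustrative.
RECORDS = [
    {"source": "mobile", "customer_id": "C1", "last_login": "2016-11-02"},
    {"source": "call_center", "customer_id": "C1", "open_tickets": 2},
    {"source": "ebanking", "customer_id": "C2", "balance": 1250.0},
    {"source": "mobile", "customer_id": "C2", "last_login": "2016-10-30"},
]


def map_record(record):
    """Map step: emit a (key, value) pair keyed by customer ID."""
    attributes = {
        k: v for k, v in record.items() if k not in ("customer_id", "source")
    }
    return record["customer_id"], {"channels": [record["source"]], **attributes}


def shuffle(pairs):
    """Shuffle step: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def merge_views(left, right):
    """Reduce step: combine two partial views of the same customer."""
    merged = {**left, **right}
    merged["channels"] = left["channels"] + right["channels"]
    return merged


def customer_360(records):
    """Run map, shuffle and reduce in-process to get one view per customer."""
    grouped = shuffle(map_record(r) for r in records)
    return {cid: reduce(merge_views, views) for cid, views in grouped.items()}


if __name__ == "__main__":
    for customer_id, view in sorted(customer_360(RECORDS).items()):
        print(customer_id, view)
```

In a real cluster the map and reduce functions would run in parallel across many nodes over data in HDFS, and an engine such as Spark could express the same grouping more compactly; the sketch only shows the shape of the computation.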
The Final Word –
We have all heard about the growth of data volumes & variety. 2016 is perhaps the first year in which forward-looking business & technology executives have begun capturing commercial value from the data deluge by balancing analytics with creative user experience.
Thus, modern data applications are making Big Data ubiquitous. Rather than existing as back-shelf tools for the monthly ETL run or for reporting, these modern applications can help industry firms incorporate data into every decision they make. Applications in 2016 and beyond are beginning to recognize that Analytics are pervasive, relentless, realtime and thus embedded into our daily lives.