Over the past year, the trends associated with big data have shifted markedly. As we move into the New Year, it’s helpful to explore the features that are quickly becoming essential for organizations that want to get the most value out of their data and ensure their big data system isn’t a big hassle.
By now, big data's immense capabilities are well known. Businesses are eager to put all of their data to work, but they still run into roadblocks. They may run small projects but lack the privacy and governance features needed to meet the demands of a production environment. Or they want to collect data from newly connected devices but can't handle the privacy issues that come with it. When businesses combine all their different data types to run analytics and unlock new discoveries, some of their most sensitive information is almost always included. In a Forrester survey of more than 3,500 global security decision-makers, 47% were concerned about the risks associated with big data analytics for business decision-making (“Big Data Security Strategies For Hadoop Enterprise Data Lake”, Forrester Report, April 2016).
Companies need systems that ensure only the right people see the right details of their data, and the technology is maturing to meet that expectation. For the modern big data solution, privacy and governance features are key; they can’t just be afterthoughts. Only recently have the necessary privacy and governance tools been designed into big data solutions from the start.
It’s crucial for big data warehouse solutions to have sophisticated built-in privacy and governance features, such as attribute-based access controls and extensible metadata frameworks. As these features rise to meet today’s data-sharing challenges, businesses will not only benefit from big data’s power, but will also act on its results with greater confidence and security.
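To make the idea of attribute-based access control concrete, here is a minimal Python sketch. It is an illustration only, not any vendor's implementation: the `User` and `Field` classes, the `sensitivity` metadata tag, the policy table, and the sample roles are all hypothetical names chosen for this example. The point it shows is that the decision to reveal a value depends on attributes of the user and on metadata attached to the data, rather than on a fixed list of named users.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Hypothetical user attributes used by the policy below.
    name: str
    role: str
    department: str


@dataclass(frozen=True)
class Field:
    # A single data item plus a metadata tag describing its sensitivity.
    name: str
    sensitivity: str  # hypothetical tags: "public" or "identifying"
    value: str


# Hypothetical policy: for each sensitivity tag, the user attributes required
# to see the value. An empty requirement means anyone may see it.
POLICY = {
    "public": {},
    "identifying": {"role": "clinician", "department": "cardiology"},
}


def can_view(user: User, field: Field) -> bool:
    """Return True if every attribute required for this tag matches the user."""
    required = POLICY.get(field.sensitivity)
    if required is None:
        # Unknown tags are denied by default.
        return False
    return all(getattr(user, attr) == wanted for attr, wanted in required.items())


def visible_record(user: User, record: list[Field]) -> dict[str, str]:
    """Return the record with values the user may not see masked out."""
    return {f.name: (f.value if can_view(user, f) else "***") for f in record}


if __name__ == "__main__":
    record = [
        Field("diagnosis_code", "public", "I25.1"),
        Field("patient_name", "identifying", "Jane Doe"),
    ]
    analyst = User("sam", "analyst", "research")
    clinician = User("alex", "clinician", "cardiology")
    print(visible_record(analyst, record))
    print(visible_record(clinician, record))
```

In this sketch the same record yields a masked view for the analyst and a full view for the clinician, so one data set can serve both audiences without being copied or split. A production system would attach such tags through its metadata framework and evaluate far richer policies.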
Also, look to see more big data activity in the cloud. According to Gartner, the worldwide public cloud services market is expected to grow by 16.5% overall in 2016, with cloud application services (SaaS) growing by 20.3% and the cloud system infrastructure services (IaaS) segment growing by 38.4%. Businesses are choosing cloud-based big data solutions because of the cloud’s flexibility to meet their specific needs. The cloud’s scalability allows companies to start small and scale up (or down) as required. Scaling can occur rapidly and on short notice, making the cloud an ideal resource for meeting an organization’s developing demands, whether massive compute power is needed for a temporary project or for the foreseeable future. Because companies pay only for what they use, the cloud is also an extremely cost-effective solution, which is why it is gaining popularity with businesses big and small.
Security is also a key focus for all leading cloud providers. Many of them are even able to service the healthcare industry's rigorous privacy demands and support HIPAA and HITRUST requirements. With their expertise, cloud providers can help businesses navigate data protection. Having a partner that can tackle the more complex data security issues enables businesses to focus more resources on getting the value out of their data.
Dr. Paul Terry, President and CEO, PHEMI
Companies are also discovering that they don’t have to “go it alone.” "Skills gaps" and uncertainty over "how to get value from Hadoop" are often-cited reasons for organizations’ hesitation with big data. In 2016, more businesses are overcoming those inhibitors by opting for turnkey big data systems. A turnkey, or "out of the box," big data solution can relieve some of an IT department’s stress. “Build-it-yourself” Hadoop ecosystems require significant time and resources to get started. Organizations are choosing ready-to-go big data warehouses instead and circumventing many roadblocks, such as the need to hire a brand-new team of Hadoop specialists.
Turnkey systems provide businesses with all the benefits they've come to expect from enterprise-grade data management systems, including vendor support and a regular flow of new features. With turnkey systems, IT can focus on helping end users run predictive analytics and data science, and derive profitable insights, faster. That makes turnkey systems an increasingly in-demand choice across industries.
Another notable trend relates to how big data is impacting traditional enterprise data warehouses. Businesses are looking to incorporate big data's high volume, velocity, and variety capabilities into their IT infrastructures. Companies are choosing big data systems that coexist with and complement their enterprise data warehouse and BI solutions. They are adding a big data component to their data warehouse solution to tap into big data's ability to ingest unstructured data and complex datasets and to make this data accessible to analysts and data scientists. They’re also taking advantage of big data technology’s powerful distributed processing capability. With an enterprise data warehouse that has big data capabilities, relational data can be pulled into the big data warehouse, and data scientists can uncover new insights by combining data previously locked in separate silos with new sources of data, such as IoT and social media.
Increasingly, companies are realizing it isn’t an either/or competition between big data and the enterprise data warehouse. Companies can implement what is referred to as a “logical data warehouse” architecture, where relational systems thrive alongside big data systems that can handle more diverse data types faster and open up new, non-traditional data sources to analytics.
Also expect big data to continue driving enhanced data science and greater discovery. The power and flexibility of big data and programming frameworks like Apache Spark are opening new doors and allowing companies to ask questions they simply couldn't before. While a traditional data warehouse can handle analytics on a month's worth of data, big data allows businesses to mine multiple years' worth of information or comb through millions of files (and data types) to find those that match highly specific conditions.
Big data creates a richer discovery environment for data scientists. They can combine data from multiple sources to unlock new perspectives and test hypotheses their systems simply couldn't handle before. For instance, businesses can now combine information from every touch point (from website data to social media feeds to customer service calls) to holistically understand customer experience. Having an efficient data warehouse is great, but it's big data and the data science discoveries it makes possible that will enable proactive decisions and truly transform businesses for the better.
About The Author:
As the President and CEO, Dr. Paul Terry provides the vision and technical leadership at PHEMI. He serves on the Board of Directors for Providence Health Care, advising on its subcommittees for innovation, quality, EMR, and next-generation data strategies in healthcare. Paul also serves on the Board of Directors for Life Sciences BC and Molecular You, and is an advisor to the BC provincial government on next-generation data strategies. He is an adjunct professor in big data at Simon Fraser University and is a partner with Magellan Angel Partners. Paul lectures in technology, strategy, and product management for the MBA program at SFU, and is a sought-after speaker on data innovation and strategy. He is a member of the Big Data Sub-Committee Working Group, the BC Institute for Health Innovation and serves on Genome BC’s Health Strategy Task Force.