Research

By Adrian Lane

Big data systems have become very popular because they offer a low-cost way to analyze enormous sets of rapidly changing data. Hadoop, with its incredibly open and vibrant ecosystem, has also enabled firms to completely tailor clusters to their business needs. This combination has made Hadoop the most popular big data framework in use today. And as adoption has ramped up, IT and security teams have found themselves tasked with getting a handle on both data security and Hadoop cluster security.

We released our first security recommendations in 2012, just before the release of YARN. Since then the Hadoop security landscape has changed radically. Today a comprehensive set of technologies is available.

This research paper delves into the fundamental security controls for Hadoop, including encryption, isolation, and access controls/identity management. We start by examining the types of problems most firms need to address, matching them against available security tools. From there we branch out into two major areas of concern: high-level architectural considerations and tactical operational options, exploring the decision process you need to go through to determine which problems you need to address. We close with a strategic framework for combining tactical controls into a cohesive security strategy, with key recommendations for keeping Hadoop infrastructure and data secure.

As with all our research papers, we welcome feedback and community participation. If you have comments or want to see additions, please email us at info at Securosis dot com, or post a comment on this blog. This way we can foster an open dialog with the community. Finally, we would like to thank the companies that have licensed this research and helped us make it available to you for free: Hortonworks and Vormetric.

Download the research here: Securing_Hadoop_Final_V2.pdf.