We are pleased to release our updated white paper on big data security: Securing Hadoop: Security Recommendations for Hadoop Environments. Just about everything has changed in the four years since we published the original. Hadoop has solidified its position as the dominant big data platform by constantly advancing in function and scale. While the ability to customize a Hadoop cluster to suit diverse needs has been its main driver, the security advances make Hadoop viable for enterprises. Whether embedded directly into Hadoop or deployed as add-on modules, services like identity, encryption, log analysis, key management, cluster validation, and fine-grained authorization are all available. Our goal for this research paper is first to introduce these technologies to IT and security teams, and then to help them assemble these technologies into a coherent security strategy.
This research project provides a high-level overview of security challenges for big data environments. From there we discuss security technologies available for the Hadoop ecosystem, and then sketch out a set of recommendations to secure big data clusters. Our recommendations map threats and compliance requirements directly to supporting technologies to facilitate your selection process. We outline how these tactical responses work within the security architectures that firms employ, with each firm tailoring its approach to the tools and technical talent on hand.
Finally, we would like to thank Hortonworks and Vormetric for licensing this research. Without firms who appreciate our work enough to license our content, we could not bring you quality research for free! We hope you find this research helpful in understanding big data and its associated security challenges.
You can download a free copy of the white paper from our research library, or grab a copy directly: Securing Hadoop: Security Recommendations for Hadoop Environments (PDF).
One Reply to “Securing Hadoop: Security Recommendations for Hadoop [New Paper]”
Sentry, and the Cloudera EDH versions of Manager, Hue, Impala, Search, and Navigator, provide a much more security-rich approach than what HDP or other Hadoop platforms allow for. Put simply, Apache Ranger in HDP is significantly less mature than Sentry in Cloudera EDH, and Falcon is less mature than Navigator.
Cloudera has zNcrypt from Gazzang to provide encryption at rest for data blocks as well as files. For additional protection, zNcrypt also uses process-based ACLs and keys. HDP can add on Rhino, but again, it is not as mature. Even IBM, Intel, and AWS directly integrate data-at-rest capabilities in their Hadoop offerings. What of HDFS alternatives for data-at-rest encryption, e.g., GlusterFS, Isilon, NetApp, et al.?
Great paper, but a bit Ambari-specific.