Inflection

Hang with me as I channel my inner Kerouac (minus the drugs, plus the page breaks) and go all stream of consciousness. To call this post an “incomplete thought” would be more than a little generous. I believe we are now at the early edge of a major inflection point in security. Not one based merely on evolving threats or new compliance regimes, but a fundamental transformation of the practice of security that will make it nearly unrecognizable when we finally emerge on the other side. For the past 5 years Hoff and I have discussed disruptive innovation in our annual RSA presentation. What we are seeing now is a disruptive conflagration, where multiple disruptive innovations are colliding and overlapping. It affects more than security, but security is the only area about which I’m remotely qualified to pontificate.

Perhaps that’s a bit of an exaggeration. All the core elements of what we will become are here today, and certain fundamentals never change, but someone walking into the SOC or CISO role of tomorrow will find more different than the same, unless they deliberately blind themselves. Unlike most of what I read out there, I don’t see these changes merely as things we in security are forced to react to. Our internal changes in practice and technology are every bit as significant as contributing factors.

One of the highlights of my career was hanging out and having beers with Bruce Sterling. He said that his role as a futurist is to imagine the world 7 years out – effectively beyond the event horizon of predictability. What I am about to describe will occur over the next 5-10 years, with the most significant changes likely concentrated in years 7-10, but built on the roots we establish today. So this should be taken as much as science fiction as prediction. The last half of 2012 is the first 6 months of this transition. The end result, in 2022, will be far more change over 10 years than the evolution of the practice of security from 2002 through today.

The first major set of disruptions is the binary supernova of tech – cloud computing and mobility. This combination, to my mind, is more fundamentally disruptive than the initial emergence of the Internet. Think about it: for the most part the Internet was (at a technical level) merely an extension of our existing infrastructure. To this day we have tons of web applications that, through a variety of tiers, connect back to 30+-year-old mainframe applications. Consumption is still mostly tied to people sitting at computers at desks – especially conceptually. Cloud blows up the idea of merely extending existing architectures with a web portal, while mobility fundamentally redefines consumption of technology. Can you merely slop your plate of COBOL onto a plate of cloud? Certainly – right as you watch your competitors and customers speed past at relativistic velocity.

Our tradition in security is to focus on the risks of these advances, but the more prescient among us are looking at the massive opportunities. Not that we can ignore the risks, but we won’t merely be defending these advances – our security will be defined and delivered by them. When I talk about security automation and abstraction I am not merely paying lip service to buzzwords – I honestly expect them to support new capabilities we can barely imagine today.

When we leverage these tools – and we will – we move past our current static security model, which relies (mostly) on following wires and plugs, and into a realm of programmatic security. Or, if you prefer, Software Defined Security. Programmers, not network engineers, become the dominant voices in our profession (a sketch of what this might look like follows at the end of this post). Concurrently, four native security trends are poised to upend existing practice models.

Today we focus tremendous effort on an infinitely escalating series of vulnerabilities and exploits. We have started to mitigate this somewhat with anti-exploitation, especially at the operating system level (thanks to Microsoft). The future of anti-exploitation is hyper-segregation. iOS is an excellent example of the security benefits of heavily sandboxing the operating ecosystem. Emerging tools like Bromium and Invincea are applying even more advanced virtualization techniques to the same problem – Bromium goes so far as to virtualize and isolate at a per-task level. Calling this mere ‘segregation’ is trite at best. Cloud enables similar techniques at the network and application levels: when the network and infrastructure are defined in software, there is essentially zero capital cost for segregating network and application components. Even this blog, today, runs on a specially configured hyper-segregated server that’s managed at a per-process level. Hyper-segregated environments – down, in some cases, to the individual process level – are rapidly becoming a practical reality, even in complex business environments with low tolerance for restriction.

Although incident response has always technically been core to any security model, for the most part it was shoved to the back room – stuck at the kids’ table next to DRM, application security, and network segregation. No one wanted to make the case that, no matter what we spent, our defenses could never eliminate risk. Like politicians, we were too frightened to tell our executives (our constituency) the truth – especially those of us who had been burned by ideological execs. Thanks to our friends in China and Eastern Europe (mostly), incident response is on the earliest edge of getting its due. Not the simple expedient of having an incident response plan, or even tools, but conceptually re-prioritizing and re-architecting our entire security programs – to focus as much or more on detection and response as on pure defense. We will finally use all those big screens hanging in the SOC to do more than impress prospects and visitors. My bold prediction? A focus on incident response – on more rapidly detecting and responding to attacker-driven incidents – will overtake our current checklist- and vulnerability-focused security model, affecting everything from technology decisions to budgeting and staffing.
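To make “programmatic security” less abstract, here is a minimal sketch, assuming AWS and the boto3 library, of what network segmentation as code could look like. Everything here – the function name, the VPC id, the ports, the CIDR ranges – is a hypothetical illustration rather than anything from a real deployment; the point is only that a network segment becomes an API call instead of a cabling exercise.

    # A sketch of Software Defined Security: segmentation defined in code.
    # Assumes AWS credentials are already configured; all names, ids, and
    # address ranges below are hypothetical examples.
    import boto3

    ec2 = boto3.client("ec2")

    def create_tier_segment(vpc_id, tier_name, allowed_cidr, port):
        """Create a security group admitting one port from one upstream range."""
        sg = ec2.create_security_group(
            GroupName=f"{tier_name}-sg",
            Description=f"Hyper-segmented group for the {tier_name} tier",
            VpcId=vpc_id,
        )
        # Permit only the single port this tier actually needs, and only
        # from the tier directly upstream of it.
        ec2.authorize_security_group_ingress(
            GroupId=sg["GroupId"],
            IpPermissions=[{
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                "IpRanges": [{"CidrIp": allowed_cidr}],
            }],
        )
        return sg["GroupId"]

    # Hypothetical usage: the app tier may reach the database tier on 5432 only.
    # create_tier_segment("vpc-0abc123", "database", "10.0.1.0/24", 5432)

Because the whole segment is defined in code, it can be stamped out per application, per deployment, or even per process, and torn down just as cheaply – which is what makes the zero-capital-cost claim above plausible.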


Securing Big Data: Security Issues with Hadoop Environments

How do I secure “big data”? A simple and common question. But one without a direct answer – simple or otherwise. We know thousands of firms are working on big data projects, from small startups to large enterprises. New technologies enable any company to collect, manage, and analyze incredibly large data sets. As these systems become more common, the repositories are more likely to be stuffed with sensitive data. Only after companies are reliant on “big data” do they ask “How can I secure it?” This question comes up so often, attended by so much interest and confusion, that it’s time for an open discussion on big data security. We want to cover several areas to help people get a better handle on the challenges. Specifically, we want to cover three things:

  • Why It’s Different Architecturally: What’s different about these systems, both in how they process information and how they are deployed? We will list some of the specific architectural differences and discuss how they impact data and database security.
  • Why It’s Different Operationally: We will go into detail on operational security issues with big data platforms. We will offer perspective on the challenges in securing big data and the deficiencies of the systems used to manage it – particularly their lack of native security features.
  • Recommendations and Open Issues: We will outline strategies for securing these data repositories, with tactical recommendations for securing certain facets of these environments. We will also highlight some gaps where no good solutions exist.

Getting back to our initial question – how to secure big data – what is so darn difficult about answering? For starters, “What is big data?” Before we can offer advice on securing anything, we need to agree on what we’re talking about. We can’t discuss big data security without an idea of what “big data” means. But there is a major problem: the term is so overused that it has become almost meaningless. When we talk to customers, developers, vendors, and members of the media, they all have their own idea of what “big data” is – but unfortunately those ideas are all different. It’s a complex subject, and even the Wikipedia definition fails to capture the essence. Like art, everyone knows it when they see it, but nobody can agree on a definition.

Defining Big Data

What we know is that big data systems can store very large amounts of data; can manage that data across many systems; and provide some facility for data queries, data consistency, and systems management. So does “big data” mean any giant data repository? No. We are not talking about giant mainframe environments. We’re not talking about Grid clusters, massively parallel databases, SAN arrays, cloud-in-a-box, or even traditional data warehouses. We have had the capability to create very large data repositories and databases for decades. The challenge is not to manage a boatload of data – many platforms can do that. And it’s not just about analysis of very large data sets. Various data management platforms can analyze large amounts of data, but their cost and complexity make them non-viable for most applications. The big data revolution is not about new thresholds of scalability for storage and analysis.

Can we define big data as a specific technology? Can we say that big data is any Hadoop HDFS/Lustre/Google GFS/shard storage system? No – again, big data is more than managing a big data set. Is big data any MapReduce cluster? Probably not, because it’s more than how you query large data sets. Heck, even PL/SQL subsystems in Oracle can be set up to work like MapReduce. Is big data an application? Actually, it’s all of these things and more.

When we talk to developers – the people actually building big data systems and applications – we get a better idea of what we’re talking about. The design simplicity of these platforms is what attracts developers. They are readily available, and their (relatively) low cost of deployment makes them accessible to a wider range of users. With all these traits combined, large-scale data analysis becomes cost-effective. Big data is not a specific technology – it’s defined more by a collection of attributes and capabilities. Sound familiar? It’s more than a little like the struggle to define cloud computing, so we’ll steal from the NIST cloud computing definition and start with some essential characteristics. We define big data as any data repository with the following characteristics:

  • Handles large amounts (petabyte or more) of data
  • Distributed, redundant data storage
  • Parallel task processing
  • Provides data processing (MapReduce or equivalent) capabilities
  • Central management and orchestration
  • Inexpensive – relatively
  • Hardware agnostic
  • Accessible – both (relatively) easy to use, and available as a commercial or open source product
  • Extensible – basic capabilities can be augmented and altered

In a nutshell: big, cheap, and easy data management. The “big data” revolution is built on these three pillars – the ability to scale data stores at greatly reduced cost is what makes it all possible. It’s data analytics available to the masses. It may or may not have traditional ‘database’ capabilities (indexing, transactional consistency, or relational mapping). It may or may not be fault tolerant. It may or may not have failover capabilities (redundant control nodes). It may or may not allow complex data types. It may or may not provide real-time results to queries. But big data offers all those other characteristics, and it turns out that they – even without traditional database features – are enough to get useful work done.

So does big data mean the Hadoop framework? Yes. The Hadoop framework (e.g., HDFS, MapReduce, YARN, Common) is the poster child for big data, and it offers all the characteristics we outlined. Most big data systems actually use one or more Hadoop components, and extend some or all of its basic functionality. Amazon’s SimpleDB also satisfies the requirements, although it is architected differently from Hadoop. Google’s proprietary BigTable architecture is very similar to Hadoop, but we exclude it here because, as an internal Google platform, it fails the accessibility requirement.
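Since “data processing (MapReduce or equivalent) capabilities” is the characteristic people find hardest to picture, here is a minimal sketch of the MapReduce pattern: the canonical word count, written as the kind of mapper and reducer scripts Hadoop Streaming accepts. The file names are illustrative, and this is a toy to show the shape of the model, not production code.

    #!/usr/bin/env python
    # mapper.py - emit a (word, 1) pair for every word read from stdin.
    import sys

    for line in sys.stdin:
        for word in line.strip().split():
            print(f"{word}\t1")

    #!/usr/bin/env python
    # reducer.py - sum the counts for each word. The framework delivers
    # mapper output to the reducer sorted by key, so equal words arrive
    # adjacent to one another and a single pass suffices.
    import sys

    current_word, current_count = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t", 1)
        if word != current_word:
            if current_word is not None:
                print(f"{current_word}\t{current_count}")
            current_word, current_count = word, 0
        current_count += int(count)
    if current_word is not None:
        print(f"{current_word}\t{current_count}")

The cluster handles distribution, redundancy, scheduling, and the sort-and-shuffle between the two stages across however many cheap nodes you have – exactly the “big, cheap, and easy” division of labor described above. A rough local equivalent, assuming a Unix shell, is: cat input.txt | python mapper.py | sort | python reducer.py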


Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
  • Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factors into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
  • Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
  • For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
  • Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.