Security Analytics with Big Data: New Events and New Approaches

So why are we looking at big data, and what problems can we expect it to solve that we couldn’t before? Most SIEM platforms struggle to keep up with emerging needs for two reasons. The first is that threat data does not come neatly packaged from traditional sources, such as syslog and netflow events. There are many different types of data, data feeds, documents, and communications protocols that contain diverse clues to data breaches or ongoing attacks. We see clear demand to analyze a broader data set in hopes of detecting advanced attacks. The second issue is that many types of analysis, correlation, and enrichment are computationally demanding. Much like traditional multi-dimensional data analysis platforms, crunching the data takes horsepower. More data is being generated; add more types of data we want, and multiply that by additional analyses – and you get a giant gap between what you need to do and what you can presently do.

Our last post considered what big data is and how NoSQL database architectures inherently address several of the SIEM pain points. In fact, the 3Vs (Volume, Velocity, & Variety) of big data coincide closely with three of the main problems faced by SIEM systems today: scalability, performance, and effectiveness. This is why big data is such an important advancement for SIEM. Volume and velocity problems are addressed by clustering systems to divide load across many commodity servers, and variety through the inherent flexibility of big data / NoSQL. But of course there is more to it.

Analysis: Looking at More

Two of the most serious problems with current SIEM solutions are that they struggle with the amount of data to be managed, and they cannot deal with the “data velocity” of near-real-time events. Additionally, they need to accept and parse new and diverse data types to support new types of analysis. There are many different types of event data, any of which might contain clues to security threats.
Common data types include:

  • Human-readable data: There is a great deal of data which humans can process easily, but which is much more difficult for machines – including blog comments and Twitter feeds. Tweets, discussion fora, Facebook posts, and other types of social media are all valuable for threat intelligence. Some attacks are coordinated in fora, which means companies want to monitor these fora for warnings of possible or imminent attacks, and perhaps even details of the attacks. Some botnet command and control (C&C) communications occur through social media, so there is potential to detect infected machines through this traffic.
  • Telemetry feeds: Cell phone geolocation, lists of sites serving malware, mobile device IDs, HR feeds of employee status, and dozens of other real-time data feeds denote changes in status, behavior, and risk profiles. Some of these feeds are analyzed as the stream of events is captured, while others are collected and analyzed for new behaviors. There are many different use cases, but security practitioners, observing how effectively retail organizations are able to predict customer buying behavior, are seeking the same insight into threats.
  • Financial data: We were surprised to learn how many customers use financial data purchased from third parties to help detect fraud. The use cases we heard centered around SIEM for external attacks against web services, but they were also analyzing financial and buying history to predict misuse and account compromise.
  • Contextual data: This is anything that makes other data more meaningful. Contextual data might indicate automated processes rather than human behavior – a too-fast series of web requests, for example, might indicate a bot rather than a human customer. Contextual data also includes risk scores generated by arbitrary analysis of metadata, and detection of odd or inappropriate series of actions. Some is simply collected from a raw event source while other data is derived through analysis.
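
To make the "too-fast series of web requests" example concrete, here is a minimal sketch of how that kind of contextual data might be derived from raw events. It is only an illustration: the function name `flag_fast_clients`, the thresholds, and the sample client addresses are all made up for this example and do not come from any SIEM product.

```python
from collections import defaultdict, deque


def flag_fast_clients(events, max_requests=5, window_seconds=1.0):
    """Return client IDs that send more than max_requests within window_seconds.

    events: iterable of (timestamp_seconds, client_id), sorted by timestamp.
    The thresholds are illustrative assumptions, not recommended values.
    """
    recent = defaultdict(deque)  # client_id -> timestamps still inside the window
    flagged = set()
    for ts, client in events:
        window = recent[client]
        window.append(ts)
        # Discard timestamps that have aged out of the sliding window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_requests:
            flagged.add(client)
    return flagged


# Hypothetical traffic: one client every 0.1 s, another every 2 s.
events = [(0.1 * i, "10.0.0.5") for i in range(10)]
events += [(2.0 * i, "10.0.0.9") for i in range(10)]
events.sort()

suspects = flag_fast_clients(events)
print(suspects)
```

A flag like this is derived data rather than a raw event. On its own it proves little, but it can be attached to the original events to give later analysis more to work with.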
As we improve our understanding of where to look for attack and breach clues, we will leverage new sources of data and examine them in new ways. SIEM generates some contextual data today, but collection of a broader variety of data enables better analysis and enrichment.

  • Identity and Personas: Today many SIEMs link with directory services to identify users. The goal is to link a human user to their account name. With cloud services, mobile devices, distributed identity stores, identity certificates, and two-factor identity schemes, it has become much harder to link human beings to account activity. As authentication and authorization facilities become more complex, SIEM must connect to and analyze more and different identity stores and logs.
  • Network Data: Some of you are saying “What? I thought all SIEMs looked at network flow data!” Actually, some do but others don’t. Some collect and alert on specific known threats, but those are only a tiny portion of what passes down the wire. Cheap storage makes it feasible to store more network events and perform behavioral computation on general network trends, service usage, and other pre-computed aggregate views of network traffic. In the future we may be able to include all data.

Each of these examples demonstrates what will be possible in the short term. In the long term we may record any and all useful or interesting data. If we can link in data sets that provide different views or help us make better decisions, we will. We already collect many of these data types, but we have been missing the infrastructure to analyze them meaningfully.

Analysis: Doing It Better

One limitation of many SIEM platforms is their dependence on relational databases. Even if you strip away relational constructs that limit insertion performance, they still rely on a SQL language with traditional language processors. The fundamental relational database architecture was designed and optimized for relational queries.
Flexibility is severely limited by SQL – statements always include FROM and WHERE clauses, and we have a limited number of comparison operators for searching. At a high level we may have Java support, but the actual queries still devolve down to SQL statements. SQL may be a trusty


API Gateways: Security Enabling Innovation [New Series]

So why are we talking about this? Because APIs are becoming the de facto service interface – not only for cloud and mobile, but for just about every type of service. The need for security around these APIs is growing, which is why we have seen a rush of acquisitions to fill security product gaps. In what felt like a couple weeks Axway acquired Vordel, CA acquired Layer7, and Intel acquired Mashery. The acquirers all stated these steps were to accommodate security requirements stemming from steady adoption of APIs and associated web services. Our goal for this paper is to help you understand the challenges of securing APIs and to evaluate technology alternatives so you can make informed decisions about current trends in the market. We will start our discussion by mentioning what’s at stake, which should show why certain features are necessary.

API gateways have a grand and audacious goal: enablement. Getting developers the tools, data, and functionality they need to realize the mobile, social, cloud and other use cases the enterprise wants to deliver. There is a tremendous amount of innovation in these spaces today, and the business goal is to get to market ASAP. At the same time, security is not a nice-to-have – it’s a hard requirement. After all, the value of mobile, social, and cloud applications is in mixing enterprise functionality inside and outside the enterprise. And riding along is an interesting mix of user personas: customers, employees, and corporate identities, all mingling together in the same pool. API gateways must implement real security policies and protocols to protect enterprise services, brands, and identity. This research paper will examine current requirements and technical trends in API security.

API gateways are not sexy. They do not generate headlines like cloud, mobile, and big data. But the APIs are the convergence point for all these trends, and the crux of IT innovation today.
We all know cloud services scale almost too well to be real, at a price that seems too good to be true. But the APIs are part of what makes them so scalable and cheap. Of course open, API-driven, multi-tenant environments bring new risks along with their new potentials. As Netflix security architect Jason Chan says, securing your app on Amazon Cloud is like rock climbing – Amazon gives you a rope and belays you, but you are on the rock face. You are the one at risk. How do you manage that risk? API gateways play a central role in limiting the cloud’s attack surface and centralizing policy enforcement.

Mobile apps pose similar risks in an entirely different technical environment. There is an endless amount of hype about iOS and Android security. But where are the breaches? On the server side. Why? Because attackers are pragmatic, and that’s where the data is. Mobile apps have vulnerabilities that attackers can go after one by one, but a breach of the server-side APIs exposes the whole enterprise enchilada. Say it with me in your best Taco Bell Chihuahua accent: The whole enchilada! As with cloud applications, API gateways need to reduce the enterprise’s risk by limiting attack surface. And mobile apps use web services differently than other enterprise applications: communications are mostly asynchronous, and the identity tokens are different too – expect to see less SAML or proprietary SSO, and more OAuth and OpenID Connect. API gateways address the challenges raised by these new protocols and interactions.

APIs are an enabling technology, linking new and old applications together into a unified service model. But while cloud, mobile, and other innovations drive radical changes in the data center, one thing remains the same: the speed at which business wants to deploy new services. Fast. Faster! Yesterday, if possible.
This makes developer enablement supremely important, and is why we need to weave security into the fabric of development – if it is not integrated at a fundamental level, security gets removed as an impediment to shipping. The royal road is things that make it easy for developers to understand how to build and deploy an app, grok the interfaces and data, and quickly provision developers and their app users to log in – this is how IT shops are organizing teams, projects, and tech stacks.

The DMZ has gone the way of the dodo. API gateways are about enabling developers to build cloud, mobile, and social apps on enterprise data, layered over existing IT systems. Third-party cloud services, mobile devices, and work-from-anywhere employees have destroyed (or at least completely circumvented) the corporate IT ‘perimeter’ – the ‘edge’ of the network has so many holes it no longer forms a meaningful boundary. And this trend, fueled by the need to connect in-house and third-party services, is driving the new model. API gateways curate APIs, provision access to users and developers, and facilitate key management. For security this is the place to focus – to centralize policy enforcement, implement enterprise protocols and standards, and manage the attack surface.

This paper will explore the following API gateway concepts in detail. The content will be developed and posted to the Securosis blog for vetting by the developer and security community. As always, we welcome your feedback – both positive and negative. Our preliminary outline is:

  • Access Provisioning: We will discuss developer access provisioning, streamlining access to tools and server support, user and administrator provisioning, policy provisioning and management, and audit trails to figure out who did what.
  • Developer Tools: We will discuss how to maintain and manage exposed services, a way to catalogue services, client integration, build processes, and deployment support.
  • Key Management: This post will discuss creating keys, setting up a key management service, key and certificate verification, and finally the key management lifecycle (creation, revocation, rotation/updating).
  • Implementation: Here we get into the meat of this series. We will discuss exposing APIs and parameters, URL whitelisting, proper parameter parsing, and some deployment options that affect security.
  • Buyer’s Guide: We will wrap this series with a brief buyer’s guide to help you understand the differences between implementations, as well as key considerations when establishing your evaluation priorities. We will also cover


Friday Summary: June 7, 2013

I haven’t been writing much over the past few weeks because I took a few weeks with the family back in Boulder. The plan was to work in the mornings, do fun mountain stuff in the afternoons with the kids, and catch up with friends in the evenings. But the trip ended up turning into a bit of medical tourism when a couple bugs nailed us on day one. For the record, I can officially state that microbrews do not seem to cure viruses. But the research continues… It was actually great to get back home and catch up as best we could under the circumstances. My work suffered but we managed to hit a major chunk of the to-do list. For the kids I think the highlight was me waking up, noticing it was raining, and bundling the family up to the Continental Divide to chase snow. We bounced along an unpaved trail road in the rain, keeping one eye on the temperature and the other on our altitude, until the wet stuff turned into the white stuff. Remember, we live in Phoenix – when it started dumping right when we hit the trailhead, with enough accumulation for snowmen and angels, I was in Daddy heaven. For me, aside from generally catching up with people (and setting a PR in the Bolder Boulder 10K), another highlight was grabbing lunch with some rescue friends and then hanging out in the new headquarters with the kids for a couple hours. It has been a solid 7-8 years since I was on a call, but back at the Cage, surrounded by the gear I used to rely on and vehicles I used to drive, it all came back. Surprisingly little has changed, and I was really hoping the pager would go off so I might hitch along on a call. Er… then again, I’m not sure you are allowed to respond with lights and sirens when kids are in the back in their car seats. There is an intensity to the rescue community that even the security community doesn’t quite match. Shared sweat and blood in risky conditions, as I wrote about in The Magazine. 
That doesn’t mean it’s all one big lovefest, and there’s no shortage of personal and professional drama, but the bonds formed are intense and long-lasting. And the toys? Oh, man, you can’t beat the toys. That part of my life is on hold for a while as I focus on kids and the company, but it’s comforting to know that not only is it still there, it is still very familiar too.

On to the Summary:

Webcasts, Podcasts, Outside Writing, and Conferences

  • Adrian’s Dark Reading article on Database DoS.

Favorite Securosis Posts

  • David Mortman: New Google disclosure policy is quite good.
  • Adrian Lane: Mobile Security Breaches. Astute, concise analysis from Mogull.
  • Rich: Security Analytics with Big Data: New Events, New Approaches. Adrian is killing it with this series.

Other Securosis Posts

  • API Gateways: Security Enabling Innovation [New Series].
  • Matters Requiring Attention: 100 million or so.
  • Apple Expands Gatekeeper.
  • Incite 6/5/2013: Working in the House.
  • Oracle adopts Trustworthy Computing practices for Java.
  • A CISO needs to be a business person? No kidding…
  • Security Analytics with Big Data: Defining Big Data.
  • LinkedIn Rides the Two-Factor Train.
  • Security Surrender.
  • Finally! Lack of Security = Loss of Business.
  • Network-based Malware Detection 2.0: Scaling NBMD.
  • Friday Summary: May 31, 2013.
  • Evernote Business Edition Doubles up on Authentication.

Favorite Outside Posts

  • David Mortman: Data Skepticism.
  • Adrian Lane: NSA Collects Verizon Customer Calls. Interesting read, but not news. We covered this trend in 2008. The question was why the government gave immunity to telecoms for spying on us, and we now know: because they were doing it for the government. Willingly or under duress is the current question.
  • Rich: Why we need to stop cutting down security’s tall poppies. Refreshing perspective.

Research Reports and Presentations

  • Email-based Threat Intelligence: To Catch a Phish.
  • Network-based Threat Intelligence: Searching for the Smoking Gun.
  • Understanding and Selecting a Key Management Solution.
  • Building an Early Warning System.
  • Implementing and Managing Patch and Configuration Management.
  • Defending Against Denial of Service (DoS) Attacks.
  • Securing Big Data: Security Recommendations for Hadoop and NoSQL Environments.
  • Tokenization vs. Encryption: Options for Compliance.
  • Pragmatic Key Management for Data Encryption.
  • The Endpoint Security Management Buyer’s Guide.

Top News and Posts

  • Democratic Senator Defends Phone Spying, And Says It’s Been Going On For 7 Years.
  • Expert Finds XSS Flaws on Intel, HP, Sony, Fujifilm and Other Websites.
  • Whom the Gods Would Destroy, They First Give Real-time Analytics.
  • Apple Updates OS X, Safari.
  • Original Bitcoin Whitepaper.
  • Unrelenting AWS Growth. Not security related, but the most substantive cloud adoption numbers I have seen. Note that the X axis of that graph is logarithmic – not linear!
  • StillSecure acquired.
  • Microsoft, US feds disrupt Citadel botnet network.

Blog Comment of the Week

This week’s best comment goes to Andy, in response to LinkedIn Rides the Two-Factor Train.

This breaks the LinkedIn App for Windows phone. But who uses Windows phone, besides us neo-Luddites who refuse to buy into the Apple ecosystem?


Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
  • Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factors into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
  • Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
  • For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
  • Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.