Monday, April 25, 2016

Building a Vendor IT Risk Management Program: Ongoing Monitoring and Communication

By Mike Rothman

As we mentioned last post, after you figure out what risk means to your organization, and determine the best way to quantify and rank your vendors in terms of that concept of risk, you’ll need to revisit your risk assessment, because security in general, and each vendor’s environment specifically, is dynamic and constantly changing. We also need to address how to deal with vendor issues (breaches and otherwise) – both within your organization, and potentially with customers as well.

Ongoing Monitoring

When keeping tabs on your vendors you need to decide how often to update your assessments of their security posture. In a perfect world you’d like a continuous view of each vendor’s environment, to help you understand your risk at all times. Of course continuous monitoring costs money. So part of defining a V(IT)RM program is figuring out the frequency of assessment.

We believe vendors should not all be treated alike. The vendors in your critical risk tier (described in our last post) should be assessed as often as possible. Hopefully you’ll have a way (most likely through third-party services) of continually monitoring their Internet footprint, and alerting you when something changes adversely. We need to caveat that with a warning about real-time alerts. If you are not staffed to deal with real-time alerts, then getting them faster doesn’t help. In other words, if it takes you 3 days to work through your alert queue, getting an alert within an hour cannot reduce your risk much.

Vendors in less risky tiers can be assessed less frequently. An annual self-assessment and a quarterly scan might be enough for them. Again, this depends on your ability to deal with issues and verify answers. If you aren’t going to look at the results, forcing a vendor to update their self-assessment quarterly is just mean, so be honest with yourself when determining the frequency for assessments.
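
To make the idea of tier-based frequency concrete, here is a minimal sketch of an assessment schedule. The tier names come from our last post, but the intervals, the activity names, and the helper functions are all assumptions for illustration. Pick numbers your team can actually keep up with.

```python
from datetime import date, timedelta

# Hypothetical intervals (in days) per risk tier. Only the "basic" row reflects
# the annual self-assessment and quarterly scan mentioned above; the rest are
# placeholders to tune against your own capacity to review results.
ASSESSMENT_INTERVAL_DAYS = {
    "critical": {"self_assessment": 90, "external_scan": 1},
    "important": {"self_assessment": 180, "external_scan": 30},
    "basic": {"self_assessment": 365, "external_scan": 90},
}


def next_due(tier, activity, last_done):
    """Return the date an activity is next due for a vendor in the given tier."""
    return last_done + timedelta(days=ASSESSMENT_INTERVAL_DAYS[tier][activity])


def overdue(tier, activity, last_done, today):
    """True when the activity should already have been repeated."""
    return today > next_due(tier, activity, last_done)
```

If the list of overdue items keeps growing, that is a hint the intervals are more ambitious than your staffing allows.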

With assessment frequency determined by risk tier, what next? You’ll find adverse changes to the security posture of some vendors. The next step in the V(IT)RM program is to figure out how to deal with these issues.

Taking Action

You got an alert that there is an issue with a vendor, and you need to take action. But what actions can you take, considering the risk posed by the issue and the contractual agreement already in place? We cannot overstate the importance of defining acceptable actions contractually as part of your vendor onboarding process. A critical aspect of setting up and starting your program is ensuring your contracts with vendors support your desired actions when an issue arises.

So what can you do? This list is pretty consistent with most other security processes:

  • Alert: At minimum you’ll want a line of communication open with the vendor to tell them you found an issue. This is no different than an escalation during an incident response. You’ll need to assemble the information you found, and package it up for the vendor to give them as much information as practical. But you need to balance how much time you’re willing to spend helping the vendor against everything else on your to-do list.
  • Quarantine: As an interim measure, until you can figure out what happened and your best course of action, you could quarantine the vendor. That could mean a lot of things. You might segment their traffic from the rest of your network. Or scrutinize each transaction coming from them. Or analyze all egress traffic to ensure no intellectual property is leaking. The point is that you’ll need time to figure out the best course of action, and putting the vendor in a proverbial penalty box can buy you that time. This is also contingent on being able to put a boundary around a specific vendor or service provider, which may not be possible, depending on what services they provide.
  • Cut off: There is also the kill switch, which removes vendor access from your systems and likely ends the business relationship. This is a draconian action, but sometimes a vendor presents so much risk, and/or fails to make the changes you require, that you may not have a choice. As mentioned above, you’ll need to make sure your contract supports this action, unless you enjoy protracted litigation.

The latter two options impact the flow of business between your organization and the vendor, so you’ll need a process in place internally to determine if and when you quarantine and/or cut off a vendor. This escalation and action plan, including the rules of engagement and the criteria to suspend or end a business relationship due to IT risk, needs to be defined ahead of time. Defined escalations ensure the internal stakeholders are in the loop as you consider flipping the kill switch.
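
One way to keep that plan from living only in people’s heads is to write the escalation rules down as data. The sketch below is an assumption-laden illustration: the severity levels, their mapping to actions, and the `decide` helper are hypothetical and not part of any standard. It picks the strongest action the vendor’s contract supports and flags anything beyond an alert for business sign-off.

```python
from dataclasses import dataclass

# Actions from the list above, ordered from least to most disruptive.
ACTIONS = ("alert", "quarantine", "cut_off")

# Hypothetical mapping from issue severity to the action you would like to take.
DESIRED_ACTION = {"low": "alert", "medium": "quarantine", "high": "cut_off"}


@dataclass
class Decision:
    action: str
    needs_business_signoff: bool  # quarantine and cut-off affect the business
    downgraded: bool              # contract did not support the desired action


def decide(severity, contract_actions):
    """Choose the strongest action the contract allows for this severity."""
    desired = DESIRED_ACTION[severity]
    idx = ACTIONS.index(desired)
    # Step down until we reach an action the contract supports; we assume an
    # alert is always possible.
    while idx > 0 and ACTIONS[idx] not in contract_actions:
        idx -= 1
    action = ACTIONS[idx]
    return Decision(action, action != "alert", action != desired)
```

A `downgraded` result is worth recording: it documents that the contract, not the security team, limited the response.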

A good rule of thumb is that you don’t want to surprise anyone when a vendor goes into quarantine or is cut off from your systems. If the business decision is made to keep the vendor active in your systems (a decision made well above your pay grade), at least you’ll have documentation that the risk was accepted by the business owner.

Communicating Issues

Once the action plan is defined, documented, and agreed upon, you’ll want to build a communication plan. That includes defining when you’ll notify the vendor and when you’ll communicate the issue internally. As part of the vendor onboarding process you need to define points of contact with the vendor. Do they have a security team you should interface with? Is it their business operations group? You need to know before you run into an issue.

You’ll also want to make sure to have an internal discussion about how much you will support the vendor as they work through any issues you find. If the vendor has an immature security team and/or program, you can easily end up doing a lot of work for them. And it’s not like you have a bunch of time to do someone else’s work, right?

Of course business owners may be unsympathetic to your plight when their key vendor is cut off. That’s why organizational buy-in for criteria for quarantining or cutting a vendor off is critical. The last thing you as a security professional want is to be in a firefight with a business leader over a key vendor. Establish your criteria, and manage to them. If you are overruled and the decision is to make an exception, you can’t do much about that. But at least you will be on record that the decision goes against the policies established within your vendor risk management program.

Breach Exposure

If you have enough vendors you will run into a situation where your vendor suffers a public breach. You will need to factor that specifically into your program, because you may have a responsibility to disclose the third-party breach to your customers as well. First things first: the vendor breach shouldn’t be a surprise to you. A vendor should proactively call their customers when they are breached and customers are at risk. But this is the real world, so we cannot afford to count on them acting correctly. What then?

This comes back to your organization’s Incident Response playbook. You have that documented, right? As described in our Incident Response Fundamentals research, you need to size up the issue, build a team, assess the damage, and then move to contain it. Of course this is a bit different for vendors, because there is a lot you don’t know about their systems. And depending on the vendor’s sophistication, they might not know either.

So (as always) internal communication and keeping senior management apprised of the situation are critical. You need to stay in close contact with the vendor, constantly assess your level of exposure, and decide if and when you need to disclose – to your board, audit committee, and possibly customers.

Also, as described in our Incident Response Fundamentals research, make sure to work through a post-mortem with the vendor to make sure they learned from the experience. If you aren’t satisfied that it won’t happen again, perhaps you need to escalate to the business managers on your side, to reevaluate the relationship in light of the additional risk in doing business with them. Additionally, use this as an opportunity to refine your own process for the next time a vendor gets popped.

—Mike Rothman

Friday, April 22, 2016

Building a Vendor IT Risk Management Program: Evaluating Vendor Risk

By Mike Rothman

As we discussed in the first post in this series, whether it’s thanks to increasingly tight business processes/operations with vendors and trading partners, or to regulation (especially in finance), you can no longer ignore vendor risk management. So we delved into the structure and mapped out a few key aspects of a VRM program. Of course we are focused on the IT aspects of vendor management, which should be a significant component of a broader risk management approach for your environment.

But that raises the question of how you can actually evaluate the risks of a vendor. What should you be worried about, and how can you gather enough information to make an objective judgement of the risk posed by every vendor? That’s what we’ll explore in this post.

Risk in the Eye of the Beholder

The first aspect of evaluating vendor risk is actually defining what that risk means to your organization. Yes, that seems self-evident, but you’d be surprised how many organizations don’t document or get agreement on what presents vendor risk, and then wonder why their risk management programs never get anywhere. Sigh.

All the same, as mentioned above, vendor (IT) risk is a component of a larger enterprise risk management program. So first establish the risks of working with vendors. Those risks can be broken up into a variety of buckets, including:

  • Financial: This is about the viability of your vendors. Obviously this isn’t something you can control from an IT perspective, but if a key vendor goes belly up, that’s a bad day for your organization. So this needs to be factored in at the enterprise level, as well as considered from an IT perspective – especially as cloud services and SaaS proliferate. If your Database as a Service vendor (or any key service provider) goes away, for whatever reason, that presents risk to your organization.

  • Operational: You contract with vendors to do something for your organization. What is the risk if they cannot meet those commitments? Or if they violate service level agreements? Again it is enterprise-level risk of the organization, but it also peeks down into the IT world. Do you pack up shop and go somewhere else if your vendor’s service is down for a day? Are your applications and/or infrastructure portable enough to even do that?

  • Security: As security professionals this is our happy place. Or unhappy place, depending on how you feel about the challenges of securing much of anything nowadays. This gets to the risk of a vendor being hacked and losing your key data, impacting availability of your services, and/or allowing an adversary to access your networks and systems.

Within those buckets, there are probably a hundred different aspects that present risk to your organization. After defining those buckets of risk, you need to dig into the next level and figure out not just what presents risk, but also how to evaluate and quantify that risk. What data do you need to evaluate the financial viability of a vendor? How can you assess the operational competency of vendors? And finally, what can you do to stay on top of the security risk presented by vendors? We aren’t going to tackle financial or operational risk categories, but we’ll dig into the IT security aspects below.

Ask them

The first hoop most vendors have to jump through is self-assessment. As a vendor to a number of larger organizations, we are very familiar with the huge Excel spreadsheet or web app built to assess our security controls. Most of the questions revolve around your organization’s policies, controls, response, and remediation capabilities.

The path of least resistance for this self-assessment is usually a list of standard controls. Many organizations start with ISO 27002, COBIT, and PCI-DSS. Understand that relevance is key here. For example, if a vendor is only providing your organization with nuts and bolts, their email doesn’t present a very significant risk. So you likely want a separate self-assessment tool for each risk category, as we’ll discuss below.

It’s pretty easy to lie on a spreadsheet or web application. And vendors do exactly that. But you don’t have the resources to check everything, so there is a measure of “trust, but verify” that you need to apply here. Just remember that it’s resource-intensive to evaluate every answer, so focus on what’s important, based on the risk definitions above.

External information

Just a few years ago, if you wanted to assess the security risk of a vendor, you needed to either have an on-site visit or pay for a penetration test to really see what an attacker could do to partners. That required a lot of negotiation and coordination with the vendor, which meant it could only be used for your most critical vendors. And half the time they’d tell you to go pound sand, pointing to the extensive self-assessment you forced them to fill out.

But now, with the introduction of external threat intelligence services, and techniques you can implement yourself, you can get a sense of what kind of security mess your vendors actually are. Here are a few types of relevant data sources:

  • Botnets: Botnets are public by definition, because they use compromised devices to communicate with each other. So if a botnet is penetrated you can see who is connecting to it at the time, and get a pretty good idea of which organizations have compromised devices. That’s exactly how a number of services tell you that certain networks are compromised without ever looking at the networks in question.

  • Spam: If you have a network that is blasting out a bunch of spam, that indicates an issue. It’s straightforward to set up a number of dummy email accounts to gather spam, and see which networks are used to blast millions of messages a day. If a vendor owns one of those networks, that’s a disheartening indication of their security prowess.

  • Stolen credentials: There are a bunch of forums where stolen credentials are traded, and if a specific vendor shows up with tons of their accounts and passwords for sale, that means their security probably leaves a bit to be desired.

  • Malware distribution/infected hosts: Another indication of security failure is Internet-facing devices which are compromised and then used to either host phishing sites or distribute malware, or both. If a vendor’s Internet site is infected and distributing malware, they likely have no idea what they are doing.

  • Public Breaches: If your vendor is a public company or deals with consumers, they have to disclose breaches to their customers. Although you’ve likely gotten kind of numb to yet another breach notification, if it mentions a key vendor, that’s a concern. We’ll discuss what to do when a vendor is breached later in this series.

  • Security Best Practices: There are also other tells that a vendor knows a bit about security. Do they encrypt all traffic to/from their public sites? Do they authenticate their email with technologies like SPF or DKIM? Do they use secure DNS requests? To be clear, these aren’t conclusive indicators, but they can certainly give you a clue to how serious a vendor is about security.

So how do you gather all of this information? You can certainly do it yourself. Set up a bunch of honeypots and dedicate some internal resources to mining through the data. If you have tens of thousands of vendors and are heavily regulated, you might do exactly this. Otherwise you’ll likely rely on an external information provider to perform this analysis for you.

We covered some aspects of these services in our Ecosystem Risk Management Paper, but we’ll quickly summarize here. You need to figure out if you are looking for this vendor to provide a score and ranking of your other vendors, or whether you want the raw data on which vendors have issues (whether with botnets, malware distribution, etc.) to perform your own analysis and draw your own conclusions.

Risk Tiers

To make this kind of program feasible, without requiring another 25 bodies, let’s discuss risk tiering. Larger organizations may have thousands of vendors. It’s hard to consistently perform deep risk analysis of every vendor you do business with. But you also cannot afford only a cursory look at a few key vendors which present significant risk. So you can tier different vendors into separate risk tiers.

We’re simple folks, so we find any more than 3 or 4 tiers unwieldy. One of your first actions, after defining your IT security risk, is to nail down and build consensus on how to tier vendors by risk. Then your analyses and assessments will be based on the risk tier, and not some arbitrary decision on how deeply to look at each vendor. You could use tiers such as: critical, important, and basic. You could call them “unimportant” vendors but that might damage their self-esteem. The names don’t matter – you just need a set of tiers to group them into.

Critical vendors will get the full boat: a self-assessment, a means to externally evaluate their security posture, and possibly a site visit. You’ll scrutinize their self-assessments and have alerts triggered when something changes with them. We’ll go into what options you have to deal with vendor risk in our next post, but for now suffice it to say you’ll be all over your critical vendors, to make sure they are secure and have a plan to address any deficiencies.

Important vendors may warrant a cursory look at the self-assessment and the external evaluation. The security bar might need to be lower for these folks, because they present less risk to your organization. Basic vendors send in their self-assessment, and maybe you perform a simple vulnerability scan on their external web properties, just to check some box on an auditor’s checklist. That’s about all you’ll have time for with these folks.
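
As a rough illustration of how tiering might be written down, here is a minimal sketch. The three tier names and the assessment activities come from the text above. The criteria used to assign a tier (`business_critical`, `handles_sensitive_data`, `has_network_access`) are purely hypothetical and should be replaced by whatever your organization agreed on when it defined risk.

```python
# Assessment activities per tier, as described in the paragraphs above.
TIER_ASSESSMENTS = {
    "critical": ["scrutinized self-assessment", "external evaluation",
                 "change alerts", "possible site visit"],
    "important": ["cursory self-assessment review", "external evaluation"],
    "basic": ["self-assessment on file", "simple external vulnerability scan"],
}


def assign_tier(business_critical, handles_sensitive_data, has_network_access):
    """Assign a risk tier from three yes/no questions (hypothetical criteria)."""
    if business_critical or (handles_sensitive_data and has_network_access):
        return "critical"
    if handles_sensitive_data or has_network_access:
        return "important"
    return "basic"


def assessments_for(vendor):
    """Look up the activities for a vendor described as a dict of the criteria."""
    tier = assign_tier(vendor["business_critical"],
                       vendor["handles_sensitive_data"],
                       vendor["has_network_access"])
    return tier, TIER_ASSESSMENTS[tier]
```

The value is less in the code than in forcing the criteria to be explicit, so the depth of each review follows from the tier rather than from an ad hoc call.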

Could you have more risk tiers? Absolutely. But the amount of work increases exponentially with each additional tier. That’s why we favor only using a handful, knowing that from a risk management standpoint the best bang for your buck will be from focusing on your critical vendors.

Tracking over Time

Obviously security is highly dynamic, so what is secure today might not be tomorrow. Likewise, what was a mess a month ago may not be so bad right now. Yet most vendor risk assessments provide a single point-in-time view, representing what things looked like at that moment. Such an assessment has limited value, because every organization can have the proverbial bad day, and inevitably some data sources provide false positives.

You want to evaluate each partner over time, and track their progress. Have they shown fewer infected hosts over time? How long does it take them to remediate devices participating in botnets? Has a vendor that traditionally did security well suddenly started blasting spam and joining a bunch of botnets? Part of defining your vendor (IT) risk management program is figuring out which of the quantitative risk metrics most closely represent real risk to your organization, and need to be tracked and managed over time.
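
Here is a minimal sketch of what tracking a metric over time could look like, assuming you already collect a periodic count (say, infected hosts per month) for each vendor. The function names, the half-versus-half comparison, and the threshold `k` are our own illustrative choices, not an established scoring method.

```python
from statistics import mean, pstdev


def trend(counts):
    """Mean of the recent half minus mean of the earlier half.

    A negative value suggests the vendor is improving; positive, getting worse.
    """
    if len(counts) < 2:
        raise ValueError("need at least two observations to compute a trend")
    half = len(counts) // 2
    return mean(counts[half:]) - mean(counts[:half])


def is_sudden_change(history, latest, k=3.0):
    """Flag an observation far above the vendor's own baseline.

    Using a floor of 1.0 on the spread avoids alerting on tiny wobbles when
    the history is almost flat. Both k and the floor are assumptions to tune.
    """
    baseline = mean(history)
    spread = max(pstdev(history), 1.0)
    return latest > baseline + k * spread
```

Comparing each vendor against its own history, rather than a single point-in-time reading, also softens the effect of the occasional false positive from a data source.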

Alerting is also a key aspect of executing on the program. If a critical vendor shows up on a botnet, do you drop everything and address it with the partner? That question provides a nice segue to our next post, which will discuss ongoing monitoring and communication for your vendor (IT) risk management program.

—Mike Rothman

Friday Summary: April 21, 2016

By Adrian Lane

Adrian here.

Starting with the 2008 RSA conference, Rich and Chris Hoff presented each year on the then-current state of cloud services, and predicted where they felt cloud computing was going. This year Mike helped Rich conclude the series with some new predictions, but more importantly they went back to assess the accuracy of previous prognostications. My takeaway is that their predictions for what cloud services would do, and the value they would provide, were pretty well spot on. And in most cases, when a specific tool or product was identified as being critical, they totally missed the mark. Wildly.

Trying to follow cloud services – Azure, AWS, and GCP, to name a few – is like running toward a rapidly-expanding explosion blast radius. Capabilities grow so fast in depth and breadth that you cannot keep pace. Part of our latest change to the Friday Summary is a weekly tools discussion to help you with the best of what we see. And ask IT folks or any development team: tools are a critical part of getting the job done. Adoption rates of Docker show how critical tools are to productivity. Keep in mind that we are not making predictions here – we are keenly aware that we don’t know what tools will be used a year from now, much less three. But we do know many old-school platforms don’t work in more agile situations. That’s why it’s essential to share some of the cool stuff we see, so you are aware of what’s available, and can be more productive.

You can subscribe to only the Friday Summary.

Tool of the Week

This week’s tool is DynamoDB, a NoSQL database service native to Amazon AWS. A couple of years ago when we were architecting our own services on Amazon, we began by comparing RDS and PostgreSQL, and even considered MySQL. After some initial prototyping work we realized that a relational – or quasi-relational, for some MySQL variants – platform really did not offer us any advantages but came with some limitations, most notably around the use of JSON data elements and the need to store very small records very quickly. We settled on DynamoDB. Call it confirmation bias, but the more we used it the more we appreciated its capabilities, especially around security.

While a full discussion of DynamoDB’s security capabilities – much less a full platform review – is way beyond the scope of this post, the following are some security features we found attractive:

  • Query Filters: The ability to use filter expressions to simulate masking and label-based controls over what data is returned to the user. The Dynamo query can be the same for all users, with a filter applied to the returned values after it runs. Filters can work by comparing elements returned from the query, but there is nothing stopping you from filtering on other values associated with the user running the query. And like with relational platforms, you can build your own functions to modify data or behavior depending upon the results.
  • Fine-grained Authorization: Access policies can be used to restrict access to specific functions, and limit the impact of those functions to specific tables by user. As with query filters, you can limit results a specific user sees, or take into account things like geolocation to dynamically alter which data a user can access, using a set of access policies defined outside the application.
  • IAM: The AWS platform offers a solid set of identity and access management capabilities, and group/role based authorization as you’d expect, all of which integrate with Dynamo. But it also offers temporary credentials for delegation of roles. They also offer hooks to leverage Google and Facebook identities to control mobile access to the database. Again, we can choose to punt on some IAM responsibilities for speed of deployment, or we can choose to ratchet down access, depending on how the user authenticated.
  • JSON Support: You can both store and query JSON documents in DynamoDB. Programmers reading this will understand why this is important, as it enables you to store everything from data to code snippets, in a semi-structured way. For security or DevOps folks, consider storing different versions of initialization and configuration scripts for different AMIs, or a list of defining characteristics (e.g., known email addresses, MAC addresses, IP addresses, mobile device IDs, corporate and personal accounts, etc.) for specific users.
  • Logging: As with most NoSQL platforms, logging is integrated. You can set what to log conditionally, and even alert based on log conditions. We are just now prototyping log parsing with Lambda functions, streaming the results to S3 for cheap storage. This should be an easy way to enrich log data as well.
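
To illustrate the fine-grained authorization and federated identity points above, here is a sketch of an IAM policy document built as a plain Python dictionary. The `dynamodb:LeadingKeys`, `dynamodb:Attributes`, and `dynamodb:Select` condition keys and the `${accounts.google.com:sub}` policy variable are, to the best of our knowledge, documented AWS features. The table ARN, account ID, and attribute names are made up for the example, and this is not the policy we run ourselves.

```python
import json


def per_user_read_policy(table_arn, visible_attributes):
    """Build a policy letting a federated user read only their own items.

    Items are assumed to use the user's identity as the partition key, so the
    LeadingKeys condition limits access to rows that user owns.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": table_arn,
            "Condition": {
                "ForAllValues:StringEquals": {
                    # Partition key must equal the Google-authenticated user ID.
                    "dynamodb:LeadingKeys": ["${accounts.google.com:sub}"],
                    # Only these (hypothetical) attributes may be returned.
                    "dynamodb:Attributes": list(visible_attributes),
                },
                # Require callers to request specific attributes, so the
                # attribute restriction cannot be sidestepped.
                "StringEqualsIfExists": {"dynamodb:Select": "SPECIFIC_ATTRIBUTES"},
            },
        }],
    }


# Hypothetical table and attribute names, for illustration only.
policy = per_user_read_policy(
    "arn:aws:dynamodb:us-west-2:123456789012:table/UserProfiles",
    ["UserId", "DisplayName", "Preferences"],
)
policy_json = json.dumps(policy, indent=2)
```

Because the policy lives outside the application, tightening what a class of users can see does not require a code change.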

DynamoDB has security feature parity with relational databases. As someone who has used relational platforms for the better part of 20 years, I can say that most of the features I want are available in one form or another on the major relational platforms, or can be simulated. The real difference is the ease and speed at which I can leverage these functions.

—Adrian Lane

Tuesday, April 19, 2016

How iMessage distributes security to block “phantom devices”

By Rich

Last Friday I spent some time in a discussion with senior members of Apple’s engineering and security teams. I knew most of the technical content but they really clarified Apple’s security approach, much of which they have never explicitly stated, even on background. Most of that is fodder for my next post, but I wanted to focus on one particular technical feature I have never seen clearly documented before, which both highlights Apple’s approach to security and shows that iMessage is more secure than I thought.

It turns out you can’t add devices to an iCloud account without triggering an alert because that analysis happens on your device, and doesn’t rely (totally) on a push notification from the server. Apple put the security logic in each device, even though the system still needs a central authority. Basically, they designed the system to not trust them.

iMessage is one of the more highly-rated secure messaging systems available to consumers, at least according to the Electronic Frontier Foundation. That doesn’t mean it’s perfect or without flaws, but it is an extremely secure system, especially when you consider that its security is basically invisible to end users (who just use it like any other unencrypted text messaging system) and in active use on something like a billion devices.

I won’t dig into the deep details of iMessage (which you can read about in Apple’s iOS Security Guide), but I highly recommend you look at a recent research paper by Matthew Green and associates at Johns Hopkins University which exposed some design flaws.

Here’s a simplified overview of how iMessage security works:

  • Each device tied to your iCloud account generates its own public/private key pair, and sends the public key to an Apple directory server. The private key never leaves the device, and is protected by the device’s Data Protection encryption scheme (the one getting all the attention lately).
  • When you send an iMessage, your device checks Apple’s directory server for the public keys of all the recipients (across all their devices) based on their Apple ID (iCloud user ID) and phone number.
  • Your phone encrypts a copy of the message to each recipient device, using its public key. I currently have five or six devices tied to my iCloud account, which means if you send me a message, your phone actually creates five or six copies, each encrypted with the public key for one device.
  • For you non-security readers, a public/private keypair means that if you encrypt something with the public key, it can only be decrypted with the private key (and vice-versa). I never share my private key, so I can make my public key… very public. Then people can encrypt things which only I can read using my public key, knowing nobody else has my private key.
  • Apple’s Push Notification service (APNs) then sends each message to its destination device.
  • If you have multiple devices, you also encrypt and send copies to all your own devices, so each shows what you sent in the thread.

This is a simplification but it means:

  • Every message is encrypted from end to end.
  • Messages are encrypted using keys tied to your devices, which cannot be removed (okay, there is probably a way, especially on Macs, but not easily).
  • Messages are encrypted multiple times, for each destination device belonging to the recipients (and sender), so private keys are never shared between devices.
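
The per-device fan-out is easier to see in code. The sketch below uses toy textbook RSA with tiny primes purely to show the structure: one key pair per device, public keys in a directory, and one encrypted copy per destination device. It is not secure, and it is not the cryptography iMessage actually uses. The device names, account name, and helper functions are all invented for illustration.

```python
# Toy textbook RSA: illustration only, NOT secure, NOT Apple's actual scheme.

def make_keypair(p, q, e=17):
    """Build a (public, private) toy key pair from two small primes."""
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (needs Python 3.8+)
    return (n, e), (n, d)


def encrypt(public_key, message):
    """Encrypt each byte separately with the public key."""
    n, e = public_key
    return [pow(byte, e, n) for byte in message]


def decrypt(private_key, ciphertext):
    """Recover the original bytes with the matching private key."""
    n, d = private_key
    return bytes(pow(c, d, n) for c in ciphertext)


def fan_out(directory, recipient, message):
    """Create one encrypted copy per device registered for the recipient."""
    return {device: encrypt(public_key, message)
            for device, public_key in directory[recipient].items()}


# Each device generates its own pair; only public keys go to the directory.
phone_public, phone_private = make_keypair(61, 53)
laptop_public, laptop_private = make_keypair(67, 71)
directory = {"rich@example.com": {"phone": phone_public, "laptop": laptop_public}}

copies = fan_out(directory, "rich@example.com", b"hi")
```

Each device can read only its own copy with its own private key, which is why no private key ever needs to move between devices.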

There is actually a lot more going on, with multiple encryption and signing operations, but that’s the core. According to that Johns Hopkins paper there are exploitable weaknesses in the system (the known ones are patched), but nothing trivial, and Apple continues to harden things. Keep in mind that Apple focuses on protecting us from criminals rather than governments (despite current events). It’s just that at some point those two priorities always converge due to the nature of security.

It turns out that one obvious weakness I have seen mentioned in some blog posts and presentations isn’t actually a weakness at all, thanks to a design decision.

iMessage is a centralized system with a central directory server. If someone could compromise that server, they could add “phantom devices” to tap conversations (or completely reroute them to a new destination). To limit this Apple sends you a notification every time a device is added to your iCloud account.

I always thought Apple’s server detected a new entry and then pushed out a notification, which would mean that if they were deeply compromised (okay, forced by a government) to alter their system, the notification could be faked, but that isn’t how it works. Your device checks its own registry of keys, and pops up an alert if it sees a new one tied to your account.
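
In outline, the device-side check amounts to comparing what the directory says against what the device already knows. This is a minimal sketch of that idea only. How Apple actually stores and compares keys is not something we know, and every name below is hypothetical.

```python
def new_device_keys(known_keys, directory_keys):
    """Return keys listed by the directory that this device has not seen."""
    return set(directory_keys) - set(known_keys)


def check_account(known_keys, directory_keys, alert):
    """Alert locally for every unseen key, then remember it.

    The decision to alert is made on the device, so a compromised server
    cannot add a key and simply skip sending a notification.
    """
    unseen = new_device_keys(known_keys, directory_keys)
    for key in sorted(unseen):
        alert(f"A new device was added to your account: {key}")
    known_keys.update(unseen)
    return unseen
```

The server is still the source of the key list, but it cannot quietly add a key without every device that checks the list noticing the difference.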

According to the Johns Hopkins paper, they managed to block the push notifications on a local network which prevented checking the directory and creating the alert. That’s easy to fix, and I expect a fix in a future update (but I have no confirmation).

Once in place that will make it impossible to place a ‘tap’ using a phantom device without at least someone in the conversation receiving an alert. The way the current system works, you also cannot add a phantom recipient because your own devices keep checking for new recipients on your account.

Both those could change if Apple were, say, forced to change their fundamental architecture and code on both the server and device sides. This isn’t something criminals could do, and under current law (CALEA) the US government cannot force Apple to make this kind of change because it involves fundamental changes to the operation of the system.

That is a design decision I like. Apple could have easily decided to push the notifications from the server, and used that as the root authority for both keys and registered devices, but instead they chose to have the devices themselves detect new devices based on new key registrations (which is why the alerts pop up on everything you own when you add or re-add a device). This balances the need for a central authority (to keep the system usable) against security and privacy by putting the logic in the hardware in your pocket (or desk, or tote bag, or… whatever).

I believe FaceTime uses a similar mechanism. iCloud Keychain and Keychain Backup use a different but similar mechanism that relies as much as possible on your device. The rest of iCloud is far more open, but I also expect that to change over time.

Overall it’s a solid balance of convenience and security. Especially when you consider there are a billion Apple devices out there. iMessage doesn’t eliminate the need for true zero-knowledge messaging systems, but it is extremely secure, especially when you consider that it’s basically a transparent replacement for text messaging.


Friday, April 15, 2016

Summary April 14, 2016

By Rich

Rich here.

Mike, Adrian, and I are just back from a big planning session for what we are calling “Securosis 2.0”. Everything is lining up nicely, and now we mostly just need to get the website updated. We are fully gutting the current design and architecture, and moving everything into AWS. The prototyping is complete and next week I get to build out the deployment pipeline, because we are going with a completely immutable design.

One nice twist is that the public side is all read-only, while we have a totally different infrastructure for the admin side. Both share a common database (MariaDB on RDS) and file store (S3). We are estimating about a 10X cost savings compared to our current high-security hosting. As we get closer I’ll start sharing more implementation tips based on our experience. This is quite different from our Trinity platform, which is completely bespoke, whereas in this case we have to work with an existing content management system and wrangle it into a cloud native deployment.

If you want to subscribe directly to the Friday Summary only list, just click here.

Top Posts for the Week

Tool of the Week

Last week we set the stage with Jenkins and I hinted that this week we would start on some security-specific tools. It’s time to talk about Gauntlt, one of the best ways to integrate testing into your deployment pipeline. It is a must-have addition to any continuous deployment/delivery process.

Gauntlt allows you to hook your security tools into your pipeline for automated testing. For example you can define a simple test to find all open ports using nmap, then match those ports to the approved list for that particular application component/server. If it fails the test you can fail the build and send the details back to your issue tracker for the relevant developer or admin to fix. Attacks (tests) are written in an easy-to-parse format.
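To make the port check concrete, here is a minimal sketch of the comparison step in Python. It is not Gauntlt's own attack-file syntax. It assumes nmap has already been run with grepable output (-oG), and the approved list, function names, and sample host are all hypothetical.

```python
import re

# Hypothetical approved list for one application component.
APPROVED_PORTS = {22, 443}

def open_ports(grepable_output):
    """Collect open TCP ports from nmap grepable (-oG) output."""
    ports = set()
    for line in grepable_output.splitlines():
        if "Ports:" not in line:
            continue
        # Port entries look like "22/open/tcp//ssh///".
        for port, state, proto in re.findall(r"(\d+)/([a-z|]+)/(tcp|udp)", line):
            if state == "open" and proto == "tcp":
                ports.add(int(port))
    return ports

def unapproved_ports(grepable_output, approved=APPROVED_PORTS):
    """Return open ports missing from the approved list (non-empty means fail the build)."""
    return sorted(open_ports(grepable_output) - set(approved))

# Sample scan result for a made-up host.
sample = ("Host: 10.0.0.5 ()\tPorts: 22/open/tcp//ssh///, "
          "443/open/tcp//https///, 8080/open/tcp//http-proxy///")
print(unapproved_ports(sample))  # [8080]
```

A pipeline step could fail the build whenever the returned list is non-empty, and attach that list to the ticket it opens.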

It’s an extremely powerful way to integrate automated security testing into the development and deployment process, using the same tools and hooks as development and operations themselves.

Securosis Blog Posts this Week

We were all out this week for our planning session, so no posts.

Other Securosis News and Quotes

Training and Events


Friday, April 08, 2016

Summary: The Great Vomit Apology

By Rich

Rich here.

I started to write an apology for this week’s Summary, because I missed last week due to an unplanned stomach bug that hit at 4am Thursday, when I normally write these. It was nearly 5 days before I fully recovered. Then I realized I had fully drafted a Summary on March 11 – an abridged version due to my daughter waking up with a stomach infection. It turns out I left that one as a draft, and never even noticed… that’s what kids do to ya.

So I’m including all my post-RSA conference links here, and adding some newer content as well. We’re building up a massive backlog of content at this point, so there’s no shortage of things to write about. And if you didn’t believe in the germ theory of infection, my home is conclusive proof.

Someone emailed asking if we could cover more cloud providers than just AWS. We tend to focus on them because they are the biggest, and that’s where most of our work is, but we are actively trying to expand coverage. Email us at info@securosis.com if you have any interesting sites we should follow, or see any interesting presentations. There are a bunch of catch-up links here, but next week I plan to focus more on Microsoft and Google.

If you want to subscribe directly to the Friday Summary only list, just click here.

Top Posts for the Week

Tool of the Week

This is a new section highlighting a cloud, DevOps, or security tool we think you should take a look at. We still struggle to keep track of all the interesting tools that can help us, so if you have submissions please email them to info@securosis.com.

This week I want to focus on a tool that is one of the cornerstones of DevOps in many organizations, but with which not all security professionals are familiar. We need this as a foundation so we can start talking about some cool security extensions next week. Thus, ladies and gentlemen, today we will talk about Jenkins.

Jenkins is the most popular continuous integration tool right now. It’s Open Source with a very active community and a ton of support and plugins. For those of you without development experience, a CI server automates integrating application code changes and running tests. It can do a lot more than that, but continuously integrating changes (even from multiple teams’ contributors in massive projects) and making sure the code still works is a big deal. What makes Jenkins so special is that large community and massive plugin support. Instead of merely integrating updated code, it can detect when code is updated in a repository, pull it and integrate, automatically stand up a test environment, run thousands of tests, send alerts back on failures, or push code into further testing or production if it passes. The current version (and upcoming version 2.0) are automation servers that can handle complex workflows and pipelines for managing application updates.

This automation offers tremendous security benefits. For example there is a full audit trail of all code changes. Better yet, you can integrate security testing into your automation pipeline, far more effectively than previous ways we’ve used security testing tools. You can flag changes to security-sensitive parts of code like encryption or authentication to require a security sign-off. All this using the same tool developers use anyway, and integrated into their processes. Jenkins isn’t just for code – you can use it for server configuration, and using a tool like Packer it can create gold images and perform automatic security scans. You can even run complex vulnerability assessments on cloud/virtual infrastructure using code templates like Vagrant, CloudFormation, or Terraform.
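As a rough illustration of the sign-off idea, a pipeline step could compare the files changed in a commit against a list of sensitive paths. This sketch is plain Python rather than a Jenkins plugin or API, and the patterns and file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for security-sensitive code; adjust to your repository.
SENSITIVE_PATTERNS = ["*/auth/*", "*/crypto/*", "*password*"]

def needs_security_signoff(changed_paths, patterns=SENSITIVE_PATTERNS):
    """Return the changed files that match a security-sensitive pattern."""
    return [path for path in changed_paths
            if any(fnmatchcase(path, pattern) for pattern in patterns)]

changed = ["src/auth/login.py", "src/ui/button.js", "lib/crypto/aes.py"]
print(needs_security_signoff(changed))
```

If the list comes back non-empty, the job could pause and wait for a security reviewer before continuing.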

Next week we’ll talk about one of the coolest security testing tools that integrates with Jenkins.

Securosis Blog Posts this Week

Maximizing WAF Value

Other Securosis News and Quotes

Training and Events


Thursday, April 07, 2016

Maximizing WAF Value: Management

By Adrian Lane

As described in the last post, deploying a WAF requires knowledge of both application security and your specific application(s). Managing it requires an ongoing effort to keep the WAF current with emerging attacks and frequent application changes. Your organization likely adds new applications and changes network architectures at least a couple times a year. We see more and more organizations embracing continuous deployment for their applications. This means application functions and usage are constantly changing as well. So you need to adjust your defenses regularly to keep pace.

Test & Tune

The deployment process is about putting rules in place to protect applications. Managing the WAF involves monitoring it to figure out how well your rules are actually working, which requires spending a bunch of time examining logs to learn what is working and what’s not.

Tuning policies for both protection and performance is not easy. As we have mentioned, you need someone who understands the rule ‘grammars’ and how web protocols work. That person must also understand the applications, the types of data customers should have access to within them, what constitutes bad behavior/application misuse, and the risks the web applications pose to the business. An application security professional needs to balance security, applications, and business skills, because the WAF policies they write and manage touch all three disciplines.

The tuning process involves a lot of trial and error, figuring out which changes have unintended consequences like adding attack surface or breaking application functionality, and which are useful for protecting applications from emerging attacks. You need dual tuning efforts, one for positive rules which must be updated when new application functionality is introduced, and another for negative rules which protect applications against emerging attacks.

By the time a WAF is deployed customers should be comfortable creating whitelists for applications, having gained a decent handle on application functionality and leveraging the automated WAF learning capabilities. It’s fairly easy to observe these policies in monitor-only mode, but there is still a bit of nail-biting as new capabilities are rolled out. You’ll be waiting for users to exercise a function before you know if things really work, after which reviewing positive rules gets considerably easier.

Tuning and keeping negative security policies current still relies heavily on WAF vendor and third-party assistance. Most enterprises don’t have research groups studying emerging attack vectors every day. These knowledge gaps, regarding how attackers work and cutting-edge attack techniques, create challenges when writing specific blacklist policies. So you will depend on your vendor for as long as you use WAF, which is why we stress finding a vendor who acts as a partner and building support into your contract.

As difficult as WAF management is, there is hope on the horizon, as firms embrace continuous deployment and DevOps, and accept daily updates and reconfiguration. These security teams have no choice but to build & test WAF policies as part of their delivery processes. WAF policies must be generated in tandem with new application features, which requires security and development teams to work shoulder-to-shoulder, integrating security as part of release management. New application code goes through several layers of functional testing and WAF rules get tested as code goes into a production environment, but before exposure to the general public.

This integrated release process is called Blue-Green deployment testing. In this model both current (Blue) and new (Green) application code are run, on their own servers, in parallel in a fully functional production environment, ensuring applications run as intended in their ‘real’ environment. The new code is gated at the perimeter firewall or routers, limiting access to in-house testers. This way in-house security and application teams can verify that both the application and WAF rules function effectively and efficiently. If either fails the Green deployment is rolled back and Blue continues on. If Green works it becomes the new public production copy, and Blue is retired. It’s early days for DevOps, but this approach enables daily WAF rule tuning, with immediate feedback on iterative changes. And more importantly there are no surprises when updated code goes into production behind the WAF.
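The promote-or-roll-back decision is simple enough to sketch. The check names below are hypothetical, and a real pipeline would drive routers or load balancers rather than return a dictionary.

```python
def blue_green_decision(checks):
    """Decide which deployment serves the public after in-house testing of Green.

    checks: mapping of check name -> bool, covering both application tests
    and WAF rule tests run against the Green deployment.
    """
    if not checks:
        raise ValueError("no test results; refusing to promote Green")
    failed = sorted(name for name, passed in checks.items() if not passed)
    if failed:
        # Any application or WAF failure keeps Blue in production.
        return {"public": "blue", "action": "roll back green", "failed": failed}
    return {"public": "green", "action": "retire blue", "failed": []}

print(blue_green_decision({"app_smoke": True, "waf_blocks_sqli": True}))
print(blue_green_decision({"app_smoke": True, "waf_allows_checkout": False}))
```

Treating a WAF rule failure exactly like an application failure is the point: both gate the release.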

WAF management is an ongoing process – especially in light of the dynamic attack space blacklists address, false-positive alerts which require tuning your ruleset, and application changes driving your whitelist. Your WAF management process needs to continually learn and catalog user and application behaviors, collecting metrics as part of the process. Which metrics are meaningful, and which activities you need to monitor closely, differs between customers. The only consistency is that you cannot measure success without logs and performance metrics. Reviewing what has happened over time, and integrating that knowledge into your policies, is key to success.

Machine Learning

At this point we need to bring “machine learning” into the discussion. This topic generates confusion, so let’s first discuss what it means to us. In its simplest form, machine learning is looking at application usage metrics to predict bad behavior. These algorithms examine data including stateful user sessions, user behavior, application attack heuristics, function misuse, and high error rates. Additional data sources include geolocation, IP address, and known attacker device fingerprints (IoC). The goal is to detect subtler forms of application misuse, and catch attacks quickly and accurately. Think of it as a form of 0-day detection. You want to spot behavior you know is bad, even if you haven’t seen that kind of badness before.
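To give a feel for what "looking at usage metrics" can mean, here is a deliberately simple statistical stand-in: a z-score outlier check on per-session error rates. It is not a trained model and not how any particular WAF works. The session names, rates, and threshold are made up for illustration.

```python
from statistics import mean, pstdev

def anomalous_sessions(error_rates, threshold=2.0):
    """Flag sessions whose error rate is more than `threshold` standard
    deviations above the mean across all sessions."""
    values = list(error_rates.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # every session looks the same; nothing stands out
    return sorted(session for session, rate in error_rates.items()
                  if (rate - mu) / sigma > threshold)

# Hypothetical per-session error rates (fraction of requests returning errors).
rates = {"a": 0.01, "b": 0.02, "c": 0.01, "d": 0.02,
         "e": 0.015, "f": 0.01, "g": 0.02, "h": 0.9}
print(anomalous_sessions(rates))  # ['h']
```

Real products combine many more signals than this, but the shape is the same: establish what normal looks like, then flag what deviates from it.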

Machine learning is a useful technique. Detecting attacks as they occur is the ideal we strive for, and automation is critical because you cannot manually review all application activity to figure out what’s an attack. So you’ll need some level of automation, to both scale scarce resources and fit better into new continuous deployment models.

But it is still early days for this technology – this class of protection has a ways to go for maturity and effectiveness. We see varied success: some types of attacks are spotted, but false positive rates can be high. And we are not fans of the term “machine learning” for this functionality, because it’s too generic. We’ve seen some vendors misrepresent their old manual functions as “machine learning” in an attempt to put lipstick on a … well, you know.

These entirely predictable marketing shenanigans make it very difficult for customers to compare apples to apples when selecting a WAF, and it is doubly difficult for people without deep background in security and application development to know which capabilities will be useful. For a real comparison you need vendors to show you how these features make WAF policies easier and better in practice. If these capabilities help tune your WAF policies without a lot of work on your part, you’re getting huge value. If it’s just another way to look at data, and you still need to figure out how to use the data to tune your rules, it’s not worth it.

Security Analytics and Threat Intelligence

We have done a lot of research into the impact of threat intelligence on security operations. You can just Google “Securosis threat intelligence” for enough reading to keep you busy for weeks. We believe in TI because it enables you to benefit from the misfortune of others and learn from attacks being used against other organizations.

In a WAF context these services monitor “threat actors” across the globe, collecting events from thousands of organizations. These events are fed into large data warehouses and mined for application misuse patterns. This data also enables vendors to determine which ‘users’ are regularly scraping content or sending malicious requests, and so to identify devices, domains, and IP addresses involved in attacks. Additionally, security research teams monitor adversary activity and how threat actors are evolving their tactics, techniques, and procedures (TTPs). Patterns and TTPs are usually consistent for each threat actor, so you can look for likely threat actors’ patterns in your applications.

TI comes in two key offerings. The first is the source IP addresses of the people and bots involved in attacking web sites – perhaps even those actively attacking your site – which you can incorporate into a blacklist and block. This list changes continually as new devices are compromised and altered to probe and attack web sites, so the feed needs to be directly integrated into your WAF to block dynamically.
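Here is a minimal sketch of checking a request's source address against such a feed, using Python's standard ipaddress module. The feed format (one IP or CIDR range per line, with '#' comments) is an assumption, and the addresses come from ranges reserved for documentation.

```python
import ipaddress

def load_blocklist(feed_lines):
    """Parse a feed of IPs or CIDR ranges, skipping blanks and '#' comments."""
    networks = []
    for line in feed_lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            # strict=False accepts entries like 203.0.113.5/24 as well.
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks

def is_blocked(source_ip, networks):
    """True if the source address falls inside any listed network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in networks)

# Hypothetical feed contents.
feed = ["# example feed", "203.0.113.0/24", "198.51.100.7  # scraper bot", ""]
blocklist = load_blocklist(feed)
print(is_blocked("203.0.113.45", blocklist))  # True
print(is_blocked("192.0.2.1", blocklist))     # False
```

Because the list changes continually, you would reload it on a schedule rather than once at startup.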

The second offering provides the types of exploits and malicious requests currently being seen on other sites, with the feed either providing attack details, or (directly from your WAF vendor) a rule to detect and block a set of attacks. These services help you evolve your rules without watching the entire Internet, or needing to employ people who understand how the attacks work and how to block them.

It’s not exactly a crystal ball, but TI gives you a heads-up about attacks which may be coming your way.

Putting it all together

Once your WAF is deployed you need a strong operational process to keep it current. This requires a consistent and reliable testing process to keep pace with application changes. You can also leverage more automated techniques like machine learning (internal analysis of your own applications), and threat intelligence (external analysis of attacks on other organizations), to scale your processes and increase accuracy.

We won’t pretend WAF is a “set it and forget it” technology. Work is needed before, during, and after deployment. It requires a strong underlying process to make sure you not only get a quick win to maximize short-term value, but also derive long-term sustainable benefit from your WAF investment.

—Adrian Lane

Incite 4/6/2016—Hindsight

By Mike Rothman

When things don’t go quite as you hoped, it’s human nature to look backwards and question your decisions. If you had done something different maybe the outcome would be better. If you didn’t do the other thing, maybe you’d be in a different spot. We all do it. Some more than others. It’s almost impossible to not wonder what would have been.

But you have to be careful playing Monday Morning QB. If you wallow in a situation you end up stuck in a house of pain after a decision doesn’t go well. You probably don’t have a time machine, so whatever happened is already done. All you have left is a learning opportunity to avoid making the same mistakes again.

hindsight can be painful

That is a key concept, and I work to learn from every situation. I want to have an idea of what I would do if I found myself in a similar situation again down the line. Sometimes this post-mortem is painful – especially when the decision you made or action you took was idiotic in hindsight. And I’ve certainly done my share of idiotic things through the years. The key to leveraging hindsight is not to get caught up in it. Learn from the situation and move on. Try not to beat yourself up over and over again about what happened. This is easy to say and very hard to do. So here is how I make sure I don’t get stuck after something doesn’t exactly meet my expectations.

  1. Be Objective: You may be responsible for what happened. If you are, own it. Don’t point fingers. Understand exactly what happened and what your actions did to contribute to the eventual outcome. Also understand that some things were going to end badly regardless of what you did, so accept that as well.
  2. Speculate on what could be different: Next take some time to think about how different actions could have produced different outcomes. You can’t be absolutely sure that a different action would work out better, but you can certainly come up with a couple scenarios and determine what you want to do if you are in that situation again. It’s like a game where you can choose different paths.
  3. Understand you’ll be wrong: Understand that even if you evaluate 10 different options for a scenario, next time around there will be something you can’t anticipate. Understand that you are dealing with speculation, and that’s always dicey.
  4. Don’t judge yourself: At this point you have done what you can do. You owned your part in however the situation ended up. You figured out what you’ll do differently next time. It’s over, so let it go and move forward. You learned what you needed, and that’s all you can ask for.

That’s really the point. Fixating on what’s already happened closes off future potential. If you are always looking behind you, you can neither appreciate nor take advantage of what’s ahead. This was a hard lesson for me. I did the same stuff for years, and was confused because nothing changed. It took me a long time to figure out what needed to change, which of course turned out to be me.

But it wasn’t wasted time. I’m grateful for all my experiences, especially the challenges. I’ve had plenty of opportunities to learn, and will continue to screw things up and learn more. I know myself much better now and understand that I need to keep moving forward. So that’s what I do. Every single day.


Photo credit: “Hindsight” from The.Rohit

Security is changing. So is Securosis. Check out Rich’s post on how we are evolving our business.

We’ve published this year’s Securosis Guide to the RSA Conference. It’s our take on the key themes of this year’s conference (which is really a proxy for the industry), as well as deep dives on cloud security, threat protection, and data security. And there is a ton of meme goodness… Check out the blog post or download the guide directly (PDF).

The fine folks at the RSA Conference posted the talk Jennifer Minella and I did on mindfulness at the 2014 conference. You can check it out on YouTube. Take an hour. Your emails, alerts, and Twitter timeline will be there when you get back.

Securosis Firestarter

Have you checked out our video podcast? Rich, Adrian, and Mike get into a Google Hangout and… hang out. We talk a bit about security as well. We try to keep these to 15 minutes or less, and usually fail.

Heavy Research

We are back at work on a variety of blog series, so here is a list of the research currently underway. Remember you can get our Heavy Feed via RSS, with our content in all its unabridged glory. And you can get all our research papers too.

Resilient Cloud Network Architectures

  • [Design Patterns]
  • [Fundamentals]

Shadow Devices

Building a Vendor IT Risk Management Program

SIEM Kung Fu

Recently Published Papers

Incite 4 U

  1. Still no free lunch, even if it’s fake: Troy Hunt’s post is awesome, digging into how slimy free websites gather personal information and then sell it. But it turns out all that glitters isn’t gold, and some of those records are total crap. It’s very interesting to see how Troy pulled a number of strings to figure out which sites were responsible, and then figured out that a lot of their data is fake. Which makes sense given that no one can really check 5MM records, so they confirm a small sample and then pad with nonsense. Now you can’t even believe the fraudsters that are selling records to perpetrate more fraud. That’s totally shocking! – MR

  2. The flaw: I’ve never been a fan, but since I have neither deep experience nor data to show whether the Bi-modal IT model is good or bad, I have shied away from commenting on it. Jez Humble does not mince words, and his recent analysis of Gartner’s Bi-modal model for IT practices is no exception. He states that Gartner’s model is based on the idea that reliability and agility are at odds. I think frameworks like this were created by philosophers who grasp a specific problem, but lack the practical experience to understand why idealistic solutions don’t work in the real world. But when you examine a confounding, anti-intuitive approach like DevOps on paper, it looks reckless. In practice it has shown that speed and quality can be synergistic. Much the way Toyota demonstrated how their manufacturing approach allowed them to make cars better, faster, and cheaper; DevOps is blowing up assumptions of what’s possible with IT operations and software delivery. A few years down the line, we won’t call it DevOps – it will just be how we do things. – AL

  3. Smokin’ hot job: Based on the latest ComputerWorld Salary Survey, security folks are in the catbird seat. Information Security Manager is the hottest job in IT. No kidding. The survey states “security pros are paid well, rate job satisfaction high, and will make a move for money.” No kidding. Salaries were up 6% or so, and it’s probably more in metro areas with high demand and lots of options for practitioners. Though I can’t say job satisfaction is a highlight in the (very non-statistically significant) sample of folks I talk to regularly. It’s a tough job, and it’s worth a lot. So once again Mr. Market wins. And if anything salaries will continue to move upward because it’s not like a bunch of security personnel are about to come online anytime soon, if ever. – MR

  4. Subtle: Google’s announcement last week on encryption and email security nearly put me to sleep with their understated blog post and crappy “Safer Internet Day” tag. Let’s be honest: “Safe Browsing” has been around for years, and has allowed me to go to dangerous sites without protest. But not those sites. I don’t do that (doh). Flagging sources already using TLS is handy, but few use content encryption, so it’s hard to say that content is not being read by some government entity which subpoenaed an Internet provider somewhere. But what got my attention was the new warning about “state sponsored attacks”. When you realize the route your email takes to get to its destination, you realize this is agnostically targeting your government, regardless of where you sit. Very subtly backing the recent swell of support for user privacy and encryption. – AL

  5. WhatsApp’s one-finger salute: Whereas Google has taken a more subtle approach to telling the feds who want full access to customer data to bugger off, Facebook’s WhatsApp has basically stood on the proverbial table to give them the one-finger salute. By integrating the Signal protocol into their app and stating in no uncertain terms that all messaging traffic is encrypted from device to device, Facebook is making it clear that they cannot access user content, regardless of subpoenas. Although law enforcement can still access who spoke to whom, they cannot get anything else. We have written quite a bit about privacy and the slippery slope of backdoors and special operating systems. I applaud Facebook for taking a stand here, knowing that bad folks can (and do) use their network to plan bad things. But either there is monitoring or there isn’t, and by not providing any wiggle room, WhatsApp has clearly decided against monitoring. But you have to wonder if Facebook is forgoing a huge advertising stream by providing truly private messaging. – MR

—Mike Rothman

Wednesday, April 06, 2016

Maximizing WAF Value: Deployment

By Adrian Lane

Now we will dig into the myriad ways to deploy a Web Application Firewall (WAF), including where to position it and the pros & cons of on-premise devices versus WAF services. A key part of the deployment process is training the WAF for specific applications and setting up the initial rulesets. We will also highlight effective practices for moving from visibility (getting alerts) to control (blocking attacks). Finally we will present a Quick Wins scenario because it’s critical for any security technology to get a ‘win’ early in deployment to prove its value.

Deployment Models

The first major challenge for anyone using a WAF is getting it set up and effectively protecting applications. Your process will start with deciding where you want the WAF to work: on-premise, cloud-hosted, or a combination. On-premise means installing multiple appliances or virtual instances to balance incoming traffic and ensure they don’t degrade the user experience. With cloud services you have the option of scaling up or down with traffic as needed. We’ll go into benefits and tradeoffs of each later in this series.

Next you will need to determine how you want the WAF to work. You may choose either inline or out-of-band. Inline entails installing the WAF “in front of” a web app so all traffic from and to the app runs through it. This blocks attacks directly as they come in, and in some cases before content is returned to users. Both on-premise WAF devices and cloud WAF services provide this option. Alternatively, some vendors offer an out-of-band option to assess application traffic via a network tap or spanning port. They use indirect methods (TCP resets, network device integration, etc.) to shut down attack sessions. This approach has no side-effects on application operation, because traffic still flows directly to the app.

Obviously there are both advantages and disadvantages to having a WAF inline, and we don’t judge folks who opt for out-of-band rather than risking the application impact of inline deployment. But out-of-band enforcement can be evaded via tactics like command injection, SQL injection, and stored cross-site scripting (XSS) attacks that don’t require responses from the application. Another issue with out-of-band deployment is that attacks can make it through to applications, which puts them at risk. It’s not always a clear-cut choice, but balancing risks is why you get paid the big bucks, right?

When possible we recommend inline deployment, because this model gives you flexibility to enforce as many or as few blocking rules as you want. You need to carefully avoid blocking legitimate traffic to your applications. Out-of-band deployment offers few reliable blocking options.

Rule Creation

Once the device is deployed you need to figure out what rules you’ll run on it. Rules embody what you choose to block, and what you let pass through to applications. The creation and maintenance of these rules is where you will spend the vast majority of your time, so we will spend quite a bit of time on it. The first step in rule creation is understanding how rules are built and employed. The two major categories are negative and positive security rules: the former are geared toward blocking known attacks, and the latter toward listing acceptable actions for each application. Let’s go into why each is important.

Negative Security

“Negative Security” policies essentially block known attacks. The model works by detecting patterns of known malicious behavior, or ‘signatures’. Things like content scraping, injection attacks, XML attacks, cross-site request forgeries, suspected botnets, Tor nodes, and even blog spam, are universal application attacks that affect all sites. Most negative policies come “out of the box” from vendors’ internal teams, who research and develop signatures for customers.

Each signature explicitly describes one attack or several variants; these rules typically detect attacks such as SQL injection and buffer overflows. The downside of this method is its fragility: a signature will fail to match unrecognized variations, which will thus bypass the WAF. If you think “this sounds like traditional endpoint AV” you’re right. So signatures are only suitable when you can reliably and deterministically describe attacks, and don’t expect signatures to be immediately bypassed by simple evasion.
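The fragility is easy to demonstrate. This sketch uses one made-up regular expression as a "signature" for a classic SQL injection string. Real WAF signatures are far more elaborate, but they share the same failure mode.

```python
import re

# Hypothetical signature for one well-known SQL injection string.
SIGNATURE = re.compile(r"' OR 1=1", re.IGNORECASE)

def matches_signature(request_param):
    """True if the parameter contains the known attack pattern."""
    return bool(SIGNATURE.search(request_param))

print(matches_signature("name=' OR 1=1 --"))  # True: the known form is caught
print(matches_signature("name=' OR 2=2 --"))  # False: a trivial variant slips by
```

Both requests have the same effect on a vulnerable query, yet only the first one matches.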

WAFs usually provide a myriad of other detection options to compensate for the limitations of static signatures: heuristics, reputation scoring, detection of evasion techniques, and proprietary methods for qualitatively detecting attacks. Each method has its own strengths and weaknesses, and use cases for which it is better or worse suited. These techniques can be combined to provide a risk score for incoming requests, with flexible blocking options based on the severity of the attack or your confidence level. This is similar to the “spam cocktail” approach used by email security gateways for years. But the devil is in the details: there are thousands of attack variations, and figuring out how to apply policies to detect and stop attacks is difficult.
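One way to picture the combined risk score is a weighted sum of detector confidences, with thresholds deciding whether to block, alert, or allow. The detector names, weights, and thresholds here are assumptions for illustration, not values from any product.

```python
# Hypothetical weights for each detection method (they sum to 1.0).
WEIGHTS = {"signature": 0.5, "heuristic": 0.3, "reputation": 0.2}

def risk_score(signals):
    """Combine detector confidences (each 0.0 to 1.0) into one score.
    Detectors that did not fire are treated as 0.0."""
    return sum(weight * signals.get(name, 0.0)
               for name, weight in WEIGHTS.items())

def action_for(score, block_at=0.7, alert_at=0.4):
    """Map a risk score to a response based on your confidence thresholds."""
    if score >= block_at:
        return "block"
    if score >= alert_at:
        return "alert"
    return "allow"

request_signals = {"signature": 1.0, "heuristic": 1.0, "reputation": 0.5}
print(action_for(risk_score(request_signals)))  # block
```

Lowering block_at makes the WAF more aggressive, and raising it trades blocked attacks for fewer false positives.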

Finally there are rules you’ll need specifically to protect your web applications from a class of attacks designed to find flaws in the way application developers code, targeting gaps in how they enforce process and transaction state. These include rules to detect fraud, business logic attacks, content scraping, and data leakage, which cannot be detected using generic signatures or heuristics. Examples of these kinds of attacks include issuing order and cancellation requests in rapid succession to confuse the web server or database into revealing or altering shopping cart information, replaying legitimate transactions, and changing the order of events to attack transaction integrity.

These application-specific rules are constructed using the same analytic techniques, but rather than focusing on the structure and use of HTTP and XML grammars, a fraud detection policy examines user behavior as it relates to the type of transaction being performed. These policies require a detailed understanding of both how attacks work and how your web applications work.
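As an example of a behavior-based rule, this sketch flags sessions that issue an order and then cancel it within a couple of seconds, several times over. The event format, window, and threshold are hypothetical. A real policy would be tuned to how your application actually behaves.

```python
def rapid_order_cancel_sessions(events, window_seconds=2.0, min_pairs=3):
    """Flag sessions with repeated order-then-cancel pairs in quick succession.

    events: iterable of (session_id, timestamp_seconds, action) tuples,
    assumed to be sorted by timestamp.
    """
    last_order = {}   # session -> time of the most recent unmatched order
    pairs = {}        # session -> count of rapid order/cancel pairs
    for session, ts, action in events:
        if action == "order":
            last_order[session] = ts
        elif action == "cancel" and session in last_order:
            if ts - last_order.pop(session) <= window_seconds:
                pairs[session] = pairs.get(session, 0) + 1
    return sorted(s for s, count in pairs.items() if count >= min_pairs)

# Hypothetical event log: s1 hammers order/cancel, s2 cancels once after 30 seconds.
events = [
    ("s1", 0.0, "order"), ("s2", 0.0, "order"), ("s1", 0.5, "cancel"),
    ("s1", 1.0, "order"), ("s1", 1.4, "cancel"),
    ("s1", 2.0, "order"), ("s1", 2.3, "cancel"),
    ("s2", 30.0, "cancel"),
]
print(rapid_order_cancel_sessions(events))  # ['s1']
```

Nothing in any single request looks malicious here, which is why generic signatures miss this class of attack.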

Positive Security

The other side of this coin is the positive security model, called ‘whitelisting.’ Positive security only allows known and authorized web requests, and blocks all others. Old-school network security professionals recall the term “default deny”. This is the web application analogue. It works by observing and cataloging legitimate application traffic, establishing ‘good’ requests as a baseline for acceptable usage, and blocking everything else. You’ll need to ensure you do not include any attacks in your ‘clean’ baseline, and to set up policies to block anything not on your list of valid behaviors.
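A default-deny check can be sketched in a few lines. The allowlist below is hypothetical, and a real positive policy would also validate parameter values, lengths, and types rather than just names.

```python
# Hypothetical allowlist: (method, path) -> parameter names that may appear.
ALLOWED = {
    ("GET", "/products"): {"page", "sort"},
    ("POST", "/cart"): {"item_id", "qty"},
}

def is_allowed(method, path, params):
    """Allow only known requests with known parameters; deny everything else."""
    allowed_params = ALLOWED.get((method, path))
    if allowed_params is None:
        return False  # default deny: the request is not on the list
    return set(params) <= allowed_params

print(is_allowed("GET", "/products", ["page"]))           # True
print(is_allowed("GET", "/products", ["page", "debug"]))  # False
print(is_allowed("DELETE", "/products", []))              # False
```

Note that the policy never has to know what an attack looks like, only what the application is supposed to accept.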

The good news is that this approach is very effective at catching malicious requests you have never seen before (0-day attacks) without having to explicitly code signatures for each potential attack. You understand the folly of trying to manage a rule set to detect every possible attack. This is also an excellent way to pare down the universe of all threats described above into a smaller and more manageable subset of attacks to include in a blacklist. For example positive policies can restrict HTTP requests to known valid actions, so negative rules only need to cover attacks against those actions.

The bad news is that applications are dynamic and change regularly, so unless you update your whitelist with every application update, your WAF will effectively disable new application features or crash applications. Yet for those willing to do the work positive security is a huge win. Understand that this approach is becoming more complicated as continuous deployment, DevOps, and code trickery such as ‘feature tagging’ all ratchet up the cadence of WAF rule updates. Some organizations have moved testing of WAF rules inside their development pipelines to ensure WAF doesn’t break new functionality.
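
One way a pipeline check like this could work, sketched in Python: compare the routes a new release exposes against the current whitelist and fail the build if any would be blocked. The function name and the (method, path) representation are assumptions for the example, not a description of any vendor's tooling.

```python
def routes_blocked_by_whitelist(release_routes, whitelist):
    """Return routes in the new release that the current whitelist would block."""
    return sorted(set(release_routes) - set(whitelist))
```

A build step could call this with the release's route table and stop the deployment, or open a ticket for a whitelist update, whenever the result is not empty.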

You will use both positive and negative approaches in tandem because neither approach alone can adequately protect applications.

Establishing Rules

Once the WAF is installed it’s time to get basic rules in place. You will start with the WAF in monitor-only mode (also known as alert mode) until your rules are set up and vetted. This involves three steps:

  1. Detect attacks: First turn on any built-in (negative) rules to detect known bad behavior. They should be part of the vendor’s basic bundle.
  2. Learning mode: Next the WAF automatically learns web traffic to help build policies, saving you time and effort. Most WAF platforms include this as basic functionality.
  3. Tuning: You will need to operate in this mode from a few days to weeks, depending upon applications and traffic levels, to generate a decent known-good baseline.

Let’s dig into learning mode. You start with discovery. By looking at traffic logs to see what the pre-packaged rules would have blocked if they were enabled, you learn what your applications actually do. This provides you a proverbial smorgasbord of application activities, from which you identify what needs to be secured, and which positive rules are appropriate to permit. This initial discovery process is essential to ensure your initial ruleset covers what each application really does, not just what you think it does.
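
The discovery step amounts to summarizing what monitor mode recorded. The Python sketch below assumes each log event is a dictionary with `action`, `rule_id`, and `path` keys, and that monitor-mode hits are labeled `would_block`. Those field names are hypothetical; actual WAF log formats vary by vendor.

```python
from collections import Counter

def summarize_would_block(events):
    """Count monitor-mode hits per (rule_id, path), most frequent first."""
    counts = Counter()
    for event in events:
        if event.get("action") == "would_block":
            counts[(event["rule_id"], event["path"])] += 1
    return counts.most_common()
```

A rule that fires constantly against one legitimate page is a likely false positive and a candidate for tuning before blocking is ever switched on.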

After this initial learning process you are ready to go through your first round of tuning, to get rid of some false positives and (if you were testing using actual attacks) false negatives. You will need to tweak WAF rules, and possibly add new rules, to ensure your security policies comprehensively protect your applications. Once you understand your application’s functions and have a good idea of what attacks to expect, you need to determine how you will counter them. For example if your application has a known defect which provides a security vulnerability you cannot address in a timely fashion through code changes, WAF provides several options:

  • Block the request: You can remove the offending function from your whitelist, stopping the threat if your deployment supports blocking. The downside is that removing an app function or service this way may break part of the application.
  • Create a signature: You can write a specific signature to detect attacks on that defect so you can detect attempts to exploit it. This will stop known attacks, but you must account for all possible variations and evasion techniques.
  • Use heuristics: You can use one or more heuristics as clues to abnormal application use or an attack on a vulnerability. Heuristics include malformed requests, odd customer geolocations, customer IP reputation, use of known weak or insecure application areas, requests for sensitive data, and various other attack indicators.
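
To illustrate why a signature must account for variations and evasion, here is a minimal Python sketch for a hypothetical defect: a parameter that is vulnerable to path traversal. The pattern and function name are invented for the example, and a production signature would need to cover many more encodings than this one does.

```python
import re
from urllib.parse import unquote

# Matches "../" or "..\" once the value has been decoded.
TRAVERSAL = re.compile(r"\.\.[/\\]")

def matches_traversal_signature(raw_value, max_decodes=3):
    """Return True if the value contains a traversal sequence, even if URL-encoded."""
    value = raw_value
    # Decode repeatedly so double-encoded payloads such as %252e%252e%252f are caught.
    for _ in range(max_decodes):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return bool(TRAVERSAL.search(value))
```

Matching only the literal string `../` would miss the encoded forms entirely, which is the kind of gap attackers look for.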

With your traffic baseline you can enable whitelisting to provide positive security. With your rules tuned and whitelisting enabled – and confidence in your results so far – it’s time to enable blocking. You will need to be present and alert when you do this, because you are certain to miss something, so expect another small round of tuning here. We will explore day-to-day WAF management in our next post.

Quick Wins

Deploying a WAF for the first time is a very difficult job, more art than science. It’s also high profile because the affected applications are usually high-profile; so you will face scrutiny from the CISO, developers, and the IT team. Improved security is much harder to demonstrate or observe than a broken application or bad performance, which are visible to everyone including customers. We have a few tips to get you up and running quickly and safely, so you can demonstrate positive momentum to your boss.

  • Start with learning mode: As mentioned above, learning mode can dramatically accelerate building your initial ruleset. It used to be very crude, but over the years this capability has matured. In some cases we have seen valid whitelists created in under 24 hours.
  • Leverage threat intel: It’s a big world out there, and odds are the attack you saw last week hit someone else before that. Global threat intelligence, threat feeds, IP reputation services, and the like can all help you identify attacks – even when you haven’t seen them before. And in many cases threat intelligence can be automatically integrated into your existing ruleset, to improve protection with minimal effort. We have seen great success with threat intel because it’s easy to employ and quickly blocks the basic DDoS, bots, and malware.
  • Feed your other security systems: WAFs output event data in several formats, including syslog – which can be ingested by just about any SIEM, log management tool, or analytics repository. These feeds provide another source of security data to supplement your security monitoring efforts, and help compliance teams substantiate the controls in place to meet compliance mandates.
  • Use vendor services where appropriate: Given the lack of sufficient security talent most firms have trouble finding people to manage their WAF. Most WAF admins, like other security folks, have other responsibilities, but WAF requires considerable specialization to operate effectively. We recommend leaning on your WAF vendor for services early in the process, especially during Proof of Concept (PoC) testing before purchase. They are far more familiar with their products than you are, and can help steer you past common pitfalls. Some even offer monitoring services to watch your WAF in action, especially those which offer cloud-based WAF services. They essentially help you tune your protection by detecting false positives and negatives. You should try to bundle in additional services when negotiating fees or purchasing, as expert help can go a long way toward getting your WAF up and useful quickly.

Our next post will talk about daily WAF operation. We’ll discuss rule management, being as agile as application teams, and using threat feeds effectively; we’ll also offer some perspective on machine learning and advanced WAF functions.

—Adrian Lane

Thursday, March 31, 2016

Maximizing Value From Your WAF [New Series]

By Adrian Lane

Web Application Firewalls (WAFs) have been in production use for well over a decade, growing from point solutions primarily blocking SQL injection into mature application security tools. In most mature security product categories, such as anti-virus, there hasn’t been much to talk about, aside from complaining that not much has changed over the last decade. WAFs are different: they have continued to evolve in response to new threats, new deployment models, and a more demanding clientele’s need to solve more complicated problems. From SQL injection to cross-site scripting (XSS), from PCI compliance to DDoS protection, and from cross-site request forgeries (CSRF) to 0-day protection, WAFs have continued to add capabilities to address emerging use cases. But WAF’s greatest evolution has taken place in areas undergoing heavy disruption, notably cloud computing and threat analytics.

WAFs are back at the top of our research agenda, because users continue to struggle with managing WAF platforms as threats continue to evolve. The first challenge has been that attacks targeting the application layer require more than simple analysis of individual HTTP requests – they demand systemic analysis of the entire web application session. Detecting typical modern attack vectors, including automated bots, content scraping, fraud, and other types of misuse, requires more information and deeper analysis. Second, as the larger IT industry flails to find security talent to manage WAFs, customers struggle to keep existing devices up and running; they have no choice but to emphasize ease of set-up and management during product selection.

So we are updating our WAF research. This brief series will discuss the continuing need for Web Application Firewall technologies, and address the ongoing struggles of organizations to run WAFs. We will also focus on decreasing the time to value for WAF, by updating our recommendations for standing up a WAF for the first time, discussing what it takes to get a basic set of policies up and running, and covering the new capabilities and challenges customers face.

WAF’s Continued Popularity

The reason WAF emerged in the first place, and still one of the most common reasons customers use it, is that no other product really provides protection at the application layer. Cross-site scripting, request forgeries, SQL injection, and many common attacks which specifically target application stacks tend to go undetected. Intrusion Detection Systems (IDS) and general-purpose network firewalls are poorly suited to protecting the application layer, and remain largely ineffective for that use case. In order to detect application misuse and fraud, a security solution must understand the dialogue between application and end user. WAFs were designed for this need, to understand application protocols so they can identify applications under attack. For most organizations, WAF is still the only way to get some measure of protection for applications.

For many years sales of WAFs were driven by compliance, specifically a mandate from the Payment Card Industry’s Data Security Standard (PCI-DSS). This standard gave firms the option to either build security into their application (very hard), or protect them with WAF (easier). The validation requirements for WAF deployments are far less rigorous than for secure code development, so most companies opted for WAF. Shocking! You basically plug one in and get a certification. WAF has long been the fastest and most cost-effective way to satisfy Requirement 6 of the PCI-DSS standard, but it turns out there is long-term value as well. Users now realize that leveraging a WAF is both faster and cheaper than fixing bug-ridden legacy applications. The need has morphed from “get compliant fast!” to “secure legacy apps for less!”

WAF Limitations

The value of WAF is muted by difficulties in deployment and ongoing platform management. A tool cannot provide sustainable value if it cannot be effectively deployed and managed. The last thing organizations need is yet another piece of software sitting on a shelf. Or even worse an out-of-date WAF providing a false sense of security.

Our research highlighted the following issues which contribute to insecure WAF implementations, allowing penetration testers and attackers to easily evade WAF and target applications directly.

  • Ineffective Policies: Most firms complain about maintaining WAF policies. Some complaints are about policies falling behind new application features, and policies which fail to keep pace with emerging threats. Equally troubling is a lack of information on which policies are effective, so security professionals are flying blind. Better metrics and analytics are needed to tell users what’s working and how to improve.
  • Breaking Apps: Security policies – the rules that determine what a WAF blocks and what passes through to applications – can and do sometimes block legitimate traffic. Web application developers are incentivized to push new code as often as possible. Code changes and new functionality often violate existing policies, so unless someone updates the whitelist of approved application requests for every application change, a WAF will block legitimate requests. Predictably, this pisses off customers and operational folks alike. Firms trying to “ratchet up” security by tightening policies may also break applications, or generate too many false positives for the SoC to handle, leading to legitimate attacks going ignored and unaddressed in a flood of irrelevant alerts.
  • Skills Gap: As we all know, application security is non-trivial. The skills to understand spoofing, fraud, non-repudiation, denial of service attacks, and application misuse within the context of an application are rarely all found in any one individual. But they are all needed to be an effective WAF administrator. Many firms – especially those in retail – complain that “we are not in the business of security” – they want to outsource WAF management to someone with the necessary skills. Still others find their WAF in purgatory after their administrator is offered more money, leaving behind no-one who understands the policies. But outsourcing is no panacea – even a third-party service provider needs the configuration to be somewhat stable and burned-in before they can accept managerial responsibility. Without in-house talent for configuration you are hiring professional services teams to get up and running, and then scrambling to find budget for this unplanned expense.
  • Cloud Deployments: Your on-premise applications are covered by WAFs and you’re reasonably happy with your team’s technical proficiency, but as your company moves applications into public cloud infrastructure (and they are, whether you know about them or not), you may be unable to migrate your WAF and associated policies to this new and fundamentally different architecture. Your WAF vendor may not provide a suitable substitute for the appliance you run on-premise, or it might not offer the APIs you need to orchestrate WAF deployment in your dynamic cloud environment, where applications and servers come and go much faster than ever before.

Going It Alone

If all those problems sound like a nightmare, don’t worry – you’ll still be able to sleep tonight. The nightmare scenario is actually not using some type of WAF or filter to protect your applications, and instead leaving them wide open to attack. Sure, a few of you have been fortunate enough to build your applications from scratch with security in mind, which is awesome. But those unicorns are rare – the rest of us are relegated to fixing vulnerabilities in existing applications… you know, the ones designed and coded without any input from security.

In most software stacks current and future defects are discovered in the underlying platform, third-party software, and in-house applications. That’s just software. Of course attackers know this, so they probe applications for new vulnerabilities. Legacy platforms will take years, if not decades, to meet modern security requirements. And the cost of these efforts is inevitably many times greater than the original investment. In many cases it is simply not economically feasible to fix the application, so some other approach must be used – which is where WAF offers tremendous value.

Our research shows that WAF failures far more often result from operational failure than from fundamental product flaws. Make no mistake – WAFs are not a silver bullet – but a correctly deployed WAF makes it much harder to successfully attack an application, and for an attacker to avoid detection. The effectiveness of WAFs is directly related to the quality of people and processes used to maintain them. The most serious problems with WAF are more issues with management and operational processes than technology. We need a pragmatic process to manage Web Application Firewalls to overcome the issues which plague the technology.

Our next post will offer advice on how to deploy WAFs and look at some recent feature enhancements which address the ease of use and effectiveness issues mentioned above. We will also highlight some Quick Wins for new setups, because any new security technology needs an early win to prove the value of an approach and the effectiveness of a technology.

—Adrian Lane

Wednesday, March 30, 2016

Incite 3/30/2016: Rational People Disagree

By Mike Rothman

It’s definitely a presidential election year here in the US. My Twitter and Facebook feeds are overwhelmed with links about what this politician said and who that one offended. We get to learn how a 70-year old politician got arrested in his 20s and why that matters now. You also get to understand that there are a lot of different perspectives, many of which make absolutely no sense to you. Confirmation bias kicks into high gear, because when you see something you don’t agree with, you instinctively ignore it, or have a million reasons why it is dead wrong. I know mine does.

Some of my friends frequently share news about their chosen candidates, and even more link to critical information about the enemy. I’m not sure whether they do this to make themselves feel good, to commiserate with people who think just like them, or in an effort to influence folks who don’t. I have to say this can be remarkably irritating because nothing any of these people posts is going to sway my fundamental beliefs.


That got me thinking about one of my rules for dealing with people. I don’t talk about religion or politics. Unless I’m asked. And depending on the person I might not engage even if asked. Simply because nothing I say is going to change someone’s story regarding either of those two third rails of friendship. I will admit to scratching my head at some of the stuff people I know post to social media. I wonder if they really believe that stuff, or they are just trolling everyone.

But at the end of the day, everyone is entitled to their opinion, and it’s not my place to tell them their opinion is idiotic. Even if it is. I try very hard not to judge people based on their stories and beliefs. They have different experiences and priorities than me, and that results in different viewpoints. But not judging gets pretty hard between March and November every 4 years. At least 4 or 5 times a day I click the unfollow link when something particularly offensive (to me) shows up in my feed.

But I don’t hit the button to actually unfollow someone. I use the fact that I was triggered by someone as an opportunity to pause and reflect on why that specific headline, post, link, or opinion bothers me so much. Most of the time it’s just exhaustion. If I see one more thing about a huge fence or bringing manufacturing jobs back to the US, I’m going to scream. I get these are real issues which warrant discussion. But in a world with a 24/7 media cycle, the discussion never ends.

I’m not close-minded, although it may seem that way. I’m certainly open to listening to other candidates’ views, mostly to understand the other side of the discussion and continually refine and confirm my own positions. But I have some fundamental beliefs that will not change. And no, I’m not going to share them here (that third rail again!). I know that rational people can disagree, and that doesn’t mean I don’t respect them, or that I don’t want to work together or hang out and drink beer. It just means I don’t want to talk about religion or politics.


Photo credit: “Laugh-Out-Loud Cats #2204” from Ape Lad

Security is changing. So is Securosis. Check out Rich’s post on how we are evolving our business.

We’ve published this year’s Securosis Guide to the RSA Conference. It’s our take on the key themes of this year’s conference (which is really a proxy for the industry), as well as deep dives on cloud security, threat protection, and data security. And there is a ton of meme goodness… Check out the blog post or download the guide directly (PDF).

The fine folks at the RSA Conference posted the talk Jennifer Minella and I did on mindfulness at the 2014 conference. You can check it out on YouTube. Take an hour. Your emails, alerts, and Twitter timeline will be there when you get back.

Securosis Firestarter

Have you checked out our video podcast? Rich, Adrian, and Mike get into a Google Hangout and… hang out. We talk a bit about security as well. We try to keep these to 15 minutes or less, and usually fail.

Heavy Research

We are back at work on a variety of blog series, so here is a list of the research currently underway. Remember you can get our Heavy Feed via RSS, with our content in all its unabridged glory. And you can get all our research papers too.

Resilient Cloud Network Architectures

Shadow Devices

Building a Vendor IT Risk Management Program

Securing Hadoop

SIEM Kung Fu

Recently Published Papers

Incite 4 U

  1. That depends on your definition of consolidation: Stiennon busts out his trusty spreadsheet of security companies and concludes that the IT security industry is not consolidating. He has numbers. Numbers! That prove there is a steadily increasing number of companies ‘selling’ security. I guess that is one way to look at it. I think it’s a pretty myopic way of assessing an industry, but hey, what do I know? Here’s a fact. Of the 1,400+ companies on Stiennon’s list, how many are actually selling anything (above $10MM in sales, or even $1MM for that matter?)? How many will be around in 18 months? 12 months? Does it even matter how many companies are in the space? I say no, because here’s what I hear all the time. Security pros want to deal with fewer vendors. Period. It turns out that Mr. Market is not wrong over the long term. There will be a shake-out in the industry, and it will begin soon. Maybe the total number of companies will continue to increase. Evidently there is an infinite number of crappy, undifferentiated ideas for security companies. The real question is what happens to share of wallet, and I’m confident that buyers will be consolidating their spending. – MR

  2. Which way do I go? When it comes to threat analytics, there are many, many services out there. For firms wishing to mine their own data, there are lots of technologies to parse dissimilar data types, and many platforms to do “big data analytics”. But at Dark Reading Kelly Jackson Higgins points out that Threat Intelligence has a Big Data Problem as firms are getting too much of a good thing. Enterprise security’s problem is not a lack of data, analytics tools, or threat feeds. Nor is it a question of whether in-house SoC is better than external services. It’s not even about false negatives or even false positives per se – rather the key questions are “Which analyzed data should I pay attention to?” and “How do I really prioritize?” Having 11 threat feeds is at best information overload, and third party ‘criticality’ rankings are often out of touch with the real risks a company faces. It’s classic analysis paralysis, as firms figure out which reports are meaningful, map them to risks, and then figure out what they can address in… let’s call it an “enterprise time frame”: the coming year. Having the analysis is the first step, but making it useful is still in the works. We’ll get back to you on dealing with the problems. – AL

  3. Sometimes it is better to be lucky than good… And that is the story of my career, and most of the folks I know. I really liked this coverage on the RSA Conference blog of the Centrify CEO’s talk on how his company was targeted by financial fraud and it almost worked. You have heard this story a hundred times. Email comes from the CEO to accounting to initiate a funds transfer. The funds are moved, and then it becomes apparent that the request was fraudulent. Thankfully Centrify had a policy that required multiple approvals on substantial wire transfers, otherwise the money would be gone. Kudos for sharing. And this is a good reminder that separation of duties and multiple approvals are good things. – MR

  4. Big data, small adoption: O’Reilly published Spiderbook’s research on the size of The Big Data Market, conducted by data mining public press releases, forums, job postings, and the like. And there is some interesting data in the survey: results suggest that only a small percentage of companies in the world use Hadoop, and many that do struggle due to a shortage of technical talent. Does anyone not lack technical talent today? Spiderbook went on to say that outside financial services adoption is low, but they found that large firms are much more likely to embrace big data than small ones. To which I say “Duh!” – large enterprises with huge amounts of interesting data went looking for something that could provide analytics at scale without costing millions of dollars. Our research shows large enterprises have a better than 50% Hadoop adoption rate, but it is only being used for select new projects. I maintain Spiderbook’s adoption numbers are likely on the low side; both because organizations are not particularly open about their use of Hadoop given the ‘skunkworks’ nature of many projects, and because attendance at industry trade shows for the major commercial Hadoop vendors suggests more companies are actively running it – or too much idiotic VC money being burned on campfires. Which is our way of saying you probably have Hadoop in your company, and if you’re in IT security, you’ll be dealing with a cluster full of sensitive data sooner rather than later. Get ready. – AL

  5. Is anything good or bad? I generally refute the premise of this recent NetworkWorld article: Is DevOps good or bad for security? The reality is that some aspects of DevOps (just like anything else) are ‘good’ while others are ‘bad.’ Getting poked in the eye is ‘bad’, right? Unless it’s an optometrist removing a cataract. Then it’s good. DevOps clearly puts pressure on security teams. That may be bad because there is a distinct lack of skills. But it’s good because it forces security teams to embrace automation, and insert security into development and deployment processes. That article is basically a lovefest, talking about the benefits of DevOps, but if you do it wrong it’s a security train wreck. As with most things, there are no absolutes. And we believe in the long-term value of DevOps (we’re betting the company on it), but we aren’t naive – we know that there are challenges. – MR

  6. BONUS: Generalissimo Franco is still dead! This breaking news just in: Oracle stuns the industry by moving ‘The Cloud’, big data and DevOps – all of it – into a single machine. News at 11. – AL

—Mike Rothman

Tuesday, March 29, 2016

Resilient Cloud Network Architectures: Design Patterns

By Mike Rothman

We introduced resilient cloud networks in this series’ first post. We define them as networks using cloud-specific features to provide both stronger security and higher availability for your applications. This post will dig into two different design patterns, and show how cloud networking enables higher resilience.

Network Segregation by Default

Before we dive into design patterns let’s make sure we are all clear on using network segmentation to improve your security posture, as discussed in our first post. We know segmentation isn’t novel, but it is still difficult in a traditional data center. Infrastructure running different applications gets intermingled, just to efficiently use existing hardware. Even in a totally virtualized data center, segmentation requires significant overhead and management to keep all applications logically isolated – which makes it rare.

What is the downside of not segmenting properly? It offers adversaries a clear path to your most important stuff. They can compromise one application and then move deeper into your environment, accessing resources not associated with the application stack they first compromised. So if they bust one application, there is a high likelihood they’ll end up with free rein over everything in the data center.

The cloud is different. Each server in a cloud environment is associated with a security group, which defines with very fine granularity which other devices it can communicate with, and over what protocols. This effectively enables you to contain an adversary’s ability to move within your environment, even after compromising a server or application. This concept is often called limiting blast radius. So if one part of your cloud environment goes boom, the rest of your infrastructure is unaffected.

This is a key concept in cloud network architecture, highlighted in the design patterns below.
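
Blast radius can be reasoned about as reachability over your security group rules. The Python sketch below is a toy model, assuming each group is just a name mapped to the set of groups it is allowed to connect to; it ignores ports and protocols, and the group names are made up.

```python
from collections import deque

def reachable_groups(allow_rules, start):
    """Return every group an attacker could reach after compromising `start`.

    allow_rules maps a source group to the set of groups it may connect to.
    """
    seen = {start}
    pending = deque([start])
    while pending:
        group = pending.popleft()
        for dest in allow_rules.get(group, set()):
            if dest not in seen:
                seen.add(dest)
                pending.append(dest)
    seen.discard(start)  # the starting point itself is not part of the blast radius
    return seen
```

In a flat network every group can reach every other, so the result is everything. With tight rules the set shrinks, in the best case to nothing.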

PaaS Air Gap

To demonstrate a more secure cloud network architecture, consider an Internet-facing application with both web server and application server tiers. Due to the nature of the application, communications between the two layers are through message queues and notifications, so the web servers and app servers don’t need to communicate directly with each other. The application server tier connects to the database (a Platform as a Service offering from the cloud provider). The application server tier also communicates with a traditional data center to access other internal corporate data outside the cloud environment.

An application must be architected from the get-go to support this design. You aren’t going to redeploy your 20-year-old legacy general ledger application to this design. But if you are architecting a new application, or can rearchitect existing applications, and want total isolation between environments, this is one way to do it. Let’s describe the design.

[Figure: PaaS Air Gap]

Network Security Groups

The key security control typically used in this architecture is a Network Security Group, allowing access to the app servers only from the web servers, and only to the specific port and protocol required. This isolation limits blast radius. To be clear, the NSG is applied individually to each instance – not to subnets. This avoids a flat network, where all instances within a subnet have unrestricted access to all subnet peers.

PaaS Services

In this application you wouldn’t open access from the web server NSG to the app server NSG, because the architecture doesn’t require direct communication between web servers and app servers. Instead the cloud provider offers a message queue platform and notification service which provide asynchronous communication between the web and application tiers. So even if the web servers are compromised, the app servers are not accessible.
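
The pattern can be sketched with Python’s standard `queue` module standing in for the provider’s managed queue. This is only an in-process analogy with hypothetical function names; a real deployment would use the cloud provider’s queue and notification services, and the two tiers would run on separate instances.

```python
import queue

def web_tier_submit(work_queue, order):
    """Web tier: hand the request to the queue and return immediately."""
    work_queue.put(order)

def app_tier_drain(work_queue):
    """App tier: pull whatever is waiting and process it on its own schedule."""
    processed = []
    while True:
        try:
            item = work_queue.get_nowait()
        except queue.Empty:
            break
        processed.append({"order": item, "status": "processed"})
    return processed
```

The web tier only ever holds a reference to the queue, never a network path to an app server, which is what makes the air gap possible.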

Further isolation is offered by a PaaS database, also offered by the cloud service provider. You can restrict requests to the PaaS DB to specific Network Security Groups. This ensures only the right instances can request information from the database service, and all requests are authorized.

Connection to the Data Center

The application may require data from the data center, so the app servers have access to the needed data through a VPN. You route all traffic to the data center through this inspection and control point. Typically it’s better not to route cloud traffic through inspection bottlenecks, but in this design pattern it’s not a big deal, because the traffic needs to pass over a specific egress connection to the data center, so you might as well inspect there as well.

You ensure ingress traffic over that connection can only go to the app server security group. This ensures that an adversary who compromises your network cannot access your whole cloud network by bouncing through your data center.

Advantages of This Design

  • Isolation between Web and App Servers: By putting the auto-scaling groups in Network Security Groups, you restrict their access to everything.

  • No Direct Connection: In this design pattern you can block direct traffic to the application servers from anywhere but the VPN. Intra-application traffic is asynchronous via the message queue and notification service, so isolation is complete.

  • PaaS Service: This architecture uses cloud provider services, with strong built-in security and resilience. Cloud providers understand that security and availability are core to their business.

What’s next for this kind of architecture? To advance this architecture you could deploy mirrors of the application in different zones within a region to limit the blast radius in case one device is compromised, and to provide additional resiliency in case of a zone failure.

Additionally, if you use immutable servers within each auto-scale group, you can update/patch/reconfigure instances automatically by changing the master image and having auto-scaling replace the old instances with new ones. This limits configuration drift and adversary persistence.
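The replace-rather-than-patch idea can be sketched as follows. This is a toy model, not an auto-scaling API: `Instance`, `launch`, `roll_group`, and the image names `image-v1` and `image-v2` are all hypothetical. It shows the one rule that matters, which is that an instance built from an old image is never modified, only replaced by a fresh instance built from the current master image.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # simple source of unique instance IDs for the sketch


@dataclass
class Instance:
    instance_id: int
    image: str


def launch(image: str) -> Instance:
    """Start a brand-new instance from the given image."""
    return Instance(next(_ids), image)


def roll_group(group: list, master_image: str) -> list:
    """Replace every instance not built from master_image; keep the rest untouched."""
    return [inst if inst.image == master_image else launch(master_image) for inst in group]


group = [launch("image-v1") for _ in range(3)]
group = roll_group(group, "image-v2")
print([inst.image for inst in group])
print([inst.instance_id for inst in group])
```

Running `roll_group` again with the same image changes nothing, so the operation is safe to repeat. Any foothold an attacker gained on an old instance disappears along with that instance.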

Multi-Region Web Site

This architecture was designed to deploy a website in multiple regions, with availability as close to 100% as possible. This design is far more expensive than even running in multiple zones within a single region, because you need to pay for network traffic between regions (compared to free intra-region traffic); but if uptime is essential to your business, this architecture improves resiliency.

Availability Architecture

This is an externally facing application so you run traffic through a cloud WAF to get rid of obvious attack traffic. Inbound sessions can be routed intelligently to either region via the DNS service. So you can route based on server utilization, network traffic, geography, or a variety of other criteria. This is a great example of software-defined security, programming traffic distribution within cloud stacks.

  • Network Security Groups: In this design pattern you implement Network Security Groups to lock down traffic into the app servers. That isn’t shown specifically because it would greatly complicate the diagram. But Network Security Groups for the web and app servers should be part of this architecture.
  • Compute Layer: Application and web servers are in auto-scale groups within each region. The load balancer distributes the sessions between the regions intelligently as well, to ensure the most effective usage of the web site.
  • Database Layer: If this was just a multi-zone deployment you wouldn’t need to worry about database replication, as that capability is built into PaaS databases. But that is not the case here. You are operating in multiple regions, so you need to replicate your databases. It is like having two separate data centers. We cannot tackle the network architecture to support database replication here, because that would also overcomplicate the architecture. We just need to point out another way operating in multiple regions adds complexity.
  • Static Files: This website of course includes a variety of static files, so you need to figure out how to keep the file stores in sync between regions as well. Using Network Security Groups, you can lock access to the storage buckets down to specific groups or instances. That’s a good way to make sure you don’t get malware files loaded up onto your website. Cross-region replication is a service your cloud provider may offer so you don’t need to build it yourself.

Advantages of This Design

This architecture is primarily intended to show how the cloud enables you to easily establish an application in multiple regions, similar to multiple data centers. So what’s unique about the cloud? You can take the entire stack in Region A and copy it to Region B with a few clicks in your cloud console. Of course you’d have some networks to reconfigure, but in most ways the environment is identical. Auto-scale groups work off the same images, so the operational overhead of supporting multiple cloud regions is drastically lower than operating across multiple data centers.

There are also significant availability advantages. In case a region goes down, the DNS service will detect that automatically and route all new incoming sessions to the available region. Existing sessions in the region will be lost, so there is some collateral damage, but that happens any time you lose a data center. Those sessions can be re-established to the available region. When the region recovers DNS will automatically start sending new sessions to it. This is all transparent to the application and users.
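The failover behavior described above can be sketched as a small routing function. This is an assumption-laden illustration, not how any particular DNS service is implemented: `pick_region`, the region names, and the plain round-robin policy are all hypothetical, and a real service would also weigh utilization, geography, and other criteria. The sketch only shows new sessions avoiding a region whose health check fails and returning to it once it recovers.

```python
def pick_region(regions: dict, session_number: int) -> str:
    """Round-robin new sessions across regions whose health check currently passes."""
    healthy = sorted(name for name, ok in regions.items() if ok)
    if not healthy:
        raise RuntimeError("no healthy region available")
    return healthy[session_number % len(healthy)]


health = {"region-a": True, "region-b": True}
before = [pick_region(health, n) for n in range(4)]

health["region-a"] = False  # the health check for region A starts failing
during = [pick_region(health, n) for n in range(4)]

health["region-a"] = True   # region A recovers
after = [pick_region(health, n) for n in range(4)]

print(before)
print(during)
print(after)
```

Sessions already established in the failed region are not modeled here; as noted above, those are lost and must be re-established against the surviving region.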

You can also provide a central spot for both static files and logs by using cross-region replication. This capability is currently specific to Amazon Web Services, but we expect it to be a critical feature on all cloud infrastructure platforms at some point. As is increasingly the case in the cloud, services which previously required extreme planning and/or additional products to manage, are now built-in platform features.

That’s a good note on which to wrap up this quick series. The cloud provides many capabilities that enable you to deploy applications significantly more securely and reliably than rolling them out in your own datacenter. Of course there will be resistance to the new way of thinking – there always is. But you can combat this resistance by using information like the suggestions in this series (and in our other blog posts and reference architectures) to highlight the obvious advantages of these new technologies.

—Mike Rothman

Securing Hadoop: Security Recommendations for Hadoop [New Paper]

By Adrian Lane

We are pleased to release our updated white paper on big data security: Securing Hadoop: Security Recommendations for Hadoop Environments. Just about everything has changed in the four years since we published the original. Hadoop has solidified its position as the dominant big data platform, by constantly advancing in function and scale. While the ability to customize a Hadoop cluster to suit diverse needs has been its main driver, the security advances make Hadoop viable for enterprises. Whether embedded directly into Hadoop or deployed as add-on modules, services like identity, encryption, log analysis, key management, cluster validation, and fine-grained authorization are all available. Our goal for this research paper is first to introduce these technologies to IT and security teams, and also to help them assemble these technologies into a coherent security strategy.


This research project provides a high-level overview of security challenges for big data environments. From there we discuss security technologies available for the Hadoop ecosystem, and then sketch out a set of recommendations to secure big data clusters. Our recommendations map threats and compliance requirements directly to supporting technologies to facilitate your selection process. We outline how these tactical responses work within the security architectures which firms employ, tailoring their approaches to the tools and technical talent on hand.

Finally, we would like to thank Hortonworks and Vormetric for licensing this research. Without firms who appreciate our work enough to license our content, we could not bring you quality research free! We hope you find this research helpful in understanding big data and its associated security challenges.

You can download a free copy of the white paper from our research library, or grab a copy directly: Securing Hadoop: Security Recommendations for Hadoop Environments (PDF).

—Adrian Lane

Friday, March 25, 2016

Resilient Cloud Network Architectures: Fundamentals

By Mike Rothman

As much as we like to believe we have evolved as a species, people continue to be scared of things they don’t understand. Yes, many organizations have embraced the cloud whole hog and are rushing headlong into the cloud age. But it’s a big world, and millions of others remain paralyzed – not really understanding cloud computing, and taking the general approach that it can’t be secure because, well, it just can’t. Or it’s too new. Or for some other unfounded and incorrect reason. Kind of like when folks insisted that the Earth was the center of the universe.

This blog series builds on our recent Pragmatic Security for Cloud and Hybrid Networks paper, focusing on cloud-native network architectures that provide security and availability in ways you cannot accomplish in a traditional data center. This evolution will take place over the next decade, and organizations will need to support hybrid networks for some time.

But for those ready, willing, and able to step forward into the future today, the cloud is waiting to break the traditional rules of how technology has been developed, deployed, scaled, and managed. We have been aggressive in proselytizing our belief that the move towards the cloud is the single biggest disruption in technology for the next few decades. Yes, even bigger than the move from mainframes to client/server (we’re old – we know). So our Resilient Cloud Network Architectures series will provide the basics of cloud network security, with a few design patterns to illustrate.

We would like to thank Resilient Systems for provisionally agreeing to license the content in this paper. As always, we’ll build the content using our Totally Transparent Research methodology, meaning we will post everything to the blog first, and allow you (our readers) to poke holes in it. Once it has been sufficiently prodded, we will publish a paper for your reference.

Defining Resilient

If we bust out the old dictionary to define resilient, we get:

  • able to become strong, healthy, or successful again after something bad happens
  • able to return to an original shape after being pulled, stretched, pressed, bent, etc.

In the context of computing, you want to deploy technology that can not just become strong again, but resist attack in the first place. Recoverability is also key: if something bad happens you want to return service quickly, if it causes an outage at all. For network architecture we always fall back on the cloud computing credo: Design for failure. A resilient network architecture both makes it harder to compromise an application and minimizes downtime in case of an issue.

Key aspects of cloud computing which provide security and availability include:

  • Network Isolation: Using the inherent ability of the cloud to restrict connections (via software firewalls, which are called security groups and described below), you can build a network architecture that fully isolates the different tiers of an application stack. That prevents a compromise in one application (or database) from leaking or attacking information stored in another.
  • Account Isolation: Another important feature of the cloud is the ability to use multiple accounts per application. Each of your different environments (Dev, Test, Production, Logging, etc.) can use different accounts, which provides valuable isolation because you cannot access cloud infrastructure across accounts without explicit authorization.
  • Immutability: An immutable server is one that is never logged into or changed in production. In cloud-native DevOps environments servers are deployed in auto-scale groups based on standard images. This prevents human error and configuration drift from creating exploitation paths. You take a new known-good state, and completely replace older images in production. No more patching and no more logging into servers.
  • Regions: You could build multiple data centers around the world to provide redundancy. But that’s not a cheap option, and rarely feasible. To do the same thing in the cloud, you basically just replicate an entire environment in a different region via an API call or a couple clicks in a cloud console. Regions are available all over the world, with multiple availability zones within each, to further minimize single points of failure. You can load balance between zones and regions, leveraging auto-scaling to keep your infrastructure running the same images in real time. We will explain this design pattern in our next post.

The key takeaway is that cloud computing provides architectural options which are either impossible or economically infeasible in a traditional data center, to provide greater protection and availability. In this series we will describe the fundamentals of cloud networking for context, and then dig into design patterns which provide both security and availability – which we define as resilience.

Understanding Cloud Networks

The key difference between a network in your data center and one in the cloud is that cloud customers never access the ‘real’ network or hardware. Cloud computing uses virtual networks to abstract the networks you see and manage from the (invisible) underlying physical resources. When your server gets an IP address, that IP address does not exist on routing hardware – it’s a virtual address on a virtual network. Everything is handled in software.

Cloud networking varies across cloud providers, but differs from traditional networks in visibility, management, and velocity of change. You cannot tap into a cloud provider’s virtual network, so you’ll need to think differently to monitor your networks. Additionally, cloud networks are typically managed via scripts or programs making Application Programming Interface (API) calls, rather than a graphical console or command line. That enables developers to do pretty much anything, including standing up networks and reconfiguring them – instantly via code.

Finally, cloud networks change much faster than physical networks because cloud environments change faster, including spinning up and shutting down servers via automation. So traditional workflows to govern network change don’t really map to your cloud network. It can be confusing because cloud networks look like traditional networks, with their own routing tables and firewalls. But looks are deceiving – although familiar constructs have been carried over, there are fundamental differences.

Cloud Network Architectures

In order to choose the right solution to address your requirements, you need to understand the types of cloud network architectures and the different technologies that enable them. There are two basic types of cloud network architectures:

  • Public Cloud Networks: These are entirely Internet-facing. You connect to your instances (servers) via the public Internet with no special routing; every instance has a public IP address.
  • Private Cloud Networks: Also called “virtual private clouds” or VPCs, these look like internal LANs using private IP addresses. You access these networks via some kind of non-public connection – typically a VPN.

Cloud networks are enabled and supported by the following technologies:

  • Internet Gateways: A gateway connects your cloud network to the Internet. You don’t normally manage it directly – your cloud provider does it for you because their tools move packets from ‘your’ internal network to the Internet.
  • Internal Gateways: These devices connect existing datacenters to your private cloud network. You access networks via a VPN provided by the cloud provider or a direct connection, which looks a lot like a traditional point-to-point connection from days gone by.
  • Virtual Private Networks: You can also set up your own overlay network to bridge your private and public cloud networks within your cloud provider. This provides a private segment with access for users, developers, and administrators.

These terms will come into play when we present design patterns in our next post.

Network Security Controls

Your network is different in the cloud, so your network security controls will be different as well. But you can take some comfort from the familiar categories. Cloud network controls fall generally into five buckets:

  1. Perimeter Security: These controls generally provide coarse protection from very common network-based attacks, including Denial of Service. Your cloud provider provides and manages these controls; you have no visibility or control.
  2. Software Firewalls: These firewalls are built into the cloud platform (they are called security groups in AWS) and protect cloud assets such as instances. They offer basic access control via ports/protocols and sources/destinations, and are designed to handle auto-scaling and cloud environments. They combine the best of network and host firewalls, allowing you to deploy policies on individual servers (or even network interfaces) like a host firewall, but manage them like network firewalls. They will be your main tool to provide virtual network isolation, described above.
  3. Access Control Lists: While a software firewall works at a per-instance (or per-object) level, ACLs restrict communications between subnets of your virtual network. Old-school networking folks will be familiar with using ACLs to control access into and out of the subnets in a (virtual) cloud network.
  4. Virtual Appliances: A number of traditional network security tools, including IDS/IPS, WAF, and NGFW, are available as virtual appliances to improve network security, but they require you to route cloud traffic through these devices.
  5. Host Security Agents: These agents are built into immutable server images, and provide visibility and protection to each server/instance in a cloud environment.
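The difference between software firewalls (item 2) and ACLs (item 3) is easier to see side by side. The sketch below is a simplified model under stated assumptions, not a provider API: the subnets, addresses, ports, and the names `SUBNET_ACL`, `INSTANCE_RULES`, and `permitted` are all invented for illustration. The ACL decides which subnets may talk at all, and the per-instance firewall then decides which specific source and port may reach a specific server.

```python
import ipaddress

# Hypothetical subnet-level ACL: (source subnet, destination subnet) pairs that may talk.
SUBNET_ACL = {
    (ipaddress.ip_network("10.0.1.0/24"), ipaddress.ip_network("10.0.2.0/24")),
}

# Hypothetical per-instance firewall: destination address -> allowed (source address, port).
INSTANCE_RULES = {
    ipaddress.ip_address("10.0.2.10"): {(ipaddress.ip_address("10.0.1.5"), 443)},
}


def acl_permits(src, dst) -> bool:
    """Coarse check: is any subnet-to-subnet pair in the ACL a match?"""
    return any(src in s and dst in d for s, d in SUBNET_ACL)


def firewall_permits(src, dst, port) -> bool:
    """Fine check: does the destination instance allow this exact source and port?"""
    return (src, port) in INSTANCE_RULES.get(dst, set())


def permitted(src: str, dst: str, port: int) -> bool:
    """Traffic must pass both layers; either one can block it."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    return acl_permits(s, d) and firewall_permits(s, d, port)


print(permitted("10.0.1.5", "10.0.2.10", 443))  # passes the ACL and the instance rule
print(permitted("10.0.1.6", "10.0.2.10", 443))  # passes the ACL, no instance rule
print(permitted("10.0.3.5", "10.0.2.10", 443))  # source subnet not in the ACL
```

The second call is the interesting one: a neighbor in an allowed subnet still cannot reach the server, because the per-instance rule never mentions it.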

The thing about cloud networking is that you don’t need to apply the same controls, or even configurations, to an entire network. You can make architectural and security control decisions per project. You might decide an entirely cloud-based VPC is best for one application, while for another you choose to build an overlay VPN to connect a totally different VPC to your datacenter to support a hybrid environment. You might need to route one application’s traffic through an inspection point to prevent data leakage, while for another you rely exclusively on security groups to provide full isolation between different layers of your cloud stack. The permutations are infinite, and provide flexibility you cannot have in your data center.

These fundamentals should provide the context you’ll need to understand the design patterns we will present in our next post.

—Mike Rothman

Thursday, March 24, 2016

Incite 3/23/2016: The Madness

By Mike Rothman

I’m not sure why I do it, but every year I fill out brackets for the annual NCAA Men’s College basketball tournament. Over all the years I have been doing brackets, I won once. And it wasn’t a huge pool. It was a small pool in my office, when I used to work in an office, so the winnings probably didn’t even amount to a decent dinner at Fuddrucker’s. I won’t add up all my spending or compare against my winning, because I don’t need a PhD in Math to determine that I am way below the waterline.

Like anyone who always questions everything, I should be asking myself why I continue to play. I’m not going to win – I don’t even follow NCAA basketball. I’d have better luck throwing darts at the wall. So clearly it’s not a money-making endeavor.

extra large bracket

I guess I could ask the same question about why I sit in front of a Wheel of Fortune slot machine in a casino. Or why I buy PowerBall tickets when the pot goes above $200MM. I understand statistics – I know I’m not going to win slots (over time) or the lottery (ever).

They call the NCAA tournament March Madness – perhaps because most people get mad when their brackets blow up on the second day of the tournament when the team they picked to win it all loses to a 15 seed. Or does that just happen to me? But I wasn’t mad. I laughed because 25% of all brackets had Michigan State winning the tournament. And they were all as busted as mine.

These are rhetorical questions. I play a few NCAA tournament brackets every year because it’s fun. I get to talk smack to college buddies about their idiotic picks. I play the slots because my heart races when I spin the wheel and see if I got 35 points or 1,000. I play the lottery because it gives me a chance to dream. What would I do with $200MM?

I’d do the same thing I’m doing now. I’d write. I’d sit in Starbucks, drink coffee, and people-watch, while pretending to write. I’d speak in front of crowds. I’d explore and travel with my loved ones. I’d still play the brackets, because any excuse to talk smack to my buddies is worth the minimal donation. And I’d still play the lottery. And no, I’m not certifiable. I just know from statistics that I wouldn’t have any less chance to win again just because I won before. Score 1 for Math.


Photo credit: “Now, that is a bracket!” from frankieleon

We’ve published this year’s Securosis Guide to the RSA Conference. It’s our take on the key themes of this year’s conference (which is really a proxy for the industry), as well as deep dives on cloud security, threat protection, and data security. And there is a ton of meme goodness… Check out the blog post or download the guide directly (PDF).

The fine folks at the RSA Conference posted the talk Jennifer Minella and I did on mindfulness at the 2014 conference. You can check it out on YouTube. Take an hour. Your emails, alerts, and Twitter timeline will be there when you get back.

Securosis Firestarter

Have you checked out our video podcast? Rich, Adrian, and Mike get into a Google Hangout and… hang out. We talk a bit about security as well. We try to keep these to 15 minutes or less, and usually fail.

Heavy Research

We are back at work on a variety of blog series, so here is a list of the research currently underway. Remember you can get our Heavy Feed via RSS, with our content in all its unabridged glory. And you can get all our research papers too.

Shadow Devices

Building a Vendor IT Risk Management Program

Securing Hadoop

SIEM Kung Fu

Building a Threat Intelligence Program

Recently Published Papers

Incite 4 U

  1. Enough already: Encryption is a safeguard for data. It helps ensure data is used the way its owner intends. We work with a lot of firms – helping them protect data from rogue employees, hackers, malicious government entities, and whoever else may want to misuse their data. We try to avoid touching political topics on this blog, but the current attempt by US Government agencies to paint encryption as a terrorist tool is beyond absurd. They are effectively saying security is a danger, and that has really struck a nerve in the security community. Forget for a minute that the NSA already has all the data that moves on and off your cellphone, and that law enforcement already has the means to access the contents of iPhones without Apple’s assistance. And avoid wallowing in counter-examples where encryption aided freedom, or illustrations of misuse of power to inspire fear in the opposite direction. These arguments devolve into pig-wrestling – only the pig enjoys that sort of thing. As Rich explained in Do We Have a Right To Security?, this is a simple question of whether anyone (companies or individuals) can have security. Currently the US government (at least the executive branch) says ‘No!’ – as does the UK government. – AL

  2. The US blinks… Following up Adrian’s rant above, the US government decided that they may not need Apple to open the San Bernardino iPhone after all. Evidently a third party would be happy to sell the US government either an exploit or another means to get access to the locked phone. Duh. Like we didn’t already know that was possible. As many of us argued, this case was much more about establishing a precedent for the FBI than about accessing that specific phone. Now that it looks like an uphill climb to win that motion, it’s time to save face and do what they should have done in the first place. Pay someone to break the phone, if they think it’s that important. We have huge respect for law enforcement and what they do, but we could do with less grandstanding and backdoors. Backdoors are stupid. – MR

  3. Hindsight is 20/20: In the Beretta files goes the case of a Ryan Collins, who was behind the attacks on celebrity iPhones. This is the attacker who stole the pictures. It’s not clear how much he made by selling them, but it was probably not worth the felony violation he will plead to or the associated jail time. They are still looking for the person who actually posted the pictures. But that guy is even dumber – he didn’t make any money, apparently because content wants to be free. All I have to say is: idiots. – MR

  4. Getting chippy: Better than 75% of stores I go into still have tape over their EMV chipped card slots on payment terminals. While it seems merchants are tardy in getting their work done, it’s not always that they are dragging their feet – it may also be the card networks. It appears some merchants who are actively processing EMV cards are getting charged for fraud and chargeback fees because they have yet to complete a certification audit by the card networks. To reverse these charges a supermarket chain filed suit, and is pushing for quick certification. The suit may halt the “liability shift” entirely, which has gotten the card brands’ attention. This entire game of “Pass The Liability” will continue to entertain us until we stop passing credit card numbers around. – AL

  5. Security faith healers: Adam Shostack posted an interesting piece at Dark Reading about how the concepts in The Gluten Lie apply to security. In a nutshell, the health industry has vilified gluten, and besides the people who have legitimate celiac disease, the data doesn’t seem to support the general position that gluten is bad. Adam makes the analogy that telling people to be secure isn’t going to help. Nor is telling them not to do things (like surf pr0n). And folks should drop the fear-based marketing. Yeah, right. A lot of technology marketing is selling snake oil, and it’s as bad in security as anywhere else. But as long as a tactic works (including vilifying gluten to sell more gluten-free stuff) free market economics say that that tactic will continue to be used. Go figure. – MR

—Mike Rothman