Thursday, June 30, 2016

Incite 6/29/16: Gone Fishin’ (Proverbially)

By Mike Rothman

It was a great Incite. I wrote it on the flight to Europe for the second leg of my summer vacation. I said magical stuff. Such depth and perspective, I even amazed myself. When I got to the hotel in Florence and went to post the Incite on the blog, it was gone. That’s right: G. O. N. E.

And it’s not going to return. I was sore for a second. But I looked at Mira (she’s the new love I mentioned in a recent Incite) and smiled. I walked outside our hotel and saw the masses gathered to check out the awe-inspiring Duomo. It was hard to be upset, surrounded by such beauty.

It took 3 days to get our luggage after Delta screwed up a rebooking because our flight across the pond was delayed, which made us upset. But losing an Incite? Meh. I was on vacation, so worrying about work just wasn’t on the itinerary.

Over the years, I usually took some time off during the summer when the kids were at camp. A couple days here and there. But I would work a little each day. Convincing myself I needed to stay current, or I didn’t want things to pile up and be buried upon my return. It was nonsense. I was scared to miss something. Maybe I’d miss out on a project or a speaking gig.

Gone fishin'

It turns out I can unplug, and no one dies. I know that because I’m on my way back after an incredible week in Florence and Tuscany, and then a short stopover in Amsterdam to check out the city before re-entering life. I didn’t really miss anything. Though I didn’t really totally unplug either. I checked email. I even responded to a few. But only things that were very critical and took less than 5 minutes.

Even better, my summer vacation isn’t over. It started with a trip to the Jersey shore with the kids. We visited Dad and celebrated Father’s Day with him. That was a great trip, especially since Mira was able to join us for the weekend. Then it was off to Europe. And the final leg will be another family trip for the July 4th holiday. All told, I will be away from the day-to-day grind close to 3 weeks.

I highly recommend a longer break to regain sanity. I understand that’s not really feasible for a lot of people. Fortunately getting space to recharge doesn’t require you to check out for 3 weeks. It could be a long weekend without your device. It could just be a few extra date nights with a significant other. It could be getting to a house project that just never seems to get done. It’s about breaking out of routine, using the change to spur growth and excitement when you return.

So gone fishin’ is really a metaphor, about breaking out of your daily routine to do something different. Though I will take that literally over the July 4 holiday. There will be fishing. There will be beer. And it will be awesome.

For those of you in the US, have a safe and fun July 4. For those of you elsewhere, watch the news – there are always a few Darwin Awards given out when you mix a lot of beer with fireworks.


Photo credit: “Gone Fishing” from Jocelyn Kinghorn

Security is changing. So is Securosis. Check out Rich’s post on how we are evolving our business.

We’ve published this year’s Securosis Guide to the RSA Conference. It’s our take on the key themes of this year’s conference (which is really a proxy for the industry), as well as deep dives on cloud security, threat protection, and data security. And there is a ton of meme goodness… Check out the blog post or download the guide directly (PDF).

The fine folks at the RSA Conference posted the talk Jennifer Minella and I did on mindfulness at the 2014 conference. You can check it out on YouTube. Take an hour. Your emails, alerts, and Twitter timeline will be there when you get back.

Securosis Firestarter

Have you checked out our video podcast? Rich, Adrian, and Mike get into a Google Hangout and… hang out. We talk a bit about security as well. We try to keep these to 15 minutes or less, and usually fail.

Heavy Research

We are back at work on a variety of blog series, so here is a list of the research currently underway. Remember you can get our Heavy Feed via RSS, with our content in all its unabridged glory. And you can get all our research papers too.

Managed Security Monitoring

Evolving Encryption Key Management Best Practices

Incident Response in the Cloud Age

Understanding and Selecting RASP

Maximizing WAF Value

Recently Published Papers

Incite 4 U

  1. More equals less? Huh? Security folks are trained that ‘more’ is rarely a good thing. More transactions means more potential fraud. More products means more integration and maintenance cost. Even more people can challenge efficiency. But do more code deployments mean fewer security headaches? Of course the folks from Puppet want you to believe that’s the case, because they are highlighted in this article referring to some customer successes. It turns out our research (including building pipelines and prototyping our own applications) shows that automation and orchestration do result in fewer security issues. It’s about reducing human error. To be clear, if you set up the deployment pipeline badly and screw up the configurations, automation will kill your app. But if you do it right there are huge gains to be had in efficiency, and in reduced attack surface. – MR

  2. A bird in the hand: Jim Bird has a new O’Reilly book called DevOpsSec: Securing Software through Continuous Delivery (PDF). It’s a good primer on the impact of continuous deployment on secure code development. Jim discusses several success stories of early DevOps security initiatives; outlining the challenges of integrating security into the process, the culture, and the code. Jim has contributed a ton of research back to the community over the years, and he is asking for feedback and corrections on the book. So download a free copy, and please help him out. – AL

  3. Stimulating the next security innovations: I mentioned in the last Incite that DARPA was funding some research into the next iteration of DDoS technologies. Not to be outdone, the Intelligence Advanced Research Projects Activity (IARPA) office is looking for some ideas on the evolution of intruder deception. Rich has been interested in these technologies for years, and this is one of the disruptions he laid out in the Future of Security research. It’s clear we won’t be able to totally stop attackers, but we can and should be able to set more traps for them. At least make their job a little harder, and then you aren’t the path of least resistance. And kudos to a number of government agencies putting money up to stimulate innovation needed to keep up with bad folks. – MR

  4. Relevance falling: The PCI Guru asks Is the PCI-DSS even relevant any more?, a question motivated by the better-late-than-never FTC investigation of breaches at major retailers. He argues that with ubiquitous point-to-point and end-to-end encryption (P2PE and E2EE respectively) and tokenization removing credit cards from transactions, the value of a CC# goes down dramatically. Especially because the CC# is no longer used for other business processes. We think this assertion is accurate – replace credit card numbers with systemic tokens from issuing banks, and PCI-DSS’s driver goes out the window. By the way, this is what happens with mobile payments and chipped credit cards: real CC#s no longer pass between merchants and processors. To be clear, without the credit card data – and so long as the Primary Account Reference (PAR) token is not present in the transaction stream – encryption no longer solves a security problem. Neither will PCI-DSS, so we expect it to die gradually, as the value of credit card numbers becomes nil. – AL

  5. Mailing it in: Everyone has too much to do, and not enough skilled resources to do it all. But when you contract with a professional services firm to perform something like an incident response, it would be nice if they actually did the work and professionally documented what they found. It’s not like you aren’t trying to figure out what happened during an attack – both to communicate to folks who lost data, and to make important funding and resource allocation decisions so you can move forward. But what happens when the consultant just does a crappy job? You sue them, of course. Ars Technica offers a good article on how a firm sued TrustWave after their allegedly crappy job was rescued by another services firm. I wasn’t there and didn’t see the report, so I can’t really judge the validity of the claims. But I think a customer standing up to a response firm and calling them out is positive. Consultants beware: you just can’t mail it in – it’s not fair to the customer, and likely to backfire. – MR

—Mike Rothman

Monday, June 27, 2016

Managed Security Monitoring: Use Cases

By Mike Rothman

Many security professionals feel the deck is stacked against them. Adversaries continue to improve their techniques, aided by plentiful malware kits and botnet infrastructures. Continued digitization at pretty much every enterprise means everything of interest is on some system somewhere. Don’t forget the double whammy of mobile and cloud, which democratizes access without geographic boundaries, and takes the one bastion of control, the traditional data center, out of your direct control. Are we having fun yet?

Of course the news isn’t all bad – security has become very high profile. Getting attention and resources can sometimes be a little too easy – life was simpler when we toiled away in obscurity bemoaning that senior management didn’t understand or care about security. That’s clearly not the case today, as you get ready to present the security strategy to the board of directors. Again. And after that’s done you get to meet with the HR team trying to fill your open positions. Again.

In terms of fundamentals of a strong security program, we have always believed in the importance of security monitoring to shorten the window between compromise and detection of compromise. As we posted in our recent SIEM Kung Fu paper:

Security monitoring needs to be a core, fundamental, aspect of every security program.

There are a lot of different concepts of what security monitoring actually is. It certainly starts with log aggregation and SIEM, although many organizations are looking to leverage advanced security analytics (either built into their SIEM or using third-party technology) to provide better and faster detection. But that’s not what we want to tackle in this new series, titled Managed Security Monitoring. It’s not about whether to do security monitoring, it’s a question of the most effective way to monitor resources.

Given the challenges of finding and retaining staff, the increasingly distributed nature of data and systems that need to be monitored, and the rapid march of technology, it’s worth considering whether a managed security monitoring service makes sense for your organization. The fact is that, under the right circumstances, a managed service presents an interesting alternative to racking and stacking another set of SIEM appliances. We will go through drivers, use cases, and deployment architectures for those considering managed services. And we will provide cautions for areas where a service offering might not meet expectations.

As always, our business model depends on forward-looking companies who understand the value of objective research. We’d like to thank IBM Security Systems for agreeing to potentially license this paper once completed. We’ll publish the research using our Totally Transparent Research methodology, which ensures our work is done in an open and accessible manner.

Drivers for Managed Security Monitoring

We have no illusions about the amount of effort required to get a security monitoring platform up and running, or what it takes to keep one current and useful, given the rapid adaptation of attackers and automated attack tools in use today. Many organizations feel stuck in a purgatory of sorts, reacting without sufficient visibility, yet not having time to invest to gain that much-needed visibility into threats. A suboptimal situation, often the initial trigger for discussion of managed services. Let’s be a bit more specific about situations where it’s worth a look at managed security monitoring.

  • Lack of internal expertise: Even having people to throw at security monitoring may not be enough. They need to be the right people – with expertise in triaging alerts, validating exploits, closing simple issues, and knowing when to pull the alarm and escalate to the incident response team. Reviewing events, setting up policies, and managing the system, all take skills that come with training and time with the security monitoring product. Clearly this is not a skill set you can just pick up anywhere – finding and keeping talented people is hard – so if you don’t have sufficient expertise internally, that’s a good reason to check out a service-based alternative.
  • Scalability of existing technology platform: You might have a decent platform, but perhaps it can’t scale to what you need for real-time analysis, or has limitations in capturing network traffic or other voluminous telemetry. And for organizations still using a first generation SIEM with a relational database backend (yes, they are still out there), you face a significant and costly upgrade to scale the system. With a managed service offering, scale is not an issue – any sizable provider is handling billions of events per day, and scalability of the technology isn’t your problem – so long as the provider hits your SLAs.
  • Predictable Costs: To be the master of the obvious, the more data you put into a monitoring system, the more storage you’ll need. The more sites you want to monitor and the deeper you want visibility into your network, the more sensors you need. Scaling up a security monitoring environment can become costly. One advantage of managed offerings is predictable costs. You know what you’re monitoring and what it costs. You don’t have variable staff costs, nor do you have out-of-cycle capital expenses to deal with new applications that need monitoring.
  • Technology Risk Transference: You have been burned before by vendors promising the world without delivering much of anything. That’s why you are considering alternatives. A managed monitoring service enables you to focus on the functionality you need, instead of trying to determine which product can meet your needs. Ultimately you only need to be concerned with the application and the user experience – all that other stuff is the provider’s problem. Selecting a provider becomes effectively an insurance policy to minimize your technology investment risk. Similarly, if you are worried about your ops team’s ability to keep a broad security monitoring platform up and running, you can transfer operational risk to the provider, who assumes responsibility for uptime and performance – so long as your SLAs are structured properly.
  • Geographically dispersed small sites: Managed services also interest organizations needing to support many small locations without a lot of technical expertise. Think retail and other distribution-centric organizations. This presents a good opportunity for a service provider who can monitor remote sites.
  • Round the clock monitoring: As security programs scale and mature, some organizations decide to move from an 8-hour/5-day monitoring schedule to a round-the-clock approach. Soon after making that decision, the difficulty of staffing a security operations center (SOC) 24/7 sets in. A service provider can leverage a 24/7 staffing investment to deliver round-the-clock services to many customers.

Of course you can’t outsource thinking or accountability, so ultimately the buck stops with the internal team, but under the right circumstances managed security monitoring services can address skills and capabilities gaps.

Favorable Use Cases

The technology platform used by the provider may be the equal of an in-house solution, as many providers use commercial monitoring platforms as the basis for their managed services. This is a place for significant diligence during procurement, as we will discuss in our next post. As mentioned above, there are a few use cases where managed security monitoring makes a lot of sense, including:

  • Device Monitoring/Alerting: This is the scaling and skills issue. If you have a ton of network and security devices, but you don’t have the technology or people to properly monitor them, managed security monitoring can help. These services are generally architected to aggregate data on your site and ship it to the service provider for analysis and alerting, though a variety of different options are emerging for where the platform runs and who owns it. Central to this use case is a correlation system to identify issues, a means to find new attacks (typically via a threat intelligence capability) and a bunch of analysts who can triage and validate issues quickly, and then provide an actionable alert.
  • Advanced Detection: With the increasing sophistication of attackers, it can be hard for an organization’s security team to keep pace. A service provider has access to threat intelligence, presumably multiple clients across which to watch for emerging attacks, and the ability to amortize advanced security analytics across customers. Additionally specialized (and expensive) malware researchers can be shared among many customers, making it more feasible for a service provider to employ those resources than many organizations.
  • Compliance Reporting: Another no-brainer for a managed security monitoring alternative is basic log aggregation and reporting – typically driven by a compliance requirement. This isn’t a very complicated use case, and it fits service offerings well. It also gets you out of the business of managing storage and updating reports when a requirement/mandate changes. The provider should take care of all that for you.
  • CapEx vs. OpEx: As much as it may hurt a security purist, buying decisions come down to economics. Depending on your funding model and your organization’s attitude toward capital expenses, leasing a service may be a better option than buying outright. Of course there are other ways to turn a capital purchase into an operational expense, and we’re sure your CFO will have plenty of ideas on that front, but buying a service can be a simple option for avoiding capital expenditure. Obviously, given the long and involved process to select a new security monitoring platform, you must make sure the managed service meets your needs before economic considerations come into play – especially if there’s a risk of Accounting’s preferences driving you to spend big on an unsuitable product. No OpEx vs. CapEx tradeoff can make a poorly matched service offering meet your requirements.

There are other offerings and situations where managed security monitoring makes sense, which have nothing to do with the nice clean buckets above. We have seen implementations of all shapes and sizes, and we need to avoid overgeneralizing. But the majority of service implementations fit these general use cases.

Unfavorable Use Cases

Of course there are also situations where a monitoring service may not be a good fit. That doesn’t mean you can’t use a service because of extenuating circumstances, typically having to do with a staffing and skills gap. But generally these situations don’t make for the best fit for a service:

  • Dark Networks: Due to security requirements, some networks are dark, meaning no external access is available. These are typically highly sensitive military and/or regulated environments. Clearly this is problematic for a security monitoring service because the provider cannot access the customer network. To address skills gaps you’d instead consider a dedicated onsite resource and either buying a security monitoring platform yourself or leasing it from the provider.
  • Highly Sensitive IP: On networks where the intellectual property is particularly valuable, the idea of providing access to external parties is usually a non-starter. Again, this situation would call for dedicated on-site resources helping to run your on-premise security monitoring platform.
  • Large Volumes of Data: If your organization is very large and has a ton of logs and other telemetry for security monitoring, this can challenge a service offering that requires data to be moved to a cloud-based service, including network forensics and packet analytics. In this case an on-premise monitoring service will likely be the best solution. Note the new hybrid offerings which capture data and perform security analytics on-premise using resources in a shared SOC. We’ll discuss these hybrid offerings in our next post.

As with the favorable use cases, the unfavorable use cases are strong indicators but not absolute. It really depends on the specific requirements of your situation, your ability to invest in technology, and the availability of skilled resources.

These generalizations should give you a starting point to consider a managed security monitoring service. Our next post will get into specifics of selection criteria, service levels, and deployment models.

—Mike Rothman

Friday, June 24, 2016

Summary: Modifying rsyslog to Add Cloud Instance Metadata

By Rich

Rich here.

Quick note: I basically wrote an entire technical post for Tool of the Week, so feel free to skip down if that’s why you’re reading.

Ah, summer. As someone who works at home and has children, I’m learning the pains of summer break. Sure, it’s a wonderful time without homework fights and after-school activities, but it also means all 5 of us are in the house nearly every day. It’s a bit distracting. I mean, do you have any idea how to tell a 3-year-old that you cannot ditch work to play Disney Infinity on the Xbox?

Me neither, which explains my productivity slowdown.

I’ve actually been pretty busy at ‘real work’, mostly building content for our new Advanced Cloud Security course (it’s sold out, but we still have room in our Hands-On class). Plus a bunch of recent cloud security assessments for various clients. I have been seeing some interesting consistencies, and will try to write those up after I get these other projects knocked off. People are definitely getting a better handle on the cloud, but they still tend to make similar mistakes.

With that, let’s jump right in…

Top Posts for the Week

Tool of the Week

I’m going to detour a bit and focus on something all you admin types are very familiar with: rsyslog. Yes, this is the default system logger for a big chunk of the Linux world, something most of us don’t think that much about. But as I build out a cloud logging infrastructure I found I needed to dig into it to make some adjustments, so here is a trick to insert critical Amazon metadata into your logs (usable on other platforms, but I can only show so many examples).

Various syslog-compatible tools generate standard log files and allow you to ship them off to a remote collector. That’s the core of a lot of performance and security monitoring. By default log lines look something like this:

 Jun 24 00:21:27 ip-172-31-40-72 sudo: ec2-user : TTY=pts/0 ; PWD=/var/log ; USER=root ; COMMAND=/bin/cat secure

That’s the line outputting the security log from a Linux instance. See a problem?

This log entry includes the host name (internal IP address) of the instance, but in the cloud a host name or IP address isn’t nearly as canonical as in traditional infrastructure. Both can be quite ephemeral, especially if you use auto scale groups and the like. Ideally you capture the instance ID or equivalent on other platforms, and perhaps also some other metadata such as the internal or external IP address currently associated with the instance. Fortunately it isn’t hard to fix this up.
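To make the problem concrete, here is a quick Python sketch splitting the sample line above into its fields (assuming the standard BSD syslog layout of timestamp, host, then message):

```python
# The sample sudo entry from the security log above.
line = ("Jun 24 00:21:27 ip-172-31-40-72 sudo: ec2-user : TTY=pts/0 ; "
        "PWD=/var/log ; USER=root ; COMMAND=/bin/cat secure")

# Split into at most 5 chunks: month, day, time, host, and the rest.
parts = line.split(None, 4)
timestamp = " ".join(parts[:3])
host = parts[3]      # "ip-172-31-40-72" -- an ephemeral internal IP, not canonical
message = parts[4]   # the actual event, starting with the program tag
```

The `host` field is the one that becomes meaningless once instances churn through an auto scale group, which is why we want the instance ID in there instead.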

The first step is to capture the metadata you want. In AWS just visit:

 http://169.254.169.254/latest/meta-data/

to get it all. Or use something like:

 curl -s http://169.254.169.254/latest/meta-data/instance-id

to get the instance ID. Then you have a couple options. One is to change the host name to be the instance ID. Another is to append it to entries by changing the rsyslog configuration (/etc/rsyslog.conf on CentOS systems), as shown below to add a %INSTANCEID% environment variable to the hostname (yes, this means you need to set INSTANCEID as an environment variable, and I haven’t tested this because I need to post the Summary before I finish, so you might need a little more text manipulation to make it work… but this should be close):

 template(name="forwardFormat" type="string"
          string="<%PRI%>%TIMESTAMP:::date-rfc3339% %INSTANCEID%-%HOSTNAME% %syslogtag:1:32%%msg:::sp-if-no-1st-sp%%msg%")

There are obviously a ton of ways you could slice this, and you need to add it to your server build configurations to make it work (using Ansible/Chef/Puppet/packer/whatever). But the key is to capture and embed the instance ID and whatever other metadata you need. If you don’t care about strict syslog compatibility, you have more options. The nice thing about this approach is that it will capture all messages from all the system sources you normally log, and you don’t need to modify individual message formats.
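If you prefer to fetch the metadata from a build or bootstrap script rather than raw curl, a small hedged Python sketch might look like this. The endpoint is AWS's standard instance metadata service (only reachable from inside EC2); the "i-unknown" fallback is my own placeholder, not an AWS convention:

```python
import urllib.request

# Standard AWS instance metadata endpoint for the instance ID.
METADATA_URL = "http://169.254.169.254/latest/meta-data/instance-id"

def get_instance_id(timeout=2):
    """Return this instance's ID, or a placeholder when run off-cloud."""
    try:
        with urllib.request.urlopen(METADATA_URL, timeout=timeout) as resp:
            return resp.read().decode().strip()
    except OSError:
        # Connection refused or timed out: we are not on EC2 (or metadata
        # access is blocked), so degrade gracefully instead of crashing.
        return "i-unknown"
```

Your bootstrap script could export the result as INSTANCEID so the rsyslog template above can pick it up.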

If you use something like the native Amazon/Azure/Google instance logging tools… you don’t need to bother with any of this. Those tools tend to capture the relevant metadata for you (e.g., using Amazon’s CloudWatch logs agent, Azure’s Log Analyzer, or Google’s StackDriver). Check the documentation to make sure you get them correct. But many clients want to leverage existing log management, so this is one way to get the essential data.

Securosis Blog Posts this Week

Other Securosis News and Quotes

Another quiet week…

Training and Events


Wednesday, June 15, 2016

Shining a Light on Shadow Devices [New Paper]

By Mike Rothman

Visible devices are only some of the network-connected devices in your environment. There are hundreds, quite possibly thousands, of other devices you don’t know about on your network. You don’t scan them periodically, and you have no idea of their security posture. Each one can be attacked, and might provide an adversary with an opportunity to gain a presence in your environment. Your attack surface is much larger than you thought. In our Shining a Light on Shadow Devices paper, we discuss how attacks on these devices can become an issue on your network, along with some tactics to provide visibility and then control over all these network-connected devices.

SHD Cover

We would like to thank ForeScout Technologies for licensing the content in this paper. Our unique Totally Transparent Research model enables us to think objectively about future attack vectors and speculate a bit on the impact to your organization, without paywalls or other such gates restricting access to research you may need.

You can get the paper from the landing page in our research library.

—Mike Rothman

Monday, June 13, 2016

Understanding and Selecting RASP: Buyers Guide

By Adrian Lane

Before we jump into today’s post, we want to thank Immunio for expressing interest in licensing this content. This type of support enables us to bring quality research to you, free of charge. If you are interested in licensing this Securosis research as well, please let us know. And we want to thank all of you who have been commenting throughout this series – we have received many good comments and questions. We have in fact edited most of the posts to integrate your feedback, and added new sections to address your questions. This research is certainly better for it! And it’s genuinely helpful that the community at large can engage in an open discussion, so thanks again to all of you who have participated.

We will close out this series by directing your attention to several key areas for buyers to evaluate, in order to assess suitability for your needs. With new technologies it is not always clear where the ‘gotchas’ are. We find many security technologies meet basic security goals, but after they have been on-premise for some time, you discover management or scalability nightmares. To help you avoid some of these pitfalls, we offer the following outline of evaluation criteria. The product you choose should provide application protection, but it should also be flexible enough to work in your environment. And not just during Proof of Concept (PoC) – every day.

  • Language Coverage: Your evaluation should ensure that the RASP platforms you are considering all cover the programming languages and platforms you use. Most enterprises we speak with develop applications on multiple platforms, so ensure that there is appropriate coverage for all your applications – not just the ones you focus on during the evaluation process.
  • Blocking: Blocking is a key feature. Sure, some of you will use RASP for monitoring and instrumentation – at least in the short term – but blocking is a huge part of RASP’s value. Without blocking there is no protection – even more to the point, get blocking wrong and you break applications. Evaluating how well a RASP product blocks is essential. The goal here is twofold: make sure the RASP platform is detecting the attacks, and then determine if its blocking action negatively affects them. We recommend penetration testing during the PoC, both to verify that common attack vectors are handled, and to gauge RASP behavior when attacks are discovered. Some RASPs simply block the request and return an error message to the user. In some cases RASP can alter a request to make it benign, then proceed as normal. Some products alter user sessions and redirect users to login again, or jump through additional hoops before proceeding. Most RASP products provide customers a set of options for how they should respond to different types of attacks. Most vendors consider attack detection techniques part of their “secret sauce”, so we are unable to offer insight into the differences. But just as important is how well application continuity is preserved when responding to threats, which you can monitor directly during evaluation.
  • Policy Coverage: It’s not uncommon for one or more members of a development team to be proficient with application security. That said, it’s unreasonable to expect developers to understand the nuances of new attacks and the details behind every CVE. Vulnerability research, methods of detection, and appropriate methods to block attacks are large parts of the value each RASP vendor provides. Your vendor spends days – if not weeks – developing each policy embedded into their tool. During evaluation, it’s important to ensure that critical vulnerabilities are addressed. But it is arguably more important to determine how – and how often – vendors update policies, and verify they include ongoing coverage. A RASP product cannot be better than its policies, so ongoing support is critical as new threats are discovered.
  • Policy Management: Two facets of policy management come up most often during our discussions. The first is identification of which protections map to specific threats. Security, risk, and compliance teams all ask, “Are we protected against XYZ threat?” You will need to show that you are. Evaluate policy lookup and reporting. The other is tuning how to respond to threats. As we mentioned above under ‘Blocking’, most vendors allow you to tune responses either by groups of issues, or on a threat-by-threat basis. Evaluate how easy this is to use, and whether you have sufficient options to tailor responses.
  • Performance: Being embedded into applications enables RASP to detect threats at different locations within your app, with context around the operation being performed. This context is passed, along with the user request, to a central enforcement point for analysis. The details behind detection vary widely between vendors, so performance varies as well. Each user request may generate dozens of checks, possibly including multiple external references. This latency can easily impact user experience, so sample how long analysis takes. Each code path will apply a different set of rules, so you will need to test several different paths, measuring both with and without RASP. You should do this under load to ensure that detection facilities do not bottleneck application performance. And you’ll want to understand what happens when some portion of RASP fails, and how it responds – does it “fail open”?
  • Scalability: Most web applications scale by leveraging multiple application instances, with user requests distributed via a load balancer. As RASP is typically built into the application, it scales right along with it, without need for additional changes. But if RASP leverages external threat intelligence, you will want to verify this does not hamper scalability. For RASP platforms where the point of analysis – as opposed to the point of interception – is outside your application, you need to verify how the analysis component scales. For RASP products that work as a cloud service using non-deterministic code inspection, evaluate how their services scale.
  • API Compatibility: Most interest in RASP is prompted by a desire to integrate into application development processes, automating security deployment alongside application code, so APIs are a central feature. Ensure the RASP products you consider are compatible with Jenkins, Ansible, Chef, Puppet, or whatever automated build tools you employ. On the back end, make sure RASP feeds information back into your systems for defect tracking, logging, and Security Information and Event Management (SIEM). This data is typically available in JSON, syslog, and other formats, but ensure each product provides what you need.
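The latency sampling described under ‘Performance’ above can be scripted with a small harness. This is only a sketch under stated assumptions: the handlers below are stand-ins, and the simulated RASP check is invented; in practice you would point the harness at real code paths, measured with and without the agent loaded, under load.

```python
import statistics
import time

def measure_latency_ms(handler, samples=500):
    """Time repeated calls to a request handler; report p50/p95/p99 in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        handler()
        timings.append((time.perf_counter() - start) * 1000)
    cuts = statistics.quantiles(timings, n=100)  # 99 percentile cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}

def baseline():
    sum(range(1000))  # stand-in for real request processing

def with_rasp():
    # Simulated in-line inspection; a real agent does far more work per request.
    _ = any(t in "user supplied input" for t in ("<script", "' OR 1=1", "../"))
    baseline()

base = measure_latency_ms(baseline)
rasp = measure_latency_ms(with_rasp)
overhead_p95 = rasp["p95"] - base["p95"]  # added latency at the 95th percentile
```

Comparing tail percentiles rather than averages matters here, because occasional external policy lookups show up in the tail, which is exactly what users feel.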

That concludes our series on RASP. As always, we encourage comments, questions and critique, so please let us know what’s on your mind.

—Adrian Lane

Getting the SWIFT Boot

By Mike Rothman

As long as I have been in security and following the markets, I have observed that no one says security is unimportant. Not out loud, anyway. But their actions usually show a different view. Maybe there is a little more funding. Maybe somewhat better visibility at the board level. But mostly security gets a lot of lip service.

In other words, security doesn’t matter. Until it does.


The international interbank payment system called SWIFT has been successfully hit multiple times by hackers, and a few other attempts have been foiled. Now they are going to start turning the screws on member banks, because SWIFT has finally realized they can be very secure but still get pwned. It doesn’t help when the New York Federal Reserve gets caught up in a ruse due to lax security at a bank in Bangladesh.

So now the lip service is becoming threats. That member banks will have their access to SWIFT revoked if they don’t maintain a sufficient security posture. Ah, more words. Will this be like the words uttered every time someone asks if security is important? Or will there be actual action behind them?

That action needs to include specific guidance on what security actually looks like. This is especially important for banks in emerging countries, which may not have a good idea of where to start. And yes, those organizations are out there. The action also needs to involve some level of third-party assessment. Self-assessment doesn’t cut it.

I think SWIFT can take a page from the Payment Card Industry. The initial PCI-DSS, and the resulting work to get laggards over a (low) security bar, did help. But it’s not a sustainable long-term answer: at some point the assessments became a joke, and the controls required by the standard have predictably failed to keep pace with attacks.

But security at a lot of these emerging banks is a dumpster fire. And the folks who work with them realize where the weakest links are. But actions speak much louder than words, so watch for actions.

Photo credit: “Boots” originally uploaded by Rob Pongsajapan

—Mike Rothman

Friday, June 10, 2016

Summary: June 10, 2016

By Adrian Lane

Adrian here.

A phone call about Activity Monitoring administrative actions on mainframes, followed by a call on security architectures for new applications in AWS. A call on SAP vulnerability scans, followed by a call on Runtime Application Self-Protection. A call on protecting relational databases against SQL injection, followed by a discussion of relevant values to key security event data for a big data analytics project. Consulting with a firm which releases code every 12 months, and discussing release management with a firm that is moving to two releases a day in a continuous deployment model. This is what my call logs look like.

If you want to see how disruptive technology is changing security, you can just look at my calendar. On any given day I am working at both extremes in security. On one hand we have the old and well-worn security problems; familiar, comfortable and boring. On the other hand we have new security problems, largely driven by cloud and mobile technologies, and the corresponding side-effects – such as hybrid architectures, distributed identity management, mobile device management, data security for uncontrolled environments, and DevOps. Answers are not rote, problems do not always have well-formed solutions, and crafting responses takes a lot of work. Worse, the answer I gave yesterday may be wrong tomorrow, if the pace of innovation invalidates my answer. This is our new reality.

Some days it makes me dizzy, but I’ve embraced the new, if for no other reason than to avoid being run over by it. It’s challenging as hell, but it’s not boring.

On to this week’s summary:

If you want to subscribe directly to the Friday Summary only list, just click here.

Top Posts for the Week

Tool of the Week

I decided to take some time to learn about tools more common to clouds other than AWS. I was told Kubernetes was the GCP open source version of Docker, so I thought that would be a good place to start. After I spent some time playing with it, I realized what I was initially told was totally wrong! Kubernetes is called a “container manager”, but it’s really focused on setting up services. Docker focuses on addressing app dependencies and packaging; Kubernetes on app orchestration. And it runs anywhere you want – not just GCP and GCE, but in other clouds or on-premise. If you want to compare Kubernetes to something in the Docker universe, it’s closest to Docker Swarm, which tackles some of the management and scalability issues.

Kubernetes has three basic parts: controllers that handle things like replication and pod behaviors; a simple naming system – essentially using key-value pairs – to identify pods; and a services directory for discovery, routing, and load balancing. A pod can be one or more Docker containers, or a standalone application. These three primitives make it pretty easy to stand up code, direct application requests, manage clusters of services, and provide basic load balancing. It’s open source and works across different clouds, so your application should work the same on GCP, Azure, or AWS. It’s not super easy to set up, but it’s not a nightmare either. And it’s incredibly flexible – once set up, you can easily create pods for different services, with entirely different characteristics.
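The three primitives above are easiest to see in a manifest. A minimal, hypothetical example (names and image are placeholders): a single-container pod carrying an `app: web` label, and a service that discovers and load balances it via that label.

```yaml
# Minimal sketch: a single-container pod plus a service that selects it.
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web            # key-value label; the service selector below matches it
spec:
  containers:
    - name: web
      image: nginx      # any container image works here
      ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web            # routes requests to any pod carrying this label
  ports:
    - port: 80
```

In a replicated deployment you would wrap the pod spec in a controller that maintains multiple copies, but the label-to-selector relationship that drives discovery and load balancing stays the same.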

A word of caution: if you’re heavily invested in Docker, you might instead prefer Swarm. Early versions of Kubernetes seemed to have Docker containers in mind, but the current version does not integrate with native Docker tools and APIs, so you have to duct tape some stuff together to get Docker compliant containers. Swarm is compliant with Docker’s APIs and works seamlessly. But don’t be swayed by studies that compare container startup times as a main measure of performance; that is one of the least interesting metrics for comparing container management and orchestration tools. Operating performance, ease of use, and flexibility are all far more important. If you’re not already a Docker shop, check out Kubernetes – its design is well-thought-out and purpose-built to tackle micro-service deployment. And I have not yet had a chance to use Google’s Container Engine, but it is supposed to make setup easier, with a number of supporting services.

Securosis Blog Posts this Week

Other Securosis News and Quotes

Training and Events

—Adrian Lane

Thursday, June 09, 2016

Building Resilient Cloud Network Architectures [New Paper]

By Mike Rothman

Building Resilient Cloud Network Architectures builds on our Pragmatic Security Cloud and Hybrid Networks research, focusing on cloud-native network architectures. The key is that cloud computing provides architectural options which are either impossible or economically infeasible in traditional data centers, enabling greater protection and better availability.

RCNA Cover

We would like to thank Resilient Systems, an IBM Company, for licensing the content in this paper. We built the paper using our Totally Transparent Research model, leveraging what we’ve learned building cloud applications over the past 4 years.

You can get the paper from the landing page in our research library.

—Mike Rothman

Wednesday, June 08, 2016

Evolving Encryption Key Management Best Practices: Use Cases

By Rich

This is the third in a three-part series on evolving encryption key management best practices. The first post is available here. This research is also posted at GitHub for public review and feedback. My thanks to Hewlett Packard Enterprise for licensing this research, in accordance with our strict Totally Transparent Research policy, which enables us to release our independent and objective research for free.

Use Cases

Now that we’ve discussed best practices, it’s time to cover common use cases. Well, mostly common – one of our goals for this research is to highlight emerging practices, so a couple of our use cases cover newer data-at-rest key management scenarios, while the rest are more traditional options.

Traditional Data Center Storage

It feels a bit weird to use the word ‘traditional’ to describe a data center, but people give us strange looks when we call the most widely deployed storage technologies ‘legacy’. We’d say “old school”, but that sounds a bit too retro. Perhaps we should just say “big storage stuff that doesn’t involve the cloud or other weirdness”.

We typically see three major types of data storage encrypted at rest in traditional data centers: SAN/NAS, backup tapes, and databases. We also occasionally see file servers encrypted, but they are in the minority. Each of these is handled slightly differently, but normally one of three ‘meta-architectures’ is used:

  • Silos: Some storage tools include their own encryption capabilities, managed within the silo of the application/storage stack. For example a backup tape system with built-in encryption. The keys are managed by the tool within its own stack. In this case an external key manager isn’t used, which can lead to a risk of application dependency and key loss, unless it’s a very well-designed product.
  • Centralized key management: Rather than managing keys locally, a dedicated central key management tool is used. Many organizations start with silos, and later integrate them with central key management for advantages such as improved separation of duties, security, auditability, and portability. Increasing support for KMIP and the PKCS 11 standards enables major products to leverage remote key management capabilities, and exchange keys.
  • Distributed key management: This is very common when multiple data centers are either actively sharing information or available for disaster recovery (hot standby). You could route everything through a single key manager, but this single point of failure would be a recipe for disaster. Enterprise-class key management products can synchronize keys between multiple key managers. Remote storage tools should connect to the local key manager to avoid WAN dependency and latency. The biggest issue with this design is typically ensuring the different locations synchronize quickly enough, which tends to be more of an issue for distributed applications balanced across locations than for hot standby sites, where data changes don’t occur on both sides simultaneously. Another major concern is ensuring you can centrally manage the entire distributed deployment, rather than needing to log into each site separately.
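The synchronization behavior described under ‘Distributed key management’ can be sketched in a few lines. Everything here is illustrative: the class and method names are invented, and real products also handle conflict resolution, transport security, and audit logging. Note how the `needed` parameter limits which keys a standby site receives, the practice of distributing only required keys.

```python
import secrets

class KeyManager:
    """Toy key manager illustrating multi-site key synchronization (sketch only)."""
    def __init__(self, site):
        self.site = site
        self.keys = {}  # key_id -> (version, key bytes)

    def create_key(self, key_id):
        self.keys[key_id] = (1, secrets.token_bytes(32))

    def rotate_key(self, key_id):
        version, _ = self.keys[key_id]
        self.keys[key_id] = (version + 1, secrets.token_bytes(32))

    def sync_to(self, other, needed=None):
        """Push keys to another site; 'needed' limits distribution to required keys."""
        for key_id, record in self.keys.items():
            if needed is not None and key_id not in needed:
                continue
            current = other.keys.get(key_id, (0, b""))
            if record[0] > current[0]:  # newer key version wins
                other.keys[key_id] = record

primary = KeyManager("us-east")
standby = KeyManager("eu-west")
primary.create_key("db-backups")
primary.create_key("app-secrets")
primary.sync_to(standby, needed={"db-backups"})  # standby only gets what it serves
```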

Each of those meta-architectures can manage keys for all of the storage options we see in use, assuming the tools are compatible, even using different products. The encryption engine need not come from the same source as the key manager, so long as they are able to communicate.

That’s the essential requirement: the key manager and encryption engines need to speak the same language, over a network connection with acceptable performance. This often dictates the physical and logical location of the key manager, and may even require additional key manager deployments within a single data center. But there is never a single key manager. You need more than one for availability, whether in a cluster or using a hot standby.

As we mentioned under best practices, some tools support distributing only needed keys to each ‘local’ key manager, which can strike a good balance between performance and security.


Application Encryption

There are as many different ways to encrypt an application as there are developers in the world (just ask them). But again we see most organizations coalescing around a few popular options:

  • Custom: Developers program their own encryption (often using common encryption libraries), and design and implement their own key management. These are rarely standards-based, and can become problematic if you later need to add key rotation, auditing, or other security or compliance features.
  • Custom with external key management: The encryption itself is, again, programmed in-house, but instead of handling key management itself, the application communicates with a central key manager, usually using an API. Architecturally the key manager needs to be relatively close to the application server to reduce latency, depending on the particulars of how the application is programmed. In this scenario, security depends strongly on how well the application is programmed.
  • Key manager software agent or SDK: This is the same architecture, but the application uses a software agent or pre-configured SDK provided with the key manager. This is a great option because it generally avoids common errors in building encryption systems, and should speed up integration, with more features and easier management. Assuming everything works as advertised.
  • Key manager based encryption: That’s an awkward way of saying that instead of providing encryption keys to applications, each application provides unencrypted data to the key manager and gets encrypted data in return, and vice-versa.
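The last option is easier to see in code than in prose. Here is a toy sketch: the service class and its methods are hypothetical, and the XOR step is only a stand-in for a real cipher such as AES-GCM (never use XOR in production). The point is the architecture: the application hands data to the key manager and gets ciphertext back, and keys never leave the service.

```python
import secrets

class KeyManagerEncryptionService:
    """Sketch of 'key manager based encryption': the application submits data and
    receives ciphertext; the keys themselves are never released to the caller."""
    def __init__(self):
        self._keys = {}  # private key store, one key per application name

    def _key_for(self, name):
        if name not in self._keys:
            self._keys[name] = secrets.token_bytes(32)
        return self._keys[name]

    def _xor(self, key, data):
        # Placeholder transform; a real service would use an authenticated cipher.
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    def encrypt(self, name, plaintext):
        return self._xor(self._key_for(name), plaintext)

    def decrypt(self, name, ciphertext):
        return self._xor(self._key_for(name), ciphertext)

svc = KeyManagerEncryptionService()
ciphertext = svc.encrypt("orders-app", b"cardholder data")
recovered = svc.decrypt("orders-app", ciphertext)
```

The trade-off with this pattern is that every encrypt and decrypt becomes a round trip to the service, so latency and service availability matter even more than in the other architectures.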

We deliberately skipped file and database encryption, because they are variants of our “traditional data center storage” category, but we do see both integrated into different application architectures.

Based on our client work (in other words, a lot of anecdotes), application encryption seems to be the fastest growing option. It’s also agnostic to your data center architecture, assuming the application has adequate access to the key manager. It doesn’t really care whether the key manager is in the cloud, on-premise, or a hybrid.

Hybrid Cloud

Speaking of hybrid cloud, after application encryption (usually in cloud deployments) this is where we see the most questions. There are two main use cases:

  • Extending existing key management to the cloud: Many organizations already have a key manager they are happy with. As they move into the cloud they may either want to maintain consistency by using the same product, or need to support a migrating application without having to gut their key management to build something new. One approach is to always call back over the network to the on-premise key manager. This reduces architectural changes (and perhaps additional licensing), but often runs into latency and performance issues, even with a direct network connection. Alternatively you can deploy a virtual appliance version of your key manager as a ‘bastion’ host, and synchronize keys so assets in the cloud connect to the distributed virtual server for better performance.
  • Building a root of trust for cloud deployments: Even if you are fully comfortable deploying your key manager in the cloud, you may still want an on-premise key manager to retain backups of keys or support interoperability across cloud providers.

Generally you will want to run a virtual version of your key manager within the cloud to satisfy performance requirements, even though you could route all requests back to your data center. It’s still essential to synchronize keys, backups, and even logs back on-premise or to multiple, distributed cloud-based key managers, because no single instance or virtual machine can provide sufficient reliability.


Bring Your Own Key

This is a very new option with some cloud providers who allow you to use an encryption service or product within their cloud, while you retain ownership of your keys. For example you might provide your own file encryption key to your cloud provider, who then uses it to encrypt your data, instead of using a key they manage.

The name of the game here is ‘proprietary’. Each cloud provider offers different ways of supporting customer-managed keys. You nearly always need to meet stringent network and location requirements to host your key manager yourself, or you need to use your cloud provider’s key management service, configured so you can manage your keys yourself.


Incite 6/7/2016: Nature

By Mike Rothman

Like many of you, I spend a lot of time sitting on my butt banging away at my keyboard. I’m lucky that the nature of my work allows me to switch locations frequently, and I can choose to have a decent view of the world at any given time. Whether it’s looking at a wide assortment of people in the various Starbucks I frequent, my home office overlooking the courtyard, or pretty much any place I can open my computer on my frequent business travels. Others get to spend all day in their comfy (or not so comfy) cubicles, and maybe stroll to the cafeteria once a day.

I have long thought that spending the day behind a desk isn’t the most effective way to do things. Especially for security folks, who need to be building relationships with other groups in the organization and proselytizing the security mindset. But if you are reading this, your job likely involves a large dose of office work. Even if you are running from meeting to meeting, experiencing the best conference rooms, we spend our days inside breathing recycled air under the glare of fluorescent lights.

Panther Falls, GA

Every time I have the opportunity to explore nature a bit, I remember how cool it is. Over the long Memorial Day weekend, we took a short trip up to North Georgia for some short hikes, and checked out some cool waterfalls. The rustic hotel where we stayed didn’t have cell service (thanks AT&T), but that turned out to be great. Except when Mom got concerned because she got a message that my number was out of service. But through the magic of messaging over WiFi, I was able to assure her everything was OK. I had to exercise my rusty map skills, because evidently the navigation app doesn’t work when you have no cell service. Who knew?

It was really cool to feel the stress of my day-to-day activities and responsibilities just fade away once we got into the mountains. We wondered where the water comes from to make the streams and waterfalls. We took some time to speculate about how long it took the water to cut through the rocks, and we were astounded by the beauty of it all. We explored cute towns where things just run at a different pace. It really put a lot of stuff into context for me. I (like most of you) want it done yesterday, whatever we are talking about.

Being back in nature for a while reminded me there is no rush. The waterfalls and rivers were there long before I got here. And they’ll be there long after I’m gone. In the meantime I can certainly make a much greater effort to take some time during the day and get outside. Even though I live in a suburban area, I can find some green space. I can consciously remember that I’m just a small cog in a very large ecosystem. And I need to remember that the waterfall doesn’t care whether I get through everything on my To Do list. It just flows, as should I.


Photo credit: “Panther Falls - Chattahoochee National Forest” - Mike Rothman May 28, 2016

Security is changing. So is Securosis. Check out Rich’s post on how we are evolving our business.

We’ve published this year’s Securosis Guide to the RSA Conference. It’s our take on the key themes of this year’s conference (which is really a proxy for the industry), as well as deep dives on cloud security, threat protection, and data security. And there is a ton of meme goodness… Check out the blog post or download the guide directly (PDF).

The fine folks at the RSA Conference posted the talk Jennifer Minella and I did on mindfulness at the 2014 conference. You can check it out on YouTube. Take an hour. Your emails, alerts, and Twitter timeline will be there when you get back.

Securosis Firestarter

Have you checked out our video podcast? Rich, Adrian, and Mike get into a Google Hangout and… hang out. We talk a bit about security as well. We try to keep these to 15 minutes or less, and usually fail.

Heavy Research

We are back at work on a variety of blog series, so here is a list of the research currently underway. Remember you can get our Heavy Feed via RSS, with our content in all its unabridged glory. And you can get all our research papers too.

Evolving Encryption Key Management Best Practices

Incident Response in the Cloud Age

Understanding and Selecting RASP

Maximizing WAF Value

Shadow Devices

Recently Published Papers

Incite 4 U

  1. Healthcare endpoints are sick: Not that we didn’t already know, given all the recent breach notifications from healthcare organizations, but they are having a tough time securing their endpoints. The folks at Duo provide some perspective on why. It seems those endpoints log into twice as many apps, and a large proportion are based on leaky technology like Flash and Java. Even better, over 20% use unsupported (meaning unpatched) versions of Internet Explorer. LOL. What could possibly go wrong? I know it’s hard, and I don’t mean to beat up on our fine healthcare readers. We know there are funding issues, the endpoints are used by multiple people, and they are in open environments where almost anyone can go up and mess around with them. And don’t get me started on the lack of product security in too many medical systems and products. But all the same, it’s not like they have access to important information or anything. Wait… Oh, they do. Sigh. – MR

  2. Insecure by default: Scott Schober does a nice job outlining Google’s current thinking on data encryption and the security of users’ personal data. Essentially for the new class of Google’s products, the default is to disable end-to-end encryption. You do have the option of turning it on, but Google still manages the encryption keys (unlike Apple). But their current advertising business model, and the application of machine learning to aid users beyond what’s provided today, pretty much dictate Google’s need to collect and track personally identifiable information. Whether that is good or bad is in the eye of the beholder, but realize that when you plunk a Google Home device into your home, it’s always listening and will capture and analyze everything. We now understand that at the very least the NSA siphons off all content sent to the Google cloud, so we recommend enabling end-to-end encryption, which forces intelligence and law enforcement to crack the encryption or get a warrant to view personal information. Even though this removes useful capabilities. – AL

  3. Moby CEO: It looks like attackers are far better at catching whales than old Ahab. In what could be this year’s CEO cautionary tale (after the Target incident a few years back), an Austrian CEO got the ax because he got whaled to the tune of $56MM. Yes, million (US dollars, apparently). Of course if a finance staffer is requested to transfer millions in [$CURRENCY], there should be some means of verifying the request. It is not clear where the internal controls failed in this case. All the same, you have to figure that CEO will have “confirm internal financial controls” at the top of his list at his next gig. If there is one. – MR

  4. Tagged and tracked: It’s fascinating to watch the number of ways users’ online activity can be tracked, with just about every conceivable browser plug-in and feature minable for user identity and activity. A recent study from Princeton University called The Long Tail of Online Tracking outlines the who, what, and how of tracking software. It’s no surprise that Google, Facebook, and Twitter are tracking users on most sites. What is surprising is that many sites won’t load under the HTTPS protocol, and degenerate to HTTP to ensure content sharing with third parties. As is the extent to which tracking firms go to identify your devices – using AudioContext, browser configuration, browser extensions, and just about everything else they can access to build a number of digital fingerprints to identify people. If you’re interested in the science behind this, that post links to a variety of research, as well as the Technical Analysis of client identification mechanisms from the Google Chromium Security team. And they should know how to identify users (doh!). – AL

  5. Why build it once when you can build it 6 times? I still love that quote from the movie Contact. “Why build it once, when you can build it twice for twice the price?” Good thing they did when the first machine was bombed. It seems DARPA takes the same approach – they are evidently underwriting 6 different research shops to design a next generation DDoS defense. It’s not clear (from that article, anyway) whether the groups were tasked with different aspects of a larger solution. DDoS is a problem. But given the other serious problems facing IT organizations, is it the most serious? It doesn’t seem like it to me. But all the same, if these research shops make some progress, that’s a good thing and it’s your tax dollars at work (if you pay taxes in the US, anyway). – MR

—Mike Rothman

Tuesday, June 07, 2016

Mr. Market Loves Ransomware

By Mike Rothman

The old business rule is: when something works, do more of it. By that measure ransomware is clearly working. One indication is the number of new domains popping up which are associated with ransomware attacks. According to an Infoblox research report (and they provide DNS services, so they should know), there was a 35x increase in ransomware domains in Q1.

You have also seen the reports of businesses getting popped when an unsuspecting employee falls prey to a ransomware attack; the ransomware is smart enough to find a file share and encrypt all those files too. And even when an organization pays, the fraudster is unlikely to just give them the key and go away.

This is resulting in real losses to organizations – the FBI says organizations lost over $200 million in Q1 2016. Even if that number is inflated, it’s a real business, so you will see a lot more of it. The attackers follow Mr. Market’s lead, and clearly the ‘market’ loves ransomware right now.

So what can you do? Besides continue to train employees not to click stuff? An article at NetworkWorld claims to have the answer for how to deal with ransomware. They mention strategies for trying to recover faster via “regular and consistent backups along with tested and verified restores.” This is pretty important – just be aware that you may be backing up encrypted files, so make sure you have backups from far enough back that you can recover the files before the attack. This is obvious in retrospect, but backup/recovery is a good practice regardless of whether you are trying to deal with malware, ransomware, or hardware failure that puts data at risk.
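The “backups from far enough back” point can be made concrete with a little scripting. A sketch, with toy in-memory data: a real job would hash files on disk at backup time, and a sudden spike in the changed-file ratio is a strong hint that a backup set contains already-encrypted files.

```python
import hashlib

def snapshot_hashes(files):
    """Record a content hash per file at backup time (files: name -> bytes)."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}

def changed_fraction(baseline, current_files):
    """Fraction of files whose content changed since the baseline snapshot.
    A sudden spike suggests mass encryption, so keep older backup generations."""
    changed = sum(
        1 for name, data in current_files.items()
        if baseline.get(name) != hashlib.sha256(data).hexdigest()
    )
    return changed / max(len(current_files), 1)

day1 = {"a.doc": b"quarterly report", "b.xls": b"numbers"}
baseline = snapshot_hashes(day1)
day2 = {"a.doc": b"ENCRYPTED!!", "b.xls": b"ENCRYPTED!!"}  # simulated ransomware
ratio = changed_fraction(baseline, day2)
```

Alerting well below a ratio of 1.0, and retaining multiple backup generations, gives you a restore point that predates the attack.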

Their other suggested defense is to prevent the infection. The article’s prescribed approach is application whitelisting (AWL). We are fans of AWL in specific use cases – here the ransomware wouldn’t be allowed to run on devices, because it’s not authorized. Of course the deployment issues with AWL, given how it can impact user experience, are well known. Though we do find whitelisting appropriate for devices that don’t change frequently or which hold particularly valuable information, so long as you can deal with the user resistance.

They don’t mention other endpoint protection solutions, such as isolation on endpoint devices. We have discussed the various advanced endpoint defense strategies, and will be updating that research over the next couple of months. Adding to the confusion, every endpoint defense vendor seems to be shipping a ‘ransomware’ solution… which is really just their old stuff, rebranded.

So what’s the bottom line? If you have an employee who falls prey to ransomware, you are going to lose data. The question is: How much? With advanced prevention technologies deployed, you may stop some of the attacks. With a solid backup strategy, you may minimize the amount of data you lose. But you won’t escape unscathed.

—Mike Rothman

Monday, June 06, 2016

Building a Vendor (IT) Risk Management Program [New Paper]

By Mike Rothman

In Building a Vendor (IT) Risk Management Program, we explain why you can no longer ignore the risk presented by third-party vendors and other business partners, including managing an expanded attack surface and new regulations demanding effective management of vendor risk. We then offer ideas for how to build a structured and systematic program to assess vendor (IT) risk, and take action when necessary.

VRM Cover

We would like to thank BitSight Technologies for licensing the content in this paper. Our unique Totally Transparent Research model allows us to perform objective and useful research without requiring paywalls or other such nonsense, which make it hard for the people who need our research to get it. A day doesn’t go by where we aren’t thankful to all the companies who license our research.

You can get the paper from the landing page in our research library.

—Mike Rothman

Friday, June 03, 2016

Evolving Encryption Key Management Best Practices: Part 2

By Rich

This is the second in a four-part series on evolving encryption key management best practices. The first post is available here. This research is also posted at GitHub for public review and feedback. My thanks to Hewlett Packard Enterprise for licensing this research, in accordance with our strict Totally Transparent Research policy, which enables us to release our independent and objective research for free.

Best Practices

If there is one thread tying together all the current trends influencing data centers and how we build applications, it’s distribution. We have greater demand for encryption in more locations in our application stacks – which now span physical environments, virtual environments, and increasing barriers even within our traditional environments.

Some of the best practices we will highlight have long been familiar to anyone responsible for enterprise encryption. Separation of duties, key rotation, and meeting compliance requirements have been on the checklist for a long time. Others are familiar, but have new importance thanks to changes occurring in data centers. Providing key management as a service, and dispersing and integrating into required architectures aren’t technically new, but they are in much greater demand than before. Then there are the practices which might not make the list, such as supporting APIs and distributed architectures (potentially spanning physical and virtual appliances).

As you will see, the name of the game is consolidation for consistency and control; simultaneous with distribution to support diverse encryption needs, architectures, and project requirements.

But before we jump into recommendations, keep our focus in mind. This research is for enterprise data centers, including virtualization and cloud computing. There are plenty of other encryption use cases out there which don’t necessarily require everything we discuss, although you can likely still pick up a few good ideas.

Build a key management service

Supporting multiple projects with different needs can easily result in a bunch of key management silos using different tools and technologies, which become difficult to support. One for application data, another for databases, another for backup tapes, another for SANs, and possibly even multiple deployments for the same functions, as individual teams pick and choose their own preferred technologies. This is especially true in the project-based agile world of the cloud, microservices, and containers. There’s nothing inherently wrong with these silos, assuming they are all properly managed, but that is unfortunately rare. And overlapping technologies often increase costs.

Overall we tend to recommend building centralized security services to support the organization, and this definitely applies to encryption. Let a smaller team of security and product pros manage what they are best at and support everyone else, rather than merely issuing policy requirements that slow down projects or drive them underground.

For this to work the central service needs to be agile and responsive, ideally with internal Service Level Agreements to keep everyone accountable. Projects request encryption support; the team managing the central service determines the best way to integrate, and to meet security and compliance requirements; then they provide access and technical support to make it happen.

This enables you to consolidate and better manage key management tools, while maintaining security and compliance requirements such as audit and separation of duties. Whatever tool(s) you select clearly need to support your various distributed requirements. The last thing you want to do is centralize but establish processes, tools, and requirements that interfere with projects meeting their own goals.

And don’t focus so exclusively on new projects and technologies that you forget about what’s already in place. Our advice isn’t merely for projects based on microservices, containers, and the cloud – it applies equally to backup tapes and SAN encryption.

Centralize but disperse, and support distributed needs

Once you establish a centralized service you need to support distributed access. There are two primary approaches, but we only recommend one for most organizations:

  • Allow access from anywhere. In this model you position the key manager in a location accessible from wherever it might be needed. Typically organizations select this option when they want to only maintain a single key manager (or cluster). It was common in traditional data centers, but isn’t well-suited for the kinds of situations we increasingly see today.
  • Distributed architecture. In this model you maintain a core “root of trust” key manager (which can, again, be a cluster), but then you position distributed key managers which tie back to the central service. These can be a mix of physical and virtual appliances or servers. Typically they only hold the keys for the local application, device, etc. that needs them (especially when using virtual appliances or software on a shared service). Rather than connecting back to complete every key operation, the local key manager handles those while synchronizing keys and configuration back to the central root of trust.

Why distribute key managers which still need a connection back home? Because they enable you to support greater local administrative control and meet local performance requirements. This architecture also keeps applications and services up and running in case of a network outage or other problem accessing the central service. This model provides an excellent balance between security and performance.
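As a rough illustration of this model, here is a minimal sketch of a local key manager that synchronizes keys from a central root of trust and keeps serving them if the core becomes unreachable. The class and method names are hypothetical – there is no real KMIP or REST plumbing here, just the caching structure:

```python
import secrets

class CentralKeyManager:
    """Stands in for the central 'root of trust' service (hypothetical)."""
    def __init__(self):
        self._keys = {}

    def create_key(self, key_id):
        self._keys[key_id] = secrets.token_bytes(32)  # 256-bit key
        return self._keys[key_id]

    def get_key(self, key_id):
        return self._keys[key_id]

class LocalKeyManager:
    """Distributed key manager: serves local requests from its own cache,
    synchronizing with the central service only when needed. It holds
    only the keys its local applications actually use."""
    def __init__(self, central):
        self._central = central
        self._cache = {}

    def get_key(self, key_id):
        if key_id not in self._cache:
            # sync from the root of trust on first use
            self._cache[key_id] = self._central.get_key(key_id)
        # subsequent lookups are served locally, so applications keep
        # running even during an outage of the link back to the core
        return self._cache[key_id]
```

The point of the sketch is the last line: once a key has been synchronized, local operations no longer depend on the central service being reachable.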

For example you could support a virtual appliance in a cloud project, physical appliances in backup data centers, and backup of the keys used with your cloud provider’s built-in encryption service.

This way you can also support different technologies for distributed projects. The local key manager doesn’t necessarily need to be the exact same product as the central one, so long as they can communicate and both meet your security and compliance requirements. We have seen architectures where the central service is a cluster of Hardware Security Modules (appliances with key management features) supporting a distributed set of HSMs, virtual appliances, and even custom software.

The biggest potential obstacle is providing safe, secure access back to the core. Architecturally you can usually manage this with some bastion systems to support key exchange, without opening the core to the Internet. There may still be use cases where you cannot tie everything together, but that should be your last option.

Be flexible: use the right tool for the right job

Building on our previous recommendation, you don’t need to force every project to use a single tool. One of the great things about key management is that modern systems support a number of standards for intercommunication. And when you get down to it, an encryption key is merely a chunk of text – not even a very large one.

With encryption systems, keys and the encryption engine don’t need to be the same product. Even your remote key manager doesn’t need to be the same as the central service if you need something different for that particular project.

We have seen large encryption projects fail because they tried to shoehorn everything into a single monolithic stack. You can increase your chances for success by allowing some flexibility in remote tools, so long as they meet your security requirements. This is especially true for the encryption engines that perform actual crypto operations.

Provide APIs, SDKs, and toolkits

Even off-the-shelf encryption engines sometimes ship with less than ideal defaults, and can easily be used incorrectly. Building a key management service isn’t merely creating a central key manager – you also need to provide hooks to support projects, along with processes and guidance to ensure they are able to get up and running quickly and securely.

  • Application Programming Interfaces: Most key management tools already support APIs, and this should be a selection requirement. Make sure you support RESTful APIs, which are particularly ubiquitous in the cloud and containers. SOAP APIs are considered burdensome these days.
  • Software Development Kits: SDKs are pre-built code modules that allow rapid integration into custom applications. Provide SDKs for common programming languages compatible with your key management service/products. If possible you can even pre-configure them to meet your encryption requirements and integrate with your service.
  • Toolkits: A toolkit includes all the technical pieces a team needs to get started. It can include SDKs, preconfigured software agents, configuration files, and anything else a project might need to integrate encryption into anything from a new application to an old tape backup system.
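
To make the API point concrete, here is a sketch of how an SDK helper might assemble a RESTful call to a key manager. The endpoint layout, header names, and JSON fields are all assumptions for illustration, not any specific product’s API; the request is built but never sent:

```python
import json
import urllib.request

def build_key_request(base_url, key_id, api_token):
    """Assemble (but do not send) a REST call to a hypothetical
    key management endpoint. URL structure and headers are assumed."""
    body = json.dumps({"keyId": key_id, "operation": "get"}).encode()
    return urllib.request.Request(
        url=f"{base_url}/v1/keys/{key_id}",
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Wrapping request construction like this in an SDK is exactly how a central team can bake its security requirements (authentication, TLS endpoints, approved operations) into something project teams just call.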

Provide templates and recommendations, not just standards and requirements

All too often security sends out requirements, but fails to provide specific instructions for meeting those requirements. One of the advantages of standardization around a smaller set of tools is that you can provide detailed recommendations, instructions, and templates to satisfy requirements.

The more detail you can provide the better. We recommend literally creating instructional documents for how to use all approved tools, likely with screenshots, to meet encryption needs and integrate with your key management service. Make them easily available, perhaps through code repositories to better support application developers. On the operations side, include them not only for programming and APIs, but for software agents and integration into supported storage repositories and backup systems.

If a project comes up which doesn’t fit any existing toolkit or recommendations, build them with that project team and add the new guidance to your central repository. This dramatically speeds up encryption initiatives for existing and new platforms.

Meet core security requirements

So far we have focused on newer requirements to meet evolving data center architectures, the impact of the cloud, and new application design patterns; but all the old key management practices still apply:

  • Enforce separation of duties: Implement multiple levels of administrators. Ideally require dual authorities for operations directly impacting key security and other major administrative functions.
  • Support key rotation: Ideally key rotation shouldn’t create downtime. This typically requires both support in the key manager and configuration within encryption engines and agents.
  • Enable usage logs for audit, including purpose codes: Logs may be required for compliance, but are also key for security. Purpose codes tell you why a key was requested, not just by whom or when.
  • Support standards: Whatever you use for key management must support both major encryption standards and key exchange/management standards. Don’t rely on fully proprietary systems that will overly limit your choices.
  • Understand the role of FIPS and its different flavors, and ensure you meet your requirements: FIPS 140-2 is the most commonly accepted standard for cryptographic modules and systems. Many products advertise FIPS compliance (which is often a requirement for other compliance, such as PCI). But FIPS is a graded standard with different levels, ranging from a software module, to plugin cards, to a fully tamper-resistant dedicated appliance. Understand your FIPS requirements, and if you evaluate a “FIPS certified” appliance, don’t assume the entire appliance is certified – it might be only the software, not the whole system. You may not always need the highest level of assurance, but start by understanding your requirements, and then ensure your tool actually meets them.
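
The key rotation point above is often satisfied with envelope encryption: bulk data is encrypted under small per-record data keys, and rotating the master key only rewraps those data keys – the bulk data is never re-encrypted, so rotation needn’t create downtime. The sketch below shows only the structure; the XOR “cipher” is a toy placeholder, not real cryptography:

```python
import secrets

def xor(data, key):
    # toy stand-in for a real cipher -- never use for actual encryption
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

class EnvelopeStore:
    """Envelope encryption sketch: each record gets its own data key,
    which is wrapped (encrypted) under the master key."""
    def __init__(self):
        self.master = secrets.token_bytes(32)
        self.records = []  # list of (wrapped_data_key, ciphertext)

    def put(self, plaintext):
        dk = secrets.token_bytes(32)
        self.records.append((xor(dk, self.master), xor(plaintext, dk)))

    def get(self, index):
        wrapped, ct = self.records[index]
        dk = xor(wrapped, self.master)  # unwrap the data key
        return xor(ct, dk)

    def rotate_master(self):
        """Rotation touches only the small wrapped keys, not bulk data."""
        new_master = secrets.token_bytes(32)
        self.records = [(xor(xor(w, self.master), new_master), ct)
                        for w, ct in self.records]
        self.master = new_master
```

The design choice to notice: after `rotate_master()`, every ciphertext is byte-for-byte unchanged, which is what makes zero-downtime rotation practical at scale.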

There are many more technical best practices beyond the scope of this research, but the core advice that might differ from what you have seen in the past is:

  • Provide key management as a service to meet diverse encryption needs.
  • Be able to support distributed architectures and a range of use cases.
  • Be flexible on tool choice, then provide technical components and clear guidance on how to properly use tools and integrate them into your key management program.
  • Don’t neglect core security requirements.

In our next section we will start looking at specific use cases, some of which we have already hinted at.


Summary: June 3, 2016

By Adrian Lane

Adrian here.

Unlike my business partners, who have been logging thousands of air miles, speaking at conferences and with clients around the country, I have been at home. And the mildest spring in Phoenix’s recorded history has been a blessing, as we’re 45 days past the point 100F days typically start. Bike rides. Hiking. Running. That is, when I get a chance to sneak outdoors and enjoy it. With our pivot there is even more writing and research going on than normal, which I wasn’t sure was possible. You will begin to see the results of this work within the next couple weeks, and we look forward to putting a fresh face on our business. That launch will coincide with us posting lots more hands-on advice for cloud security and migrations.

And as a heads-up, I will be talking big data security over at SC Magazine on the 20th. I’ll tweet out a link at @AdrianLane next week if you’re interested.

You can subscribe to only the Friday Summary.

Top Posts for the Week

Tool of the Week

“Server-less computing? What do you mean?” Rich and I were discussing cloud deployment options with one of the smartest engineering managers I know, and he was totally unaware of serverless cloud computing architectures. If he was unaware of this capability, lots of other people probably are as well. So this week’s Tool of the Week section will discuss not a single tool, but instead a functional paradigm offered by multiple cloud vendors. What are they? Google’s GCP page best captures the idea: essentially a “lightweight, event-based, asynchronous solution that allows you to create small, single-purpose functions that respond to Cloud events without the need to manage a server or a runtime environment”. What Google does not mention there is that these functions tend to be very fast, and you can run multiple copies in parallel to scale capacity.

It really embodies microservices. You can construct an entire application from these functions. For example, take a stream of data and run it through a series of functions to process it. It could be audio or image files, or real-time event data inspection, transformation, enrichment, comparison… or any combination you can think of. The best part? There is no server. There is no OS to set up. No CPU or disk capacity to specify. No configuration files. No network ports to manage. It’s simply a logical function running out there in the ‘ether’ of your public cloud.
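A function in this model is just a small, stateless handler invoked once per event. The sketch below follows the Lambda-style `handler(event, context)` signature; the event shape and the enrichment logic are invented purely for illustration:

```python
def handle_event(event, context=None):
    """Minimal serverless-style handler: stateless, single-purpose,
    invoked per event. Flags transfers to destinations outside a
    (made-up) allow list -- the kind of enrichment step you might
    chain into a processing pipeline."""
    record = event.get("record", {})
    allowed = {"backup.example.com", "warehouse.example.com"}
    record["suspicious"] = record.get("dest") not in allowed
    return record
```

The platform handles invocation, scaling, and the runtime; your code is only the body of the function, which is why there is no server, OS, or capacity planning involved.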

Google calls its version on GCP Cloud Functions. Amazon’s version on AWS is called Lambda (http://docs.aws.amazon.com/lambda/latest/dg/welcome.html). Microsoft calls the version on Azure simply Functions. Check out their API documentation – they all work slightly differently, and some have specific storage requirements to act as endpoints, but the concept is the same. And the pricing for these services is pretty low – with Lambda, for example, the first million requests are free, and Amazon charges 20 cents per million requests after that.

This feature is one of the many reasons we tell companies to reconsider their application architectures when moving to cloud services. We’ll post some tidbits on security for these services in the future. For now, check them out!

Securosis Blog Posts this Week

Training and Events

—Adrian Lane

Thursday, June 02, 2016

Incident Response in the Cloud Age: In Action

By Mike Rothman

When we do a process-centric research project, it works best to wrap up with a scenario to illuminate the concepts we discuss through the series, and make things a bit more tangible.

In this situation imagine you work for a mid-sized retailer which uses a mixture of in-house technology and SaaS, and has recently moved a key warehousing system to an IaaS provider as part of rebuilding the application for cloud computing. You have a modest security team of 10, which is not enough, but a bit more than many of your peers. Senior management understands why security is important (to a point) and gives you decent leeway, especially regarding the new IaaS application. In fact you were consulted during the IaaS architecture phase and provided some guidance (with some help from your friends at Securosis) on building a resilient cloud network architecture, and how to secure the cloud control plane. You also had an opportunity to integrate some orchestration and automation technology into the new cloud technology stack.

The Trigger

You have your team on fairly high alert, because a number of your competitors have recently been targeted by an organized crime ring which gained a foothold in their networks and proceeded to steal a ton of information about customers, pricing, and merchandising strategies. This isn’t your first rodeo, so you know that where there is smoke there is usually fire, and you decide to task one of your more talented security admins with a little proactive hunting in your environment. Just to make sure nothing bad is going on.

The admin starts poking around, searching internal security data for some of the more recent malware samples found in the attacks on the other retailers. The samples were provided by your industry’s ISAC (Information Sharing and Analysis Center). The admin got a hit on one of the samples, confirming your concern: you have an active adversary on your network. So now you need to engage your incident response process.

Job 1: Initial Triage

Once you know there is a situation you assemble the response team. There aren’t that many of you, and half the team needs to pay attention to ongoing operational tasks, because taking down systems wouldn’t make you popular with senior management or investors. You also don’t want to jump the gun until you know what you’re dealing with, so you inform the senior team of the situation, but don’t take any systems down. Yet.

The adversary is active on your internal network, so they most likely entered via phishing or another social engineering attack. Searches found indications of the malware on 5 devices, so you take those devices off the network immediately. Not shut down, but put on a separate network with Internet access to avoid tipping off the adversary to their discovery.

Then you check your network forensics tool, looking for indications that data has been leaking. There are a few suspicious file transfers, but luckily you integrated your firewall egress filtering capability with your forensics tool. So once the firewall showed anomalous traffic being sent to known bad sites (via a threat intelligence integration on the firewall), you automatically started capturing network traffic from the devices which triggered the alert. Automation is sure easier than doing everything manually.
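That automation chain can be sketched as a simple event handler. The alert fields, the threat intelligence feed, and the capture hook are all placeholders for whatever your firewall and forensics tools actually expose:

```python
def on_firewall_alert(alert, threat_intel, start_capture):
    """Sketch of the egress automation described above: when outbound
    traffic matches a known-bad destination from threat intel, kick
    off a packet capture for the source device. `start_capture` is
    whatever hook your forensics tool provides (assumed interface)."""
    if alert["dest_ip"] in threat_intel:
        start_capture(alert["src_ip"])
        return True
    return False
```

Wiring alerts to captures this way means the evidence starts accumulating the moment the firewall fires, not after an analyst gets around to the ticket.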

As part of your initial triage you got endpoint telemetry alerting you to issues, and network forensics data for a clue to what’s leaking. This is enough to know you not only have an active adversary, but more than likely lost data as well. So you fire up your case management system to structure the investigation and store all its artifacts.

Your team is tasked with specific responsibilities, and sent on their way to get things done. You make the trek to the executive floor to keep senior management updated on the incident.

Check the Cloud

The attack seems to have started on your internal network, but you don’t want to take chances, and you need to make sure the new cloud-based application isn’t at risk. A quick check of the cloud console shows strange activity on one of your instances. A device within the presentation layer of the cloud stack was flagged by your IaaS provider’s monitoring system because there was an unauthorized change on that specific instance. It looks like the time you spent setting up that configuration monitoring service was well spent.

Security was involved in architecting the cloud stack, so you are in good shape. The application was built to be isolated. Even though it appears the presentation layer has been compromised, adversaries shouldn’t be able to reach anything of value. And the clean-up has already happened. Once the IaaS monitoring system threw an alert, that instance was taken offline and put into a special security group accessible only by investigators. A forensic server was spun up, and some additional analysis was performed. Orchestration and automation facilitating incident response again.
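The quarantine workflow described above might look something like this in code. Here `cloud_api` stands in for your IaaS provider’s SDK; every method name is an assumed thin wrapper, not a real API:

```python
def quarantine_instance(instance_id, cloud_api):
    """Sketch of automated cloud incident response: isolate the
    flagged instance for investigators, remove it from its scaling
    group (a clean replacement spins up automatically), and launch
    a forensics server. All cloud_api methods are assumed wrappers."""
    # restrict access to the investigators-only security group
    cloud_api.set_security_groups(instance_id, ["sg-forensics-only"])
    # detach so auto scaling replaces it with a clean instance
    cloud_api.detach_from_autoscaling(instance_id)
    # spin up an analysis box alongside the quarantined instance
    return cloud_api.launch_instance(image="forensics-ami")
```

Because the servers are immutable, quarantine is non-disruptive: the application keeps running on fresh instances while investigators work on the isolated one.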

The presentation layer has large variances in how much traffic it needs to handle, so it was built using auto-scaling technology and immutable servers. Once the (potentially) compromised instance was removed from the group, another instance with a clean configuration was spun up to share the workload. But it’s not clear whether this attack is related to the other incident, so you take the information about the cloud attack, pull it down, and feed it into your case management system. But the reality is that this attack, even if related, doesn’t present a danger at this point, so it’s put to the side while you focus on the internal attack and probable exfiltration.

Building the Timeline

Now that you have completed initial triage, it’s time to dig into the attack and start building a timeline of what happened. You start by looking at the compromised endpoints and network metadata to see what the adversaries did. From examining endpoint telemetry you deduced that Patient Zero was a contractor on the Human Resources (HR) team. This individual was tasked with looking at resumes submitted to the main HR email account, and initial qualification screening for an open position. The resume was a compromised Word file using a pretty old Windows 7 attack. It turns out the contractor was using their own machine, which hadn’t been patched and was vulnerable. You can’t be that irritated with the contractor – it was their job to open those files. The malware rooted the device, connected up to a botnet, and then installed a Remote Access Trojan (RAT) to allow the adversary to take control of the device and start a systematic attack against the rest of your infrastructure.

You ponder how your organization’s BYOD policy enables contractors to use their own machines. The operational process failure was in not inspecting the machine on connection to the network; you didn’t make sure it was patched, or running an authorized configuration. That’s something to scrutinize as part of the post-mortem.

Once the adversary had presence on your network, they proceeded to compromise another 4 devices, ultimately ending up on both the CFO’s and the VP of Merchandising’s devices. Network forensic metadata shows how they moved laterally within the network, taking advantage of weak segmentation between internal networks. There are only so many hours in the day, and the focus had been on making sure the perimeter was strong and monitoring ingress traffic.

Once you know the CFO’s and VP of Merchandising’s devices were compromised, you can clearly see exfiltration in network metadata. A quick comparison of file sizes in data captured once the egress filter triggered shows that they probably got the latest quarterly board report, as well as a package of merchandising comps and plans for an exclusive launch with a very hot new fashion company. It was a bit of a surprise that the adversary didn’t bother encrypting the stolen data, but evidently they bet that a mid-sized retailer wouldn’t have sophisticated DLP or egress content filtering. Maybe they just didn’t care whether anyone found out what was exfiltrated after the fact, or perhaps they were in a hurry and valued getting the data over remaining undiscovered.

You pat yourself on the back, once, that your mature security program included an egress filter that triggered a full packet capture of outbound traffic from all the compromised devices. So you know exactly what was taken, when, and where it went. That will be useful later, when talking to law enforcement and possibly prosecuting at some point, but right now it’s little consolation.

Cleaning up the Mess

Now that you have an incident timeline, it’s time to clean up and return your environment to a good state. The first step is to clean up the affected machines. Executives are cranky because you decided to reimage their machines, but your adversary worked to maintain persistence on compromised devices in other attacks, so prudence demands you wipe them.

The information on this incident will need to be aggregated, then packaged up for law enforcement and the general counsel, in preparation for the unavoidable public disclosure. You take another note that the case management system earned its keep – tracking incident activity, providing a place to store case artifacts, and ensuring proper chain of custody. Given your smaller team, that should help smooth your next incident response.

Finally, this incident was discovered by a savvy admin hunting across your networks. So to complete the active part of this investigation, you task the same admin with hunting back through the environment to make sure this attack has been fully eradicated, and no similar attacks are in process. Given the size of your team, it’s a significant choice to devote resources to hunting, but given the results, this is an activity you will need to perform on a monthly cadence.

Closing the Loop

To finalize this incident, you hold a post-mortem with the extended team, including representatives from the general counsel’s office. The threat intelligence being used needs to be revisited and scrutinized, because the adversary connected to a botnet but wasn’t detected. And the rules on your egress filters have been tightened because if the exfiltrated data had been encrypted, your response would have been much more complicated. The post-mortem also provided a great opportunity to reinforce the importance of having security involved in application architecture, given how well the new IaaS application stood up under attack.

Another reminder that sometimes a skilled admin who can follow their instincts is the best defense. Tools in place helped accelerate response and root cause identification, and made remediation more effective. But Incident Response in the Cloud Age involves both people and technology, along with internal and external data, to ensure effective and efficient investigation and successful remediation.

—Mike Rothman