Wednesday, October 07, 2015

Building a Threat Intelligence Program: Using TI

By Mike Rothman

As we dive back into the Threat Intelligence Program, we have summarized why a TI program is important and how to gather intelligence. Now we need a programmatic approach for using TI to improve your security posture and accelerate your response & investigation functions.

To reiterate (because it has been a few weeks since the last post), TI allows you to benefit from the misfortune of others, meaning it’s likely that other organizations will get hit with attacks before you, so you should learn from their experience. Like the old quote, “Wise men learn from their mistakes, but wiser men learn from the mistakes of others.” But knowing what’s happened to others isn’t enough. You must be able to use TI in your security program to gain any benefit.

First things first. We have plenty of security data available today. So the first step in your program is to gather the appropriate security data to address your use case. That means taking a strategic view of your data collection process, both internally (collecting your data) and externally (aggregating threat intelligence). As described in our last post, you need to define your requirements (use cases, adversaries, alerting or blocking, integrating with monitors/controls, automation, etc.), select the best sources, and then budget for access to the data.

This post will focus on using threat intelligence. First we will discuss how to aggregate TI, then how to use it to solve key use cases, and finally how to tune your ongoing TI gathering process to get maximum value from the TI you collect.

Aggregating TI

When aggregating threat intelligence the first decision is where to put the data. You need it somewhere it can be integrated with your key controls and monitors, and provide some level of security and reliability. Even better if you can gather metrics regarding which data sources are the most useful, so you can optimize your spending. Start by asking some key questions:

  • To platform or not to platform? Do you need a standalone platform or can you leverage an existing tool like a SIEM? Of course it depends on your use cases, and the amount of manipulation & analysis you need to perform on your TI to make it useful.
  • Should you use your provider’s portal? Each TI provider offers a portal you can use to get alerts, manipulate data, etc. Will it be good enough to solve your problems? Do you have an issue with some of your data residing in a TI vendor’s cloud? Or do you need the data to be pumped into your own systems, and how will that happen?
  • How will you integrate the data into your systems? If you do need to leverage your own systems, how will the TI get there? Are you depending on a standard format like STIX/TAXII? Do you expect out-of-the-box integrations?

Obviously these questions are pretty high-level, and you’ll probably need a couple dozen follow-ups to fully understand the situation.
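To make the integration question concrete, here is a minimal sketch (Python, standard library only) of pulling simple indicators out of a STIX 2.x-style bundle. The sample bundle and the `extract_ipv4_indicators` helper are hypothetical, and real feeds use far richer patterns than this simple equality case:

```python
import json
import re

def extract_ipv4_indicators(bundle_json):
    """Pull IPv4 addresses out of STIX 2.x indicator patterns.

    Only handles simple equality patterns such as
    [ipv4-addr:value = '203.0.113.5']; real feeds are messier.
    """
    bundle = json.loads(bundle_json)
    ips = []
    for obj in bundle.get("objects", []):
        if obj.get("type") != "indicator":
            continue
        match = re.search(r"ipv4-addr:value\s*=\s*'([^']+)'", obj.get("pattern", ""))
        if match:
            ips.append(match.group(1))
    return ips

# Illustrative bundle with one indicator and one non-indicator object.
feed = json.dumps({
    "type": "bundle",
    "id": "bundle--example",
    "objects": [
        {"type": "indicator", "pattern": "[ipv4-addr:value = '203.0.113.5']"},
        {"type": "malware", "name": "dropper"},
    ],
})
print(extract_ipv4_indicators(feed))  # -> ['203.0.113.5']
```

In practice a TI platform or TAXII client library handles this parsing for you; the point is that the feed is just structured data you can route into your own systems.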

Selecting the Platform

In a nutshell, if you have a dedicated team to evaluate and leverage TI, have multiple monitoring and/or enforcement points, or want more flexibility in how broadly you use TI, you should probably consider a separate intelligence platform or ‘clearinghouse’ to manage TI feeds. Assuming that’s the case, here are a few key criteria to consider when selecting a stand-alone threat intelligence platform:

  1. Open: The TI platform’s task is to aggregate information, so it must be easy to get information into it. Intelligence feeds are typically just data (often XML), and increasingly distributed in industry-standard formats such as STIX, which make integration relatively straightforward. But make sure any platform you select will support the data feeds you need. Be sure you can use the data that’s important to you, and not be restricted by your platform.
  2. Scalable: You will use a lot of data in your threat intelligence process, so scalability is essential. But computational scalability is likely more important than storage scalability – you will be intensively searching and mining aggregated data, so you need robust indexing. Unfortunately scalability is hard to test in a lab, so ensure your proof of concept testbed is a close match for your production environment, and that you can extrapolate how the platform will scale in your production environment.
  3. Search: Threat intelligence, like the rest of security, doesn’t lend itself to absolute answers. So make TI the beginning of your process of figuring out what happened in your environment, and leverage the data for your key use cases as we described earlier. One clear requirement for all use cases is search. Be sure your platform makes searching all your TI data sources easy.
  4. Scoring: Using Threat Intelligence is all about betting on which attackers, attacks, and assets are most important to worry about, so a flexible scoring mechanism offers considerable value. Scoring factors should include assets, intelligence sources, and attacks, so you can calculate a useful urgency score. It might be as simple as red/yellow/green, depending on the sophistication of your security program.
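As a rough illustration of the scoring criterion above, here is a toy red/yellow/green calculation combining asset value, source confidence, and attack severity. The weights and thresholds are purely illustrative assumptions, not recommendations:

```python
def urgency_score(asset_value, source_confidence, attack_severity):
    """Combine three factors (each 0.0-1.0) into a red/yellow/green score.

    Weights and thresholds are illustrative; tune them to your program.
    """
    score = 0.4 * asset_value + 0.3 * source_confidence + 0.3 * attack_severity
    if score >= 0.7:
        return "red"
    if score >= 0.4:
        return "yellow"
    return "green"

print(urgency_score(1.0, 0.9, 0.8))  # critical asset, trusted feed -> red
print(urgency_score(0.2, 0.5, 0.3))  # low-value asset -> green
```

Even a simple weighted score like this forces you to be explicit about which assets and sources matter most.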

Key Use Cases

Our previous research has focused on how to address these key use cases, including preventative controls (FW/IPS), security monitoring, and incident response. But a programmatic view requires expanding the general concepts around use cases into a repeatable structure, to ensure ongoing efficiency and effectiveness.

The general process to integrate TI into your use cases is consistent, with some variations we will discuss below under specific use cases.

  1. Integrate: The first step is to integrate the TI into the tools for each use case, which could be security devices or monitors. That may involve leveraging the tools’ management consoles to pull in the data and apply the controls. For simple TI sources such as IP reputation, this direct approach works well. For more complicated data sources you’ll want to perform some aggregation and analysis on the TI before updating rules running on the tools. In that case you’ll expect your TI platform to integrate with the tools.
  2. Test and Trust: The key concept here is trustable automation. You want to make sure any rule changes driven by TI go through a testing process before being deployed for real. That involves monitor mode on your devices, and ensuring changes won’t cause excessive false positives or take down any networks (in the case of preventative controls). Given the general resistance of many network operational folks to automation, it may be a while before everyone trusts automatic changes, so factor that into your project planning.
  3. Tuning via Feedback: In our dynamic world the rules that work today and the TI that is useful now will need to evolve. So you’ll constantly be tuning your TI and rulesets to optimize for effectiveness and efficiency. You are never done, and will constantly need to tune and assess new TI sources to ensure your defenses stay current.
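The ‘test and trust’ step above might be sketched as follows. This is a simplified model, with made-up thresholds, of a TI-driven rule that stays in monitor mode until its observed false-positive rate proves acceptable:

```python
class StagedRule:
    """A TI-driven rule that starts in monitor mode and is only
    promoted to blocking once its observed false-positive rate is
    acceptably low. Thresholds here are placeholders."""

    def __init__(self, indicator, max_fp_rate=0.05, min_hits=100):
        self.indicator = indicator
        self.mode = "monitor"
        self.hits = 0
        self.false_positives = 0
        self.max_fp_rate = max_fp_rate
        self.min_hits = min_hits

    def record_hit(self, was_false_positive):
        self.hits += 1
        if was_false_positive:
            self.false_positives += 1

    def try_promote(self):
        if self.hits < self.min_hits:
            return self.mode  # not enough evidence yet
        if self.false_positives / self.hits <= self.max_fp_rate:
            self.mode = "block"
        return self.mode

rule = StagedRule("203.0.113.0/24")
for i in range(100):
    rule.record_hit(was_false_positive=(i < 2))  # 2% false positives
print(rule.try_promote())  # -> block
```

The same pattern, however it’s implemented, is what eventually earns the network operations team’s trust in automated changes.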

From a programmatic standpoint, you can look back to our Applied Threat Intelligence research for granular process maps covering the integration of threat intelligence with each use case.

Preventative Controls

The idea when using TI within a preventative control is to use external data to identify what to look for before it impacts your environment. By ‘preventative’ we mean any control that is inline and can prevent attacks, not just alert. These include:

  • Network security devices: This category encompasses firewalls (including next-generation models) and Intrusion Prevention Systems. But you might also include devices such as web application firewalls, which operate at different levels in the stack but are inline and can block attacks.
  • Content security devices/services: Web and email filters can also function as preventative controls because they inspect traffic as it passes through, and can enforce policies to block attacks.
  • Endpoint security technologies: Protecting an endpoint is a broad mandate, and can include traditional endpoint protection (anti-malware) and newfangled advanced endpoint protection technologies such as isolation and advanced heuristics.

We want to use TI to block recognized attacks, but not crater your environment with false positives, or adversely impact availability.

So the greatest sensitivity, and the longest period of test and trust, will be for preventative controls. You only get one opportunity to take down your network with an automated TI-driven rule set, so make sure you are ready before you deploy blocking rules operationally.

Security Monitoring

Our next case uses Threat Intelligence to make security monitoring more effective. As we’ve written countless times, security monitoring is necessary because you simply cannot prevent everything, so you need to get better and faster at responding. Improving detection is critical to effectively shortening the window between compromise and discovery.

Why is this better than just looking for well-established attack patterns like privilege escalation or reconnaissance, as we learned in SIEM school? The simple answer is that TI data represents attacks happening right now on other networks. Attacks you otherwise wouldn’t see or know to look for until too late. In a security monitoring context leveraging TI enables you to focus your validation/triage efforts, detect faster and more effectively, and ultimately make better use of scarce resources which need to be directed at the most important current risk.

  • Aggregate Security Data: The foundation for any security monitoring process is internal security data. So before you can worry about external threat intel, you need to enumerate devices to monitor in your environment, scope out the kinds of data you will get from them, and define collection policies and correlation rules. Once this data is available in a repository for flexible, fast, and efficient search and analysis you are ready to start integrating external data.
  • Security Analytics: Once the TI is integrated, you let the advanced math of your analytics engine do its magic, correlating and alerting on situations that warrant triage and possibly deeper investigation.
  • Action/Escalation: Once you have an alert, and have gathered data about the device and attack, you need to determine whether the device was actually compromised or the alert was a false positive. Once you verify an attack you’ll have a lot of data to send to the next level of escalation – typically an incident response process.
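A minimal sketch of the detection step described above: matching internal security events against an aggregated TI indicator set. The event schema and `triage` helper are hypothetical; a real SIEM would do this with correlation rules and analytics rather than a list comprehension:

```python
def triage(events, bad_ips):
    """Flag log events whose source IP appears in a TI indicator set.

    Event shape is illustrative; adapt to your SIEM's schema.
    """
    bad = set(bad_ips)  # set lookup keeps matching fast at scale
    return [e for e in events if e["src_ip"] in bad]

events = [
    {"src_ip": "203.0.113.5", "dst": "web01", "action": "login_failed"},
    {"src_ip": "198.51.100.7", "dst": "web01", "action": "login_ok"},
]
alerts = triage(events, ["203.0.113.5"])
print(len(alerts))  # -> 1
```

The value is in the focus: only events matching current indicators get escalated for validation, rather than every anomaly in the logs.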

The margin for error is a bit larger when integrating TI into a monitoring context than a preventative control, but you still don’t want to generate a ton of false positives and have operational folks running around chasing them. Testing and tuning processes remain critical (are you getting the point yet?) to ensure that TI provides sustainable benefit instead of just creating more work.

Incident Response

Similar to the way threat intelligence helps with security monitoring, you can use TI to focus investigations on the devices most likely to be impacted, help identify adversaries, and lay out their tactics to streamline your response. Just to revisit the general steps of an investigation, here’s a high-level view of incident response:

  • Phase 1: Current Assessment: This involves triggering your process and escalating to the response team, then triaging the situation to figure out what’s really at risk. A deeper analysis follows to prove or disprove your initial assessment and figure out whether it’s a small issue or a raging fire.
  • Phase 2: Investigate: Once the response process is fully engaged you need to get the impacted devices out of harm’s way by quarantining them and taking forensically clean images for chain of custody. Then you can start to investigate the attack more deeply to understand your adversary’s tactics, build a timeline of the attack, and figure out what happened and what was lost.
  • Phase 3: Mitigation and Clean-up: Once you have completed your investigation you can determine the appropriate mitigations to eradicate the adversary from your environment and clean up the impacted parts of the network. The goal is to return to normal business operations as quickly as possible. Finally you’ll want a post-mortem after the incident is taken care of, to learn from the issues and make sure they don’t happen again.

The same concepts apply as in the other use cases. You’ll want to integrate the TI into your response process, typically looking to match indicators and tactics against specific adversaries to understand their motives, profile their activities, and get a feel for what is likely to come next. This helps to understand the level of mitigation necessary, and determine whether you need to involve law enforcement.

Optimizing TI Spending

The final aspect of the program for today’s discussion is the need to optimize which data sources you use – especially the ones you pay for. Your system should be tuned to normalize and reduce redundant events, so you’ll need a process to evaluate the usefulness of your TI feeds. Obviously you should avoid overlap when buying feeds, so understand how each intelligence vendor gets their data. Do they use honeypots? Do they mine DNS traffic and track new domain registrations? Have they built a cloud-based malware analysis/sandboxing capability? Categorize vendors by their tactics to help pick the best fit for your requirements.

Once the initial data sources are integrated into your platform and/or controls you’ll want to start tracking effectiveness. How many alerts are generated by each source? Are they legitimate? The key here is the ability to track this data, and if these capabilities are not built into the platform you are using, you’ll need to manually instrument the system to extract this kind of data. Sizable organizations invest substantially in TI data, and you want to make sure you get a suitable return on that investment.
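Tracking per-feed effectiveness can be as simple as counting alerts and confirmed hits per source. A toy sketch, with a hypothetical `FeedScorecard` class, of the kind of instrumentation described above:

```python
from collections import defaultdict

class FeedScorecard:
    """Track alerts per TI feed and how many were validated as real,
    so feeds can be compared when renewal time comes."""

    def __init__(self):
        self.alerts = defaultdict(int)
        self.confirmed = defaultdict(int)

    def record(self, feed, was_legitimate):
        self.alerts[feed] += 1
        if was_legitimate:
            self.confirmed[feed] += 1

    def hit_rate(self, feed):
        """Fraction of a feed's alerts that were legitimate."""
        return self.confirmed[feed] / self.alerts[feed] if self.alerts[feed] else 0.0

card = FeedScorecard()
for _ in range(8):
    card.record("feed_a", True)
for _ in range(2):
    card.record("feed_a", False)
print(card.hit_rate("feed_a"))  # -> 0.8
```

Numbers like these give you a defensible basis for dropping an overlapping or low-value feed at renewal time.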

At this point you have a systematic program in place to address your key use cases with threat intelligence. But taking your TI program to the next level requires you to think outside your contained world. That means becoming part of a community to increase the velocity of your feedback loop, and be a contributor to the TI ecosystem rather than just a taker. So our next post will focus on how you can securely share what you’ve learned through your program to help others.

—Mike Rothman

Tuesday, October 06, 2015

Building Security Into DevOps: Tools and Testing in Detail

By Adrian Lane

Thus far I’ve been making the claim that security can be woven into the very fabric of your DevOps framework; now it’s time to show exactly how. DevOps encourages testing at all phases of the process, and the earlier the better. From the developer’s desktop prior to check-in, to module testing, to a full application stack both pre- and post-deployment – it’s all available to you.

Where to test

  • Unit testing: Unit testing is nothing more than running tests against small sub-components or fragments of an application. These tests are written by the programmer as they develop new functions, and commonly run by the developer prior to code check-in. However, these tests are intended to be long-lived, checked into the source repository along with new code, and run by any subsequent developers who contribute to that code module. For security, these can range from straightforward tests – such as SQL injection against a web form – to more complex attacks specific to the function, such as logic attacks to ensure the new bit of code correctly reacts to a user’s intent. Regardless of intent, unit tests are focused on specific pieces of code, not systemic or transactional in nature. They are intended to catch errors very early in the process, following the Deming ideal that the earlier flaws are identified, the less expensive they are to fix. In building out your unit tests you’ll need to support developer infrastructure to harness them, but also encourage the team culturally to take these tests seriously enough to write good ones. Having multiple team members contribute to the same code, each writing unit tests, helps identify weaknesses the others did not consider.
  • Security regression tests: A regression test validates that recently changed code still functions as intended. In a security context it is particularly important to ensure that previously fixed vulnerabilities remain fixed. For DevOps, regression tests are commonly run in parallel to functional tests – which means after the code stack is built out – but in a dedicated environment, because security testing can be destructive and cause unwanted side effects. Virtualization and cloud infrastructure are leveraged for quick start-up of new test environments. The tests themselves are a combination of home-built test cases created to exploit previously discovered vulnerabilities, supplemented by commercial testing tools available via API for easy integration. Automated vulnerability scanners and dynamic code scanners are a couple of examples.
  • Production runtime testing: As we mentioned in the Deployment section of the last post, many organizations are taking advantage of blue-green deployments to run tests of all types against new production code. While the old code continues to serve user requests, new code is available only to select users or test harnesses. The idea is that the tests run against a real production environment, and the automated environment makes this far easier to set up, and easier to roll back in the event of errors.
  • Other: Balancing thoroughness and timeliness is a battle for most organizations. The goal is to test and deploy quickly, with many organizations that embrace CD releasing new code a minimum of 10 times a day. Both the quality and depth of testing become more important issues: if you’ve massaged your CD pipeline to deliver every hour, but it takes a week to run static or dynamic scans, how do you incorporate those tests? For this reason some organizations do not do automated releases, but instead wrap releases into a ‘sprint’, running a complete testing cycle against the results of the last development sprint. Still others take periodic snapshots of the code and run white box tests in parallel, but do not gate releases on the results, choosing to address findings with new task cards. Another way to look at this problem: just as all your Dev and Ops processes go through iterative and continual improvement, what constitutes ‘done’ for security testing prior to release will need continual adjustment as well. You may add more unit and regression tests over time, shifting more of the load onto developers before they check code in.
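To illustrate the security unit testing described above, here is a small self-contained example: a parameterized database lookup plus a unit test asserting that a classic SQL injection payload returns nothing. The `find_user` helper and the in-memory database are purely illustrative:

```python
import sqlite3

def find_user(conn, username):
    """Look up a user with a parameterized query, so input is
    treated as data rather than executable SQL."""
    cur = conn.execute("SELECT name FROM users WHERE name = ?", (username,))
    return cur.fetchall()

def test_sql_injection():
    """Security unit test: lives alongside the code and runs on every build."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice')")
    # A classic injection payload must not match every row.
    assert find_user(conn, "' OR '1'='1") == []
    # A legitimate lookup still works.
    assert find_user(conn, "alice") == [("alice",)]

test_sql_injection()
print("security unit tests passed")
```

Checked in next to the function it exercises, a test like this keeps a fixed vulnerability fixed for every developer who later touches the module.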

Building a Tool Chain

The following is a list of commonly used security testing techniques, the value they provide, and where they fit into a DevOps process. Many of you reading this will already understand the value of tools, but perhaps not how they fit within a DevOps framework, so we will contrast traditional vs. DevOps deployments. Odds are you will use many, if not all, of these approaches; breadth of testing helps thoroughly identify weaknesses in the code, and better understand if the issues are genuine threats to application security.

  • Static analysis: Static Application Security Testing (SAST) tools examine all code – or runtime binaries – providing a thorough examination for common vulnerabilities. These tools are highly effective at finding flaws, often within code that has been reviewed manually. Most of the platforms have gotten much better at providing analysis that is meaningful to developers, not just security geeks. And many are updating their products to offer full functionality via APIs or build scripts. If you can, avoid tools that require ‘code complete’ or fail to offer APIs for integration into the DevOps process. Also note we’ve seen a slight reduction in use because these tests often take hours or days to run; in a DevOps environment that can rule out inline tests as a gate to certification or deployment. As we mentioned in the ‘Other’ section above, most teams are adjusting by running static analysis scans out of band. We highly recommend keeping SAST testing as part of the process and, if possible, focusing it on new sections of code only to reduce the duration of the scan.
  • Dynamic analysis: Dynamic Application Security Testing (DAST) tools, rather than scanning code or binaries like the SAST tools above, dynamically ‘crawl’ through an application’s interface, testing how the application reacts to inputs. While these scanners do not see what’s going on behind the scenes, they do offer a very real look at how code behaves, and can flush out errors in dynamic code paths that other tests may not see. These tests are typically run against fully built applications, and because they can be destructive, the tools often have settings to allow more aggressive tests to be run in test environments.
  • Fuzzing: In the simplest definition, fuzz testing is essentially throwing lots of random garbage at an application to see whether any specific type of garbage causes it to error. Go to any security conference – Black Hat, DEF CON, RSA, or B-Sides – and fuzzing is the approach most security researchers use to find vulnerable areas of code. Make no mistake, it’s key to identifying misbehaving code that may offer exploitable weaknesses. Over the last 10 years, with Agile development processes and even more with DevOps, we have seen a steady decline in the use of fuzz testing by development and QA teams, because running through a large body of possible malicious inputs takes a lot of time. This is less of an issue with web applications, as attackers don’t have copies of the code, but much more problematic for applications delivered to users (e.g. mobile apps, desktop applications, automobiles). This decline is alarming, and like pen testing, fuzz testing should be a periodic part of your security testing efforts. It can even be performed as a unit test, or as component testing, in parallel to your normal QA efforts.
  • Manual code review: Sure, some organizations find it more than a little scary to fully automate deployments, and they want a human to review changes before new code goes live; that’s understandable. But there are very good security reasons for manual review as well. In an environment as automation-centric as DevOps it may seem antithetical to endorse manual code reviews or security inspection, but they remain a highly desirable addition. Manual reviews often catch obvious stuff that the tests miss, or that a developer misses on first pass. What’s more, not all developers are created equal in their ability to write security unit tests; whether through error or lack of skill, people writing the tests miss things that manual inspections catch. Manual code inspections, at least periodic spot checks of new code, are something you’ll want to add to your repertoire.
  • Vulnerability analysis: Some people equate vulnerability testing with DAST, but they can be different. Things like Heartbleed, misconfigured databases, or Struts vulnerabilities may not be part of your application testing at all, but are critical vulnerabilities within your application stack. Some organizations scan application servers for vulnerabilities, typically as a credentialed user, looking for unpatched software. Some have pen testers probe their applications for issues, looking for weaknesses in configuration and places where security controls were not applied.
  • Version controls: One of the nice side benefits of having build scripts serve both QA and production infrastructure is that Dev, Ops, and QA all stay in sync on the versions of code they use. Still, someone on your team needs to monitor and provide version controls and updates to all parts of the application stack. For example, are those gem files up to date? As with vulnerability scanning above, the open source and commercial software you use should be monitored for new vulnerabilities, with task cards created to introduce patches into the build process. But many vulnerability analysis products don’t cover all the bits and pieces that compose an application. This can be fully automated in house, by adjusting build scripts to pull the latest versions, or you can integrate third-party tools to do the monitoring and alerting. Either way, version control should now be part of your overall security monitoring program, with or without the vulnerability analysis mentioned above.
  • Runtime Protection: This is a new segment of the application security market. While the technical approaches are not new, over the past couple of years we’ve seen greater adoption of some run-time security tools that embed into applications for runtime threat protection. The names of the tools vary (real-time application scanning technologies (RAST), execution path monitoring, embedded application white listing) as do the deployment models (embedded runtime libraries, in-memory execution monitoring, virtualized execution paths), but they share the common goal of protecting applications by looking for attacks in runtime behavior. All of these platforms can be embedded into the build or runtime environment, all can monitor or block, and all adjust enforcement based upon the specifics of the application.
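To illustrate the fuzzing technique described above, here is a tiny harness that feeds random garbage to a toy parser and records any input that fails in an unexpected way. Both the parser and the harness are illustrative sketches; real fuzzers such as AFL or libFuzzer are far more sophisticated:

```python
import random
import string

def parse_record(text):
    """Toy parser under test: expects input shaped like 'key=value'."""
    key, value = text.split("=", 1)  # raises ValueError on malformed input
    return {key: value}

def fuzz(target, runs=1000, seed=42):
    """Throw random garbage at `target` and collect any inputs that
    raise something other than its documented failure mode."""
    rng = random.Random(seed)  # seeded, so failures are reproducible
    unexpected = []
    for _ in range(runs):
        garbage = "".join(rng.choice(string.printable)
                          for _ in range(rng.randint(0, 40)))
        try:
            target(garbage)
        except ValueError:
            pass  # documented failure mode, fine
        except Exception:
            unexpected.append(garbage)  # crash worth investigating
    return unexpected

print(len(fuzz(parse_record)))  # count of unexpected crashes
```

Even a crude harness like this can run as a unit or component test, in parallel with normal QA, rather than being dropped entirely.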


Integrating security findings from application scans into bug tracking systems is technically not that difficult – most products have that as a built-in feature. Actually figuring out what to do with that data once it’s obtained is the hard part. For any security vulnerability discovered: is it really a risk? If it is a risk and not a false positive, what is its priority relative to everything else going on? How is the information distributed? Now, with DevOps, you’ll need to close the loop on issues within the infrastructure as well as the code. And since Dev and Ops both offer potential solutions to most vulnerabilities, the people who manage security tasks need to ensure they include operations teams as well. Patching, code changes, blocking, and functional white listing are all potential methods to close security gaps, so you’ll need both Dev and Ops to weigh the tradeoffs.

In the next post I am going back to the role of security within DevOps. And I will also be going back to pretty much all of the initial posts in this series as I have noted omissions that need to be rectified, and areas that I’ve failed to explain clearly. As always, comments and critique welcome!

—Adrian Lane

Monday, October 05, 2015

New Report: Pragmatic Security for Cloud and Hybrid Networks

By Rich

This is one of those papers I’ve been wanting to write for a while. When I’m out working with clients, or teaching classes, we end up spending a ton of time on just how different networking is in the cloud, and how to manage it. On the surface we still see things like subnets and routing tables, but now everything is wired together in software, with layers of abstraction meant to look the same, but not really work the same.

This paper covers the basics and even includes some sample diagrams for Microsoft Azure and Amazon Web Services, although the bulk of the paper is cloud-agnostic.

From the report:

Over the last few decades we have been refining our approach to network security. Find the boxes, find the wires connecting them, drop a few security boxes between them in the right spots, and move on. Sure, we continue to advance the state of the art in exactly what those security boxes do, and we constantly improve how we design networks and plug everything together, but overall change has been incremental. How we think about network security doesn’t change – just some of the particulars.

Until you move to the cloud.

While many of the fundamentals still apply, cloud computing releases us from the physical limitations of those boxes and wires by fully abstracting the network from the underlying resources. We move into entirely virtual networks, controlled by software and APIs, with very different rules. Things may look the same on the surface, but dig a little deeper and you quickly realize that network security for cloud computing requires a different mindset, different tools, and new fundamentals. Many of which change every time you switch cloud providers.

Special thanks to Algosec for licensing the research. As usual everything was written completely independently using our Totally Transparent Research process. It’s only due to these licenses that we are able to give this research away for free.

The landing page for the paper is here. Direct download: Pragmatic Security for Cloud and Hybrid Networks (pdf)


Saturday, October 03, 2015

Building Security Into DevOps: Security Integration Points

By Adrian Lane

A couple housekeeping items before I begin today’s post - we’ve had a couple issues with the site so I apologize if you’ve tried to leave comments but could not. We think we have that fixed. Ping us if you have trouble.

Also, I am very happy to announce that Veracode has asked to license this research series on integrating security into DevOps! We are very happy to have them onboard for this one. And it’s support from the community and industry that allows us to bring you this type of research, and all for free and without registration.

For the sake of continuity I’ve decided to swap the order of posts from our original outline. Rather than discuss the role of security folks in a DevOps team, I am going to examine integration of security into code delivery processes. I think it will make more sense, especially for those new to DevOps, to understand the technical flow and how things fit together before getting a handle on their role.

The Basics

Remember that DevOps is about joining Development and Operations to provide business value. The mechanics of this are incredibly important as it helps explain how the two teams work together, and that is what I am going to cover today.

Most of you reading this will be familiar with the concept of ‘nightly builds’, where all code checked in the previous day would be compiled overnight. And you’re just as familiar with the morning ritual of sipping coffee while you read through the logs to see if the build failed, and why. Most development teams have been doing this for a decade or more. The automated build is the first of many steps that companies go through on their way towards full automation of the processes that support code development. The path to DevOps is typically done in two phases: first continuous integration, which manages the building and testing of code, and then continuous deployment, which assembles the entire application stack into an executable environment.

Continuous Integration

The essence of Continuous Integration (CI) is that developers check in small iterative advancements to code on a regular basis. For most teams this will involve many updates to the shared source code repository, and one or more ‘builds’ each day. The core idea is smaller, simpler additions, where we can more easily – and more often – find defects in the code. Essentially these are Agile concepts, but implemented in processes that drive code instead of processes that drive people (e.g. scrums, sprints). The definition of CI has morphed slightly over the last decade, but in the context of DevOps, CI implies that code is not only built and integrated with supporting libraries, but also automatically dispatched for testing. Finally, CI in a DevOps context also implies that code modifications are applied not to a branch, but to the main body of the code, reducing the complexity and integration nightmares that plague development teams.

Conceptually this sounds simple, but in practice it requires a lot of supporting infrastructure. It means builds are fully scripted, and the build process runs as code changes are made. It means that upon a successful build, the application stack is bundled and passed along for testing. It means test code is built prior to unit, functional, regression, and security testing, and that these tests commence automatically when a new bundle is available. It also means that before tests can be launched, test systems are automatically provisioned, configured, and seeded with the necessary data. And these automation scripts must monitor each part of the process, communicating success or failure back to the Dev and Operations teams as events occur. Creating the scripts and tools to make all this possible requires operations, testing, and development teams to work closely together. This orchestration does not happen overnight; it is commonly an evolutionary process that takes months to get the basics in place, and years to mature.
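As a toy illustration of that scripted flow, a CI gate can be reduced to an ordered list of stages where any failure stops the pipeline and nothing downstream runs. All stage names here are hypothetical stubs; a real pipeline would invoke build tools, test harnesses, and scanners at each step.

```python
def run_pipeline(stages):
    """Run named stages in order; stop at the first failure."""
    results = {}
    for name, stage in stages:
        ok = stage()
        results[name] = ok
        print(f"{name}: {'PASS' if ok else 'FAIL'}")
        if not ok:
            break  # broken builds never move forward
    return results

# Stub stages -- a real pipeline would invoke compilers, unit tests,
# and security scanners here instead of lambdas.
stages = [
    ("build", lambda: True),
    ("unit_tests", lambda: True),
    ("security_tests", lambda: False),  # a failing check halts the run
    ("package", lambda: True),          # never reached
]
results = run_pipeline(stages)
```

The key property is that packaging never happens when an earlier stage fails, which is exactly the behavior the orchestration scripts must guarantee.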

Continuous Deployment

Continuous Deployment looks very similar to CI, but is focused on the release – as opposed to build – of software to end users. It involves a similar set of packaging, testing, and monitoring, but with some additional wrinkles. The following graphic was created by Rich Mogull to show both the flow of code, from check-in to deployment, and many of the tools that provide automation support.

Sample DevOps Pipeline

Upon successful completion of a CI cycle, the results feed the Continuous Deployment (CD) process. CD takes another giant step forward in terms of automation and resiliency, continuing the theme of building in tools and infrastructure that make development better first, and deliver functions second. CD addresses dozens of issues that plague code deployments, specifically error-prone manual changes and differences in revisions of supporting libraries between production and development. But perhaps most important is the use of code and infrastructure to control deployments, and to roll back in the event of errors. We’ll go into more detail in the following sections.

This is far from a complete description, but hopefully you get the basic idea of how it works. With the mechanics of DevOps in mind, let’s now map security in. What you do with DevOps should stand in stark contrast to what you do today.

Security Integration From An SDLC Perspective

The Secure Development Lifecycle (SDLC), sometimes called the Secure Software Development Lifecycle, describes the different functions within software development. Most people look at the phases of an SDLC and think ‘Waterfall development process’, which makes discussing the SDLC in conjunction with DevOps seem convoluted. But there are good reasons for doing this: the architecture, design, development, testing, and deployment phases of an SDLC map well to roles in the development organization regardless of development process, and they provide a jumping-off point for people to take what they know today and morph it into a DevOps framework.


  • Operational standards: Typically in the early phases of software development you’re focused on the big picture of application architecture and how large functional pieces will work. With DevOps you also weave in the operational standards for the underlying environment. Just as with the code you deploy, you want to make small iterative improvements to your operational environment every day. This includes updates to the infrastructure (e.g. build automation tools, CI tools), but also policies for application stack security: how patches are incorporated, version synchronization across the entire build chain, leveraging of tools and metrics, configuration management, and testing. These standards form the stories sent to the operations team for scripting during the development phase discussed below.
  • Security functional requirements: What security tests will you run, which need to run prior to deployment, and what tools will you use to get there? At a minimum you will want to set security requirements for all new code, and define what the development team needs to test prior to certification. This could mean a battery of unit tests for specific threats your team must write to check for – as an example – the OWASP Top Ten vulnerabilities. Or you may choose commercial products: you have a myriad of security tools at your disposal, and not all of them have APIs or the capability to be fully integrated into DevOps. Similarly, many tests do not run as fast as your deployment cycle, so you have some difficult decisions to make; more on parallel security testing below.
  • Monitoring and metrics: If you’re going to make small iterative improvements with each release, what needs fixing? What is too slow? What is working, and how do you prove it? Metrics are key to answering these questions. You will need to think about what data you want to collect, and build collection into the CI and CD environment to measure how your scripts and tests perform. You will continually evolve the collection and use of metrics, but basic collection and dissemination of data should be in your plan from the get-go.


  • Secure design/architecture: DevOps enables significant advances in security design and architecture. Most notably, because your goal is to automate patching and configuration for deployment, it is possible to entirely disable administrative connections to production servers. Errors and misconfigurations are fixed in build and automation scripts, not through manual logins. Configuration, automated injection of certificates, automated patching, and even pre-deployment validation are all possible. It is also possible to completely disable network ports and access points commonly used for administration, which are a common attack vector. Leveraging deployment APIs from PaaS and IaaS cloud services gives you even more automation choices, which we will discuss later in this paper. DevOps offers a huge improvement to basic system security, but you must specifically design – or re-design – your deployments to leverage the advantages that automated CI and CD provide.
  • Secure the deployment pipeline: With greater control over both the development and production environments, development and test servers become a more attractive target. Traditionally these environments run with little or no security. But there is a greater need to secure source code management, build servers, and the deployment pipeline, given that they can feed – with minimal human intervention – directly into production. You will need stricter controls over access to these systems, specifically build servers and code management. And given less human oversight of scripts running continuously in the background, you will need added monitoring so errors and misuse can be detected and corrected.
  • Threat model: We maintain that threat modeling is one of the most productive exercises in security. DevOps does not change that. It does, however, open up opportunities for security team members both to instruct dev team members on common threat types, and to help them plan unit tests for these types of attacks.


  • Infrastructure and Automation First: You need tools before you can build a house, and you need a road before you can drive a car somewhere. With DevOps, and specifically security within DevOps, integrating tools and building tests happen before you begin developing the next set of features. We stress this point both because it makes planning more important, and because it helps development plan for the tools and tests it needs to deploy before it can deliver new code. The bad news is that there is up-front cost and work to be done; the good news is that every subsequent build leverages the infrastructure and tools you’ve built.
  • Automated and Validated: Remember, it’s not just development that is writing code and building scripts; operations is now up to their elbows in it as well. This is how DevOps takes patching and hardening to a new level. IT’s role in DevOps is to provide build scripts that build out the infrastructure needed for development, testing, and production servers. The good news is that what works in testing should be identical in production. And automation helps eliminate a problem traditional IT has faced for years: ad hoc and undocumented work that runs months, or even years, behind on patching. Again, there is a lot of work to get this fully automated: servers, network configuration, applications, and so on. Most teams we spoke with build new machine images every week, updating the scripts which apply patches and configurations, and maintaining build scripts for different environments. But the work ensures consistency and a secure baseline from which to start.
  • Security Tasks: A core tenet of Continuous Integration is to never check in broken or untested code. What constitutes broken or untested is up to you. Keep in mind that rather than writing giant specification documents for code quality or security – as you used to do for waterfall – you are documenting policies in functional scripts and programs. Unit tests and functional tests not only define security requirements, they enforce them.
  • Security in the Scrum: As we mentioned in the last section, DevOps is process neutral. You can use spiral, Agile, or surgical-team approaches as you wish. That said, Agile Scrum and Kanban techniques are ideally suited for use with DevOps. Their focus on smaller, well-defined, quickly demonstrable tasks is a natural fit. Security tasks are no less important than any other structural or feature improvement. We recommend training at least one person on each team on security basics, and determining which team members have an interest in security topics, to build in-house expertise. This way security tasks can easily be distributed to members with the interest and skill to tackle security-related problems.
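To make the “tests enforce security requirements” idea concrete, here is a minimal sketch of a security requirement – escaping user-supplied HTML to block stored XSS – expressed as unit tests that would gate check-in. The function names are illustrative, not from any particular codebase.

```python
import html

def render_comment(user_input):
    # Escape user-supplied text before it reaches the page,
    # neutralizing the classic stored-XSS payload.
    return "<p>{}</p>".format(html.escape(user_input))

# The security requirement, documented as executable tests:
# check-in is blocked if either of these fails.
def test_script_tags_are_neutralized():
    assert "<script>" not in render_comment('<script>alert(1)</script>')

def test_benign_text_passes_through():
    assert render_comment("hello") == "<p>hello</p>"

test_script_tags_are_neutralized()
test_benign_text_passes_through()
```

In a CI pipeline these would run automatically on every build, turning a policy document into an enforced gate.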


  • Strive for failure: In many ways DevOps turns long-held principles – in both IT and software development – upside down. Durability used to mean ‘uptime’; now it is speed of replacement. Detailed specifications were used to coordinate dev teams; now it’s post-it notes. Quality assurance used to focus on getting code to pass functional requirements; now it looks for ways to break an application before someone else can. It’s this last change in approach which really helps raise the bar on security. Stealing a line from James Wickett’s Gauntlt page: “Be Mean To Your Code - And Like It” embodies the ideal. The goal is not only to build security tests into the automated delivery process, but to greatly raise the bar on what is acceptable to release. We harden an application by intentionally pummeling it with all sorts of functional, stress, and security tests before code goes live, reducing the time hands-on security experts must spend testing code. If you can figure out some way to break your application, odds are an attacker can too, so build the test – and the remedy – before code goes live.
  • Parallelize Security Testing: A problem with all agile development approaches is what to do about tests that take longer than the development cycle. For example, we know that fuzz testing critical pieces of code takes longer than your average sprint in an Agile development model. DevOps is no different in this regard: with CI and CD, code may be delivered to users within hours of being created, so it is simply not possible to perform white-box or dynamic code scans in that window. To address this, DevOps teams run multiple security tests in parallel. Validation against known critical issues is written as unit tests to perform a quick spot check, with failures kicking code back to the development team. Code scanners are commonly run in parallel, against periodic – rather than every – release. Their results are also sent back to development and similarly identify the changes that introduced the vulnerability, but these tests commonly do not gate a release. How to deal with these lagging tests caused headaches for every dev team we spoke with. Focusing scans on specific areas of the code helps find issues faster and minimizes disruption, but this remains an area security and development team members struggle with.
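The split between fast gating checks and slow non-gating scans might be sketched like this, with threads standing in for real scanners. Both functions here are stubs; the finding string is invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def quick_security_checks():
    # Fast spot checks for known critical issues; these gate the release.
    time.sleep(0.01)
    return True

def full_code_scan():
    # Long-running scanner stub; findings go back to development,
    # but this scan does not block the release.
    time.sleep(0.05)
    return ["finding: hypothetical-issue-123"]

with ThreadPoolExecutor() as pool:
    scan_future = pool.submit(full_code_scan)  # kicked off in parallel
    release_ok = quick_security_checks()       # gating check runs inline
    if release_ok:
        print("release gates passed; deploying")
    findings = scan_future.result()            # reviewed out of band
    print("scan findings:", findings)
```

The release decision only waits on the quick checks; the scan's findings arrive later and feed the next iteration rather than blocking this one.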


  • Manual vs. Automated deployment: It’s easy enough to push new code into production. Vetting that code, or rolling back in the event of errors, is always tricky. Most teams we spoke with are not yet comfortable with fully automated deployments. In fact many still release new code to their customers only every few weeks, often in conjunction with the end of a sprint. For these companies most actions are executed through scripts, but the scripts are run manually, when IT and development resources can be on hand to fully monitor the code push. A handful of organizations are fully comfortable with automated pushes to production, and release code several times a day. There is no right answer here, but in either case automation performs the bulk of the work, freeing people up to test and monitor.
  • Deployment and Rollback: To double-check that code which worked in pre-deployment tests still works in the production environment, the teams we spoke with still run ‘smoke’ tests, but they have evolved these tests to incorporate automation and more granular control over rollouts. We typically saw three techniques used to augment deployment. The first, and most powerful, is called Blue-Green – or Red-Black – deployment. Simply put, old code and new code run side by side, each on its own set of servers. Rollout is a simple redirection at the load balancer, and if errors are discovered, the load balancer is pointed back at the old code. The second, canary testing, directs a small number of individual sessions to the new code: first employee testers, then a subset of real customers. If the canary dies (i.e. any errors are encountered), the new code is retired until the issues can be fixed, and the process repeats. And finally, feature tagging, where new code elements are enabled or disabled through configuration files. If errors are discovered in a new section of code, the feature can be toggled off and the code replaced when fixed. The degree of automation and human intervention varies greatly, but overall these deployments are far more automated than traditional web services environments.
  • Production Security Tests: With the deployment models above built into release management scripts, it is fairly easy to make the ‘canaries’ dynamic code scanners, pen testers, or other security-oriented tests. Coupled with test accounts used specifically for somewhat invasive security tests, the risk of data corruption is lowered while still allowing security tests to run in the production environment.
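The Blue-Green cut-over with automatic rollback can be reduced to a few lines. This toy model just tracks which stack the load balancer points at; the smoke test is a stub that pretends the new code failed.

```python
# Toy Blue-Green router: traffic goes to "green" (new code) unless an
# error forces a rollback to "blue" (old code). Names are illustrative.
class Router:
    def __init__(self):
        self.active = "blue"

    def cut_over(self):
        self.active = "green"  # point the load balancer at the new stack

    def rollback(self):
        self.active = "blue"   # errors found: point back at the old stack

def smoke_test():
    return False  # stub: pretend the new code failed its checks

router = Router()
router.cut_over()
if not smoke_test():
    router.rollback()
print("serving from:", router.active)
```

Because the old stack keeps running untouched during the cut-over, rollback is a pointer flip rather than a redeploy, which is what makes the technique so powerful.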

I’ve probably missed a few in this discussion, so please feel free to contribute any ideas you feel should be discussed.

In the next post I am again going to shake up the order of this series, and talk about tools and testing in greater detail. Specifically, I will construct a security tool chain for addressing different types of threats, and show how these fit within DevOps processes.

—Adrian Lane

Tuesday, September 29, 2015

Pragmatic Security for Cloud and Hybrid Networks: Design Patterns

By Rich

This is the fifth post in a new series I’m posting for public feedback, licensed by Algosec. Well, that is if they like it – we are sticking to our Totally Transparent Research policy. I’m also live-writing the content on GitHub if you want to provide any feedback or suggestions. Click here for the first post in the series, [here for post two](https://securosis.com/blog/pragmatic-security-for-cloud-and-hybrid-networks-cloud-networking-101), post 3, post 4.

To finish off this research it’s time to show what some of this looks like. Here are some practical design patterns based on projects we have worked on. The examples are specific to Amazon Web Services and Microsoft Azure, rather than generic templates. Generic patterns are less detailed and harder to explain, and we would rather you understand what these look like in the real world.

Basic Public Network on Microsoft Azure

This is a simplified example of a public network on Azure. All the components run on Azure, with nothing in the enterprise data center, and no VPN connections. Management of all assets is over the Internet. We can’t show all the pieces and configuration settings in this diagram, so here are some specifics:

Basic Public Network on Azure

  • The Internet Gateway is set in Azure by default (you don’t need to do anything). Azure also sets up default service endpoints for the management ports to manage your instances. These connections are direct to each instance and don’t run through the load balancer. They will (should) be limited to only your current IP address, and the ports are closed to the rest of the world. In this example we have a single public facing subnet.
  • Each instance gets a public IP address and domain name, but you can’t access anything that isn’t opened up with a defined service endpoint. Think of the endpoint as port forwarding, which it pretty much is.
  • The service endpoint can point to the load balancer, which in turn is tied to the auto scale group. You set rules on instance health, performance, and availability; the load balancer and auto scale group provision and deprovision servers as needed, and handle routing. The IP addresses of the instances change as these updates take place.
  • Network Security Groups (NSGs) restrict access to each instance. In Azure you can also apply them to subnets. In this case we would apply them on a per-server basis. Traffic would be restricted to whatever services are being provided by the application, and would deny traffic between instances on the same subnet. Azure allows such internal traffic by default, unlike Amazon.
  • NSGs can also restrict traffic to the instances, locking it down to only from the load balancer and thus disabling direct Internet access. Ideally you never need to log into the servers because they are in an auto scale group, so you can also disable all the management/administration ports.

There is more, but this pattern produces a hardened server, with no administrative traffic, protected with both Azure’s default protections and Network Security Groups. Note that on Azure you are often much better off using their PaaS offerings such as web servers, instead of manually building infrastructure like this.
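To illustrate how prioritized allow/deny rules behave, here is a toy model of NSG evaluation. The rule values are invented for the example, and real NSGs match on more fields (protocol, source/destination ranges, direction), but the priority-ordered, default-deny logic is the core idea.

```python
# Toy model of prioritized NSG rules: the lowest priority number wins,
# and traffic matching no rule is denied.
RULES = [
    {"priority": 100, "port": 443, "source": "load_balancer", "action": "allow"},
    {"priority": 200, "port": 22,  "source": "*",             "action": "deny"},
]

def evaluate(port, source):
    for rule in sorted(RULES, key=lambda r: r["priority"]):
        if rule["port"] == port and rule["source"] in ("*", source):
            return rule["action"]
    return "deny"  # default deny for anything unmatched

assert evaluate(443, "load_balancer") == "allow"  # app traffic via the LB
assert evaluate(443, "internet") == "deny"        # direct access blocked
assert evaluate(22, "admin") == "deny"            # management ports closed
```

This is the pattern described above: only the load balancer reaches the application port, and administrative ports stay closed to everyone.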

Basic Private Network on Amazon Web Services

Amazon works a bit differently than Azure (okay – much differently). This example is a Virtual Private Cloud (VPC, their name for a virtual network) that is completely private, without any Internet routing, connected to a data center through a VPN connection.

Basic Private Network on AWS

  • This shows a class B network with two smaller subnets. In AWS you would place each subnet in a different Availability Zone (what we called a ‘zone’) for resilience in case one goes down – they are separate physical data centers.
  • You configure the VPN gateway through the AWS console or API, and then configure the client side of the VPN connection on your own hardware. Amazon maintains the VPN gateway in AWS; you don’t directly touch or maintain it, but you do need to maintain everything on your side of the connection (and it needs to be a hardware VPN).
  • You adjust the routing table on your internal network to send all traffic for the network over the VPN connection to AWS. This is why it’s called a ‘virtual’ private cloud. Instances can’t see the Internet, but you have that gateway that’s Internet accessible.
  • You also need to set your virtual routing table in AWS to send Internet traffic back through your corporate network if you want any of your assets to access the Internet for things like software updates. Sometimes you do, sometimes you don’t – we don’t judge.
  • By default instances are protected with a Security Group that denies all inbound traffic and allows all outbound traffic. Unlike in Azure, instances on the same subnet can’t talk to each other. You cannot connect to them through the corporate network until you open them up. AWS Security Groups offer allow rules only. You cannot explicitly deny traffic – only open up allowed traffic. In Azure you create Service Endpoints to explicitly route traffic, then use network security groups to allow or deny on top of that (within the virtual network). AWS uses security groups for both functions – opening a security group allows traffic through the private IP (or public IP if it is public facing).
  • Our example uses no ACLs, but you could put an ACL in place to block the two subnets from talking to each other. ACLs in AWS exist by default but allow all traffic. An ACL in AWS is not stateful, so you need to create rules for traffic in both directions. ACLs in AWS work better as a deny mechanism.
  • A public network on AWS looks relatively similar to our Azure sample (which we designed to look similar). The key differences are how security groups and service endpoints function.
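The allow-only semantics of AWS Security Groups can be modeled in a few lines: rules can only open traffic, and anything unmatched is simply dropped. The rule values here are illustrative.

```python
# Toy model of an AWS security group: allow rules only -- there is no
# way to write an explicit deny, so unmatched traffic simply drops.
ALLOW_RULES = [
    {"port": 443, "source": "0.0.0.0/0"},      # public HTTPS
    {"port": 5432, "source": "sg-webservers"}, # DB reachable only from web tier
]

def allowed(port, source):
    return any(r["port"] == port and r["source"] in ("0.0.0.0/0", source)
               for r in ALLOW_RULES)

assert allowed(443, "203.0.113.9")       # open to the world
assert allowed(5432, "sg-webservers")    # only the web tier reaches the DB
assert not allowed(5432, "203.0.113.9")  # everyone else is dropped
```

Contrast this with the Azure model, where network security groups layer explicit allow and deny rules on top of service endpoints.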

Hybrid Cloud on Azure

This builds on our previous examples. In this case the web servers and app servers are separated, with app servers on a private subnet. We already explained the components in our other examples, so there is only a little to add:

Hybrid on Azure

  • The key security control here is a Network Security Group to restrict access to the app servers from ONLY the web servers, and only to the specific port and protocol required.
  • The NSG should be applied to each instance, not to the subnets, to prevent a “flat network” and block peer traffic that could be used in an attack.
  • The app servers can connect to your datacenter, and that is where you route all Internet traffic. That gives you just as much control over Internet traffic as with virtual machines in your own data center.
  • You will want to restrict traffic from your organization’s network to the instances (via the NSGs) so you don’t become the weak link for an attack.

A Cloud Native Data Analytics Architecture

Our last example shows how to use some of the latest features of Amazon Web Services to create a new cloud-native design for big data transfers and analytics.

Data Transfer and Analysis on AWS

  • In this example there is a private subnet in AWS, without either Internet access or a connection to the enterprise data center. Images will be created in either another account or a VPC, and nothing will be manually logged into.
  • When an analytics job is triggered, a server in the data center takes the data and sends it to Amazon S3, their object storage service, using command line tools or custom code. This is an encrypted connection by default, but you could also encrypt the data using the AWS Key Management Service (or any encryption tool you want). We have clients using both options.
  • The S3 bucket in AWS is tightly restricted to either only the IP address of the sending server, or a set of AWS IAM credentials – or both. AWS manages S3 security so you don’t worry about network attacks, merely enable access. S3 isn’t like a public FTP server – if you lock it down (easy to do) it isn’t visible except from authorized sources.
  • A service called AWS Lambda monitors the S3 bucket. Lambda is a container for event-driven code running inside Amazon that can trigger based on internal things, including a new file appearing in an S3 bucket. You only pay for Lambda when your code is executing, so there is no cost to have it wait for events.
  • When a new file appears the Lambda function triggers and launches analysis instances based on a standard image. The analysis instances run in a private subnet, with security group settings that block all inbound access.
  • When the analysis instances launch the Lambda code sends them the location of the data in S3 to analyze. The instances connect to S3 through something known as a VPC Endpoint, which is totally different from an Azure service endpoint. A VPC endpoint allows instances in a totally private subnet to talk to S3 without Internet access (which was required until recently). As of this writing only S3 has a VPC endpoint, but we know Amazon is working on endpoints for additional services such as their Simple Queue Service (we suspect AWS hasn’t confirmed exactly which services are next on the list).
  • The instances boot, grab the data, then do their work. When they are done they go through the S3 VPC Endpoint to drop their results into a second S3 bucket.
  • The first bucket only allows writes from the data center, and reads from the private subnet. The second bucket reverses that and only allows reads from the data center and writes from the subnet. Everything is a one-way closed loop.
  • The instance can then trigger another Lambda function to send a notification back to your on-premise data center or application that the job is complete, and code in the data center can grab the results. There are several ways to do this – for example the results could go into a database, instead.
  • Once everything is complete Lambda moves the original data into Glacier, Amazon’s super-cheap long-term archival storage. In this scenario it is of course encrypted. (For this network-focused research we are skipping over most of the encryption options for this architecture, but they aren’t overly difficult).
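As a sketch of the Lambda trigger step, here is a minimal handler that parses the standard S3 notification event and hands each new object to a stubbed analysis launcher. The AWS API calls that would actually launch instances are omitted, and the bucket and key names are invented.

```python
def launch_analysis(bucket, key):
    # Stub: the real function would call the EC2/auto-scaling APIs to
    # spin up analysis instances and pass them this S3 location.
    return {"job": f"s3://{bucket}/{key}", "status": "launched"}

def handler(event, context=None):
    # Pull the bucket and object key out of each S3 notification record.
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        jobs.append(launch_analysis(bucket, key))
    return jobs

# Sample S3 notification event, trimmed to the fields used above.
sample_event = {"Records": [
    {"s3": {"bucket": {"name": "incoming-data"},
            "object": {"key": "batch-2015-09/data.csv"}}}
]}
print(handler(sample_event))
```

Because Lambda only bills while the handler runs, this glue code costs nothing while it waits for the next file to appear.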

Think about what we have described: the analysis servers have no Internet access, spin up only as needed, and can only read in new data and write out results. They automatically terminate when finished, so there is no persistent data sitting unused on a server or in memory. All Internet-facing components are native Amazon services, so we don’t need to maintain their network security. Everything is extremely cost-effective, even for very large data sets, because we only process when we need it; big data sets are always stored in the cheapest option possible, and automatically shifted around to minimize storage costs. The system is event-driven so if you load 5 jobs at once, it runs all 5 at the same time without any waiting or slowdown, and if there are no jobs the components are just programmatic templates, in the absolute most cost-effective state.

This example does skip some options that would improve resiliency in exchange for better network security. For example we would normally recommend using Simple Queue Service to manage the jobs (Lambda would send them over), because SQS handles situations such as an instance failing partway through processing. But this is security research, not availability focused.


This research isn’t the tip of the iceberg; it’s more like the first itty bitty little ice crystal on top of an iceberg, which stretches to the depths of the deepest ocean trench. But if you remember the following principles you will be fine as you dig into securing your own cloud and hybrid deployments:

  • The biggest difference between cloud and traditional networks is the combination of abstraction (virtualization) and automation. Things look the same but don’t function the same.
  • Everything is managed by software, providing tremendous flexibility, and enabling you to manage network security using the exact same tools that Development and Operations use to manage their pieces of the puzzle.
  • You can achieve tremendous security through architecture. Virtual networks (and multiple cloud accounts) support incredible degrees of compartmentalization, where every project has its own dedicated network or networks.
  • Security groups enhance that by providing the granularity of host firewalls, without the risks of relying on operating systems. They provide better manageability than even most network firewalls.
  • Platform as a Service and cloud-provider-specific services open up entirely new architectural options. Don’t try to build things the way you always have. Actually, if you find yourself doing that, you should probably rethink your decision to use the cloud.

Don’t be intimidated by cloud computing, but don’t think you can or should implement network security the way you always have. Your skills and experiences are still important, and provide a base to build on as you learn all the new options available within the cloud.


Monday, September 28, 2015

Pragmatic Security for Cloud and Hybrid Networks: Building Your Cloud Network Security Program

By Rich

This is the fourth post in a new series I’m posting for public feedback, licensed by Algosec. Well, that is if they like it – we are sticking to our Totally Transparent Research policy. I’m also live-writing the content on GitHub if you want to provide any feedback or suggestions. Click here for the first post in the series, [here for post two](https://securosis.com/blog/pragmatic-security-for-cloud-and-hybrid-networks-cloud-networking-101), and here for post 3.

There is no single ‘best’ way to secure a cloud or hybrid network. Cloud computing is moving faster than any other technology in decades, with providers constantly struggling to out-innovate each other with new capabilities. You cannot lock yourself into any single architecture, but instead need to build out a program capable of handling diverse and dynamic needs.

There are four major focus areas when building out this program.

  • Start by understanding the key considerations for the cloud platform and application you are working with.
  • Design the network and application architecture for security.
  • Design your network security architecture including additional security tools (if needed) and management components.
  • Manage security operations for your cloud deployments – including everything from staffing to automation.

Understand Key Considerations

Building applications in the cloud is decidedly not the same as building them on traditional infrastructure. Sure, you can do it, but the odds are high something will break. Badly. As in “update that resume” breakage. To really see the benefits of cloud computing, applications must be designed specifically for the cloud – including security controls.

For network security this means you need to keep a few key things in mind before you start mapping out security controls.

  • Provider-specific limitations or advantages: All providers are different. Nothing is standard, and don’t expect it to ever become standard. One provider’s security group is another’s ACL. Some allow more granular management. There may be limits on the number of security rules available. A provider might offer both allow and deny rules, or allow only. Take the time to learn the ins and outs of your provider’s capabilities. They all offer plenty of documentation and training, and in our experience most organizations limit themselves to no more than one to three infrastructure providers, keeping the problem manageable.
  • Application needs: Applications, especially those using the newer architectures we will mention in a moment, often have different needs than applications deployed on traditional infrastructure. For example application components in your private network segment may still need Internet access to connect to a cloud component – such as storage, a message bus, or a database. These needs directly affect architectural decisions – both security and otherwise.
  • New architectures: Cloud applications use different design patterns than apps on traditional infrastructure. For example, as previously mentioned, components are typically distributed across diverse network locations for resiliency, and tied tightly to cloud-based load balancers. Early cloud applications often emulated traditional architectures but modern cloud applications make extensive use of advanced cloud features, particularly Platform as a Service, which may be deeply integrated into a particular cloud provider. Cloud-based databases, message queues, notification systems, storage, containers, and application platforms are all now common due to cost, performance, and agility benefits. You often cannot even control the network security of these services, which are instead fully managed by the cloud provider. Continuous deployment, DevOps, and immutable servers are the norm rather than exceptions. On the upside, used properly these architectures and patterns are far more secure, cost effective, resilient, and agile than building everything yourself, but you do need to understand how they work.

Data Analytics Design Pattern Example

A common data analytics design pattern highlights these differences (see the last section for a detailed example). Instead of keeping a running analytics pool and sending it data via SFTP, you start by loading data into cloud storage directly using an (encrypted) API call. This, using a feature of the cloud, triggers the launch of a pool of analytics servers and passes the job on to a message queue in the cloud. The message queue distributes the jobs to the analytics servers, which use a cloud-based notification service to signal when they are done, and the queue automatically redistributes failed jobs. Once it’s all done the results are stored in a cloud-based NoSQL database and the source files are archived. It’s similar to ‘normal’ data analytics except everything is event-driven, using features and components of the cloud service. This model can handle as many concurrent jobs as you need, but you don’t have anything running or racking up charges until a job enters the system.
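The event-driven flow above can be sketched locally. This is a toy simulation, not tied to any particular provider: a plain Python queue stands in for the managed message queue, and threads stand in for the analytics servers that would be launched on demand. All names are illustrative.

```python
import queue
import threading

def on_file_uploaded(job_queue, object_key):
    """Simulates the storage event that kicks off processing."""
    job_queue.put(object_key)

def analytics_worker(job_queue, results):
    """Pulls jobs until the queue is drained, storing results."""
    while True:
        try:
            key = job_queue.get_nowait()
        except queue.Empty:
            return
        results[key] = f"analyzed:{key}"  # stand-in for real analysis
        job_queue.task_done()

job_queue = queue.Queue()
results = {}

# Three "uploads" trigger three jobs before any worker spins up.
for key in ("logs/day1.csv", "logs/day2.csv", "logs/day3.csv"):
    on_file_uploaded(job_queue, key)

workers = [threading.Thread(target=analytics_worker, args=(job_queue, results))
           for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(sorted(results))
```

The point is the shape, not the code: nothing runs until an event enters the system, and the queue absorbs however many concurrent jobs arrive.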

  • Elasticity and a high rate of change are standard in the cloud: Beyond auto scaling, cloud applications tend to alter the infrastructure around them to maximize the benefits of cloud computing. For example one of the best ways to update a cloud application is not to patch servers, but instead to create an entirely new installation of the app, based on a template, running in parallel; and then to switch traffic over from the current version. This breaks familiar security approaches, including those relying on IP addresses for server identification, vulnerability scanning, and logging. Server names and addresses are largely meaningless, and controls that aren’t adapted for cloud are liable to be useless.
  • Managing and monitoring security changes: You either need to learn how to manage cloud security using the provider’s console and APIs, or choose security tools that integrate directly. This may become especially complex if you need to normalize security between your data center and cloud provider when building a hybrid cloud. Additionally, few cloud providers offer good tools to track security changes over time, so you will need to track them yourself or use a third-party tool.

Design the Network Architecture

Unlike traditional networks, security is built into cloud networks by default. Go to any major cloud provider, spin up a virtual network, launch a server, and the odds are very high it is already well-defended – with most or all access blocked by default.

Because security and core networking are so intertwined, and every cloud application has its own virtual network (or networks), the first step toward security is to work with the application team and design it into the architecture.

Here are some specific guidelines and recommendations:

  • Accounts provide your first layer of segregation. With each cloud provider you establish multiple accounts, each for a different environment (e.g., dev, test, production, logging). This enables you to tailor cloud security controls and minimize administrator access. This isn’t a purely network security feature, but will affect network security because you can, for example, have tighter controls for environments closer to production data. The rule of thumb for accounts is to consider separate accounts for separate applications, and then separate accounts for a given application when you want to restrict how many people have administrator access. For example a dev account is more open with more administrators, while production is a different account with a much smaller group of admins. Within accounts, don’t forget about the physical architecture:
    • Regions/locations are often used for resiliency, but may also be incorporated into the architecture for data residency requirements, or to reduce network latency to customers. Unlike accounts, we don’t normally use locations for security, but you do need to build network security within each location.
    • Zones are the cornerstone of cloud application resiliency, especially when tied to auto scaling. You won’t use them as a security control, but again they affect security, as they often map directly to subnets. An auto scale group might keep multiple instances of a server in different zones, which are different subnets, so you cannot necessarily rely on subnets and addresses when designing your security.
  • Virtual Networks (Virtual Private Clouds) are your next layer of security segregation. You can (and will) create and dedicate separate virtual networks for each application (potentially in different accounts), each with its own set of network security controls. This compartmentalization offers tremendous security advantages, but seriously complicates security management. It forces you to rely much more heavily on automation, because manually replicating security controls across accounts and virtual networks within each account takes tremendous discipline and effort. In our experience the security benefits of compartmentalization outweigh the risks created by management complexity – especially because development and operations teams already tend to rely on automation to create, manage, and update environments and applications in the first place. There are a few additional non-security-specific aspects to keep in mind when you design the architecture:
    • Within a given virtual network, you can include public and private facing subnets, and connect them together. This is similar to DMZ topologies, except public-facing assets can still be fully restricted from the Internet, and private network assets are all by default walled off from each other. Even more interesting, you can spin up totally isolated private network segments that only connect to other application components through an internal cloud service such as a message queue, and prohibit all server-to-server traffic over the network.
    • There is no additional cost to spin up new virtual networks (or at least if your provider charges for this, it’s time to move on), and you can create another with a few clicks or API calls. Some providers even allow you to bridge across virtual networks, assuming their IP address ranges don’t overlap. Instead of trying to lump everything into one account and one virtual network, it makes far more sense to use multiple networks for different applications, and even within a given application architecture.
    • Within a virtual network you also have complete control over subnets. While they may play a role in your security design, especially as you map out public and private network segments, make sure you also design them to support zones for availability.
    • Flat networks aren’t flat in the cloud. Everything you deploy in the virtual network is surrounded by its own policy-based firewall which blocks all connections by default, so you don’t need to rely on subnets themselves as much for segregation between application components. Public vs. private subnets are one thing, but creating a bunch of smaller subnets to isolate application components quickly leads to diminishing returns.
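To make the “everything is walled off by default” point concrete, here is a sketch of how an instance-level security group evaluates a connection: allow rules only, default deny, attached to the instance rather than the subnet. The rule schema is illustrative, not any specific provider’s.

```python
from ipaddress import ip_address, ip_network

def is_allowed(rules, protocol, port, source_ip):
    """Return True only if some allow rule matches; everything else is denied."""
    for rule in rules:
        if (rule["protocol"] == protocol
                and rule["from_port"] <= port <= rule["to_port"]
                and ip_address(source_ip) in ip_network(rule["cidr"])):
            return True
    return False  # default deny: no deny rules exist, only the absence of allows

# Hypothetical policy for a public web server: HTTPS from anywhere,
# ssh only from an internal range.
web_server_rules = [
    {"protocol": "tcp", "from_port": 443, "to_port": 443, "cidr": "0.0.0.0/0"},
    {"protocol": "tcp", "from_port": 22,  "to_port": 22,  "cidr": "10.0.0.0/8"},
]

print(is_allowed(web_server_rules, "tcp", 443, "203.0.113.9"))  # public HTTPS
print(is_allowed(web_server_rules, "tcp", 22, "203.0.113.9"))   # ssh from the Internet
```

Because every instance carries a policy like this, two servers on the same subnet still cannot talk unless a rule explicitly allows it, which is why carving up subnets for isolation buys so little.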

Hybrid Clouds

You may need enterprise datacenter connections for hybrid clouds. These VPN or direct connections route traffic directly from your data center to the cloud, and vice-versa. You simply set your routing tables to send traffic to the appropriate destination, and SDN-based virtual networks allow you to set distinct subnet ranges to avoid address conflicts with existing assets.

Whenever possible, we actually recommend avoiding hybrid cloud deployments. It isn’t that there is anything wrong with them, but they make it much more difficult to support account and virtual network segregation. For example if you use separate accounts or virtual networks for your different dev/test/prod environments, you will tend to do so using templates to automatically build out your architecture, and they will perfectly mimic each other – down to individual IP addresses. But if you connect them directly to your data center you need to shift to non-overlapping address ranges to avoid conflicts, and they can’t be as automated or consistent. (This consistency is a cornerstone of continuous deployment and DevOps).

Additionally, hybrid clouds complicate security. We have actually seen them, not infrequently, reduce the overall security level of the cloud, because assets in the datacenter aren’t as segregated as on the cloud network, and cloud providers tend to be more secure than most organizations can achieve in their own infrastructure. Instead of cracking your cloud provider, someone only needs to crack a system on your corporate network, and use that to directly bridge to the cloud.

So when should you consider a hybrid deployment? Any time your application architecture requires direct address-based access to an internal asset that isn’t Internet-accessible. Alternatively, sometimes you need a cloud asset on a static, non-Internet-routable address – such as an email server or other service that isn’t designed to work with auto scaling – which internal systems need to connect to. (We strongly recommend you minimize these – they don’t benefit from cloud computing, so there usually isn’t a good reason to deploy them there). And yes, this means hybrid deployments are extremely common unless you are building everything from scratch. We try to minimize their use – but that doesn’t mean they don’t play a very important role.

For security there are a few things to keep in mind when building a hybrid deployment:

  • VPN traffic will traverse the Internet. VPNs are very secure, but you do need to keep them up-to-date with the latest patches and make sure you use strong, up-to-date certificates.
  • Direct connections may reduce latency, but decide whether you trust your network provider, and whether you need to encrypt traffic.
  • Don’t let your infrastructure reduce the security of your cloud. If you mandate multi-factor authentication in the cloud but not on your LAN, that’s a loophole. Is your entire LAN connected to the cloud? Could someone compromise a single workstation and then start attacking your cloud through your direct connection? Do you have security group or other firewall rules to keep your cloud assets as segregated from datacenter assets as they are from each other? Remember, cloud providers tend to be exceptionally good at security, and everything you deploy in the cloud is isolated by default. Don’t allow the hybrid connection to become the weak link and reduce this compartmentalization.
  • You may still be able to use multiple accounts and virtual networks for segregation, by routing different datacenter traffic to different accounts and/or virtual networks. But your on-premise VPN hardware or your cloud provider might not support this, so check before building it into your architecture.
  • Cloud and on-premise network security controls may look similar on the surface, but they have deep implementation differences. If you want unified management you need to understand these differences, and be able to harmonize based on security goals – not by trying to force a standard implementation across very different technologies.
  • Cloud computing offers many more ways to integrate into your existing operations than you might think. For example instead of using SFTP and setting up public servers to receive data dumps, consider installing your cloud provider’s command-line tools and directly transferring data to their object storage service (fully locked down, of course). Now you don’t need to maintain the burden of either an Internet-accessible FTP server or a hybrid cloud connection.

It’s hard to fully convey the breadth and depth of options for building security into your architectures, even without additional security tools. This isn’t mere theory – we have a lot of real-world experience with different architectures creating much higher security levels than can be achieved on traditional infrastructure at any reasonable cost.

Design the Network Security Architecture

At this point you should have a well-segregated environment where effectively every application, and every environment (e.g., dev/test) for every application, is running on its own virtual network. These assets are mostly either in auto scale groups which spread them around zones and subnets for resiliency; or connect to secure cloud services such as databases, message queues, and storage. These architectures alone, in our experience, are materially more secure than your typical starting point on traditional infrastructure.

Now it’s time to layer on the additional security controls we covered earlier under Cloud Networking 101. Instead of repeating the pros and cons, here are some direct recommendations about when to use each option:

  • Security groups: These should be used by default, and set to deny by default. Only open up the absolute minimum access needed. Cloud services allow you to right-size resources far more easily than on your own hardware, so we find most organizations tend to deploy far fewer services on each instance, which directly translates to opening fewer network ports per instance. A large number of cloud deployments we have evaluated use only a good base architecture and security groups for network security.
  • ACLs: These mostly make sense in hybrid deployments, where you need to closely match or restrict communications between the data center and the cloud. Security groups are usually a better choice, and we only recommend falling back to ACLs or subnet-level firewalling when you cannot achieve your security objectives otherwise.
  • Virtual Appliances: Whenever you need capabilities beyond basic firewalls, this is where you are likely to end up. But we find host agents often make more sense when they offer the same capabilities, because virtual appliances become costly bottlenecks which restrict your cloud architecture options. Don’t deploy one merely because you have a checkbox requirement for a particular tool – ensure it makes sense first. Over time we do see them becoming more “cloud friendly”, but when we rip into requirements on projects, we often find there are better, more cloud-appropriate ways to meet the same security objectives.
  • Host security agents are often a better option than a virtual appliance because they don’t restrict virtual networking architectural options. But you need to ensure you have a way to deploy them consistently. Also, make sure you pick cloud-specific tools designed to work with features such as auto scaling. These tools are particularly useful to cover network monitoring gaps, meet IDS/IPS requirements, and satisfy all your normal host security needs.

Of course you will need some way of managing these controls, even if you stick to only capabilities and features offered by your cloud provider.

Security groups and ACLs are managed via API or your cloud provider’s console. They use the same management plane as the rest of the cloud, but this won’t necessarily integrate out of the box with the way you manage things internally. You can’t track these across multiple accounts and virtual networks unless you use a purpose-built tool or write your own code. We will talk about specific techniques for management in the next section, but make sure you plan out how to manage these controls when you design your architecture.
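Since most consoles won’t show security groups across accounts, teams often write a small script to pull each account’s listing and merge them into one inventory. A minimal sketch, assuming responses loosely shaped like AWS’s DescribeSecurityGroups output – field names are illustrative, and real code would call each account’s API with its own credentials:

```python
def build_inventory(responses_by_account):
    """Flatten per-account security group listings into one audit table."""
    inventory = []
    for account, response in responses_by_account.items():
        for group in response["SecurityGroups"]:
            inventory.append({
                "account": account,
                "group_id": group["GroupId"],
                "vpc": group.get("VpcId", "-"),
                "rule_count": len(group.get("IpPermissions", [])),
            })
    return inventory

# Illustrative data standing in for API responses from two accounts.
sample = {
    "dev-account": {"SecurityGroups": [
        {"GroupId": "sg-111", "VpcId": "vpc-a", "IpPermissions": [{}, {}]},
    ]},
    "prod-account": {"SecurityGroups": [
        {"GroupId": "sg-222", "VpcId": "vpc-b", "IpPermissions": [{}]},
    ]},
}

for row in build_inventory(sample):
    print(row["account"], row["group_id"], row["rule_count"])
```

Even something this simple gives you a single view across accounts and virtual networks, which is the gap the provider consoles leave open.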

Platform as a Service introduces its own set of security differences. For example in some situations you still define security groups and/or ACLs for the platform (as with a cloud load balancer); but in other cases access to the platform is only via API, and may require an outbound public Internet connection, even from a private network segment. PaaS also tends to rely more on DNS rather than IP addresses, to help the cloud provider maintain flexibility. We can’t give you any hard and fast rules here. Understand what’s required to connect to the platform, and then ensure your architecture allows those connections. When you can manage security treat it like any other cluster of servers, and stick with the minimum privileges possible.

We cannot cover anything near every option for every cloud in a relatively short (believe it or not) paper like this, but for the most part once you understand these fundamentals and the core differences of working in software-defined environments, it gets much easier to adapt to new tools and technologies.

Especially once you realize that you start by integrating security into the architecture, instead of trying to layer it on after the fact.

Manage Cloud (and Hybrid) Network Security Operations

Building in security is one thing, but keeping it up to date over time is an entirely different – and harder – problem. Not only do applications and deployments change over time, but cloud providers have this pesky habit of “innovating” for “competitive advantage”. Someday things might slow down, but it definitely won’t be within the lifespan of this particular research.

Here are some suggestions on managing cloud network security for the long haul.

Organization and Staffing

It’s a good idea to make sure you have cloud experts on your network security team, people trained for the platforms you support. They don’t need to be new people, and depending on your scale this doesn’t need to be their full-time focus, but you definitely need the skills. We suggest you build your team with both security architects (to help in design) and operators (to implement and fix).

Cloud projects occur outside the constraints of your data center, including normal operations, which means you might need to make some organizational changes so security is engaged in projects. A security representative should be assigned and integrated into each cloud project. Think about how things normally work – someone starts a new project and security gets called when they need access or firewall rule changes. With cloud computing network security isn’t blocking anything (unless they need access to an on-premise resource) and entire projects can happen without security or ops ever being directly involved. You need to adapt policies and organizational structure to minimize this risk. For example, work with procurement to require a security evaluation and consultation before any new cloud account is opened.

Because so much of cloud network security relies on architecture, it isn’t just important to have a security architect on the team – it is essential they be engaged in projects early. It goes without saying that this needs to be a collaborative role. Don’t merely write up some pre-approved architectures, and then try to force everyone to work within those constraints. You’ll lose that fight before you even know it started.

Discovery

We hinted at this in the section above: one of the first challenges is to find all the cloud projects, keep finding new ones as they pop up over time, and enumerate the existing cloud network security controls for each. Here are a couple ways we have seen clients successfully keep tabs on cloud computing:

  • If your critical assets (such as the customer database) are well locked down, you can use this to control cloud projects. If they want access to the data/application/whatever, they need to meet your security requirements.
  • Procurement and Accounting are your next best options. At some point someone needs to pay the (cloud) piper, and you can work with Accounting to identify payments to cloud providers and tie them back to the teams involved. Just make sure you differentiate between those credit card charges to Amazon for office supplies, and the one to replicate your entire datacenter up into AWS.
  • Hybrid connections to your data center are pretty easy to track using established processes. Unless you let random employees plug in VPN routers.
  • Lastly, we suppose you could try setting a policy that says “don’t cloud without telling us”. I mean, if you trust your people and all. It could work. Maybe. It’s probably good to have one anyway, if only to keep the auditors happy.

The next discovery challenge is to figure out how the cloud networks are architected and secured:

  • First, always start with the project team. Sit down with them and perform an architecture and implementation review.
  • It’s a young market, but there are some assessment tools that can help, especially for analyzing security groups and network security, and comparing them against best practices.
  • You can use your cloud provider’s console in many cases, but most of them don’t provide a good overall network view. If you don’t have a tool to help, you can use scripts and API calls to pull down the raw configuration and manually analyze it.
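If you go the script route, even a few lines of analysis go a long way. Here is a sketch of one common best-practices check – flagging rules open to the entire Internet on unexpected ports. The configuration shape is assumed for illustration; adapt it to whatever your provider’s API actually returns.

```python
RISKY_CIDR = "0.0.0.0/0"
PUBLIC_OK = {80, 443}  # ports you deliberately expose to the world

def audit_rules(groups):
    """Flag any rule open to the Internet on a non-approved port."""
    findings = []
    for group in groups:
        for rule in group["rules"]:
            if rule["cidr"] == RISKY_CIDR and rule["port"] not in PUBLIC_OK:
                findings.append(f"{group['name']}: port {rule['port']} open to the world")
    return findings

# Illustrative pulled configuration: web is fine, admin is not.
groups = [
    {"name": "web", "rules": [{"port": 443, "cidr": "0.0.0.0/0"}]},
    {"name": "admin", "rules": [{"port": 22, "cidr": "0.0.0.0/0"}]},
]

for finding in audit_rules(groups):
    print(finding)
```

Run checks like this on a schedule and you have the beginning of the change tracking most providers don’t give you out of the box.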

Integrating with Development

In the broadest sense, there are two kinds of cloud deployments: applications you build and run in the cloud (or hybrid), and core infrastructure (like file and mail servers) you transition to the cloud. Developers play the central role in the former, but they are also often involved in the latter.

The cloud is essentially software defined everything. We build and manage all kinds of cloud deployments using code. Even if you start by merely transitioning a few servers into virtual machines at a cloud provider, you will always end up defining and managing much of your environment in code.

This is an incredible opportunity for security. Instead of sitting outside the organization and trying to protect things by building external walls, we gain much greater ability to manage security using the exact same tools development and operations use to define, build, and run the infrastructure and services. Here are a few key ways to integrate with development and ensure security is integrated:

  • Create a handbook of design patterns for the cloud providers you support, including security controls and general requirements. Keep adding new patterns as you work on new projects. Then make this library available to business units and development teams so they know which architectures already have general approval from security.
  • A cloud security architect is essential, and this person or team should engage early with development teams to help build security into every initial design. We hate to have to say it, but their role really needs to be collaborative. Lay down the law with a bunch of requirements that interfere with the project’s execution, and you definitely won’t be invited back to the table.
  • A lot of security can be automated and templated by working with development. For example monitoring and automation code can be deployed on projects without the project team having to develop them from scratch. Even integrating third party tools can often be managed programmatically.

Policy Enforcement

Change is constant in cloud computing. The foundational concept is dynamic adjustment of capacity (and configuration) to meet changing demands. When we say “enforce policies” we mean that, for a given project, once you design the security you are able to keep it consistent. Just because clouds change all the time doesn’t mean it’s okay to let a developer drop all the firewalls by mistake.

The key policy enforcement difference between traditional networking and the cloud is that in traditional infrastructure, security has exclusive control over firewalls and other security tools. In the cloud, anyone with sufficient authorization in the cloud platform (management plane) can make those changes. Even applications can potentially change the infrastructure around them. That’s why you need to rely on automation to detect and manage change.

You lose the single point of control. Heck, your own developers can create entire networks from their desktops. Remember when someone occasionally plugged in their own wireless router or file server? It’s a bit like that, but more like building their own datacenter over lunch. Here are some techniques for managing these changes:

  • Use access controls to limit who can change what on a given cloud project. It is typical to allow developers a lot of freedom in the dev environment, but lock down any network security changes in production, using your cloud provider’s IAM features.
  • To the greatest extent possible, try to use cloud provider specific templates to define your infrastructure. These files contain a programmatic description of your environment, including complete network and network security configurations. You load them into the cloud platform and it builds the environment for you. This is a very common way to deploy cloud applications, and essential in organizations using DevOps to enforce consistency.
  • When this isn’t possible you will need to use a tool or manually pull the network architecture and configuration (including security) and document them. This is your baseline.
  • Then you need to automate change monitoring using a tool or the features of your cloud and/or network security provider:
    • Cloud platforms are slowly adding monitoring and alerting on security changes, but these capabilities are still new and often manual. This is where cloud-specific training and staffing can really pay off, and there are also third-party tools to monitor these changes for you.
    • When you use virtual appliances or host security, you don’t rely on your cloud provider, so you may be able to hook change management and policy enforcement into your existing approaches. These are security-specific tools, so unlike cloud provider features the security team will often have exclusive access and be responsible for making changes themselves.
  • Did we mention automation? We will talk about it more in a minute, because it’s the only way to maintain cloud security.
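The baseline-plus-monitoring approach in the list above reduces to a simple diff. A minimal sketch, with illustrative data standing in for rules pulled from your template (baseline) and from the provider’s API (current):

```python
def detect_drift(baseline, current):
    """Report rules that were added to, or removed from, the deployed config."""
    baseline_set = {tuple(sorted(r.items())) for r in baseline}
    current_set = {tuple(sorted(r.items())) for r in current}
    return {
        "added": [dict(r) for r in current_set - baseline_set],
        "removed": [dict(r) for r in baseline_set - current_set],
    }

baseline = [
    {"port": 443, "cidr": "0.0.0.0/0"},
    {"port": 22, "cidr": "10.0.0.0/8"},
]
current = [
    {"port": 443, "cidr": "0.0.0.0/0"},
    {"port": 22, "cidr": "0.0.0.0/0"},  # someone opened ssh to the world
]

drift = detect_drift(baseline, current)
print("added:", drift["added"])
print("removed:", drift["removed"])
```

In practice the “current” side comes from the provider’s API on a schedule or event trigger, and drift either raises an alert or is automatically reverted to the baseline.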

Normalizing On-Premise and Cloud Security

Organizations have a lot of security requirements for very good reasons, and need to ensure those controls are consistently applied. We all have developed a tremendous amount of network security experience over decades running our own networks, which is still relevant when moving into the cloud. The challenge is to carry over the requirements and experience, without assuming everything is the same in the cloud, or letting old patterns prevent us from taking full advantage of cloud computing.

  • Start by translating whatever rules sets you have on-premise into a comparable version for the cloud. This takes a few steps:
    • Figure out which rules should still apply, and what new rules you need. For example a policy to deny all ssh traffic from the Internet won’t work if that’s how you manage public cloud servers. Instead a policy that limits ssh access to your corporate CIDR block makes more sense. Another example is the common restriction that back-end servers shouldn’t have any Internet access at all, which may need updating if they need to connect to PaaS components of their own architecture.
    • Then adjust your policies into enforceable rulesets. For example security groups and ACLs work differently, so how you enforce them changes. Instead of setting subnet-based policies with a ton of rules, tie security group policies to instances by function. We once encountered a client who tried to recreate very complex firewall rulesets into security groups, exceeding their provider’s rule count limit. Instead we recommended a set of policies for different categories of instances.
    • Watch out for policies like “deny all traffic from this IP range”. Those can be very difficult to enforce using cloud-native tools, and if you really have those requirements you will likely need a network security virtual appliance or host security agent. In many projects we find you can resolve the same level of risk with smarter architectural decisions (e.g., using immutable servers, which we will describe in a moment).
    • Don’t just drop in a virtual appliance because you are used to it and know how to build its rules. Always start with what your cloud provider offers, then layer on additional tools as needed.
    • If you migrate existing applications to the cloud the process is a bit more complex. You need to evaluate existing security controls, discover and analyze application dependencies and network requirements, and then translate them for a cloud deployment, taking into account all the differences we have been discussing.
  • Once you translate the rules, normalize operations. This means having a consistent process to deploy, manage, and monitor your network security over time. Fully covering this is beyond the scope of this research, as it depends on how you manage network security operations today. Just remember that you are trying to blend what you do now with the cloud project’s requirements, not simply enforce your existing processes under an entirely new operating model.
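The “policies by instance function” recommendation above can be sketched as one small ruleset per category of instance, plus a sanity check against a provider rule-count limit. The limit, categories, and CIDR blocks here are all hypothetical.

```python
RULE_LIMIT_PER_GROUP = 50  # hypothetical provider limit

# One compact policy per instance function, instead of one giant
# subnet-based ruleset translated verbatim from the old firewall.
POLICIES = {
    "web":  [{"port": 443, "cidr": "0.0.0.0/0"}],
    "app":  [{"port": 8080, "cidr": "10.0.1.0/24"}],
    "db":   [{"port": 5432, "cidr": "10.0.2.0/24"}],
    "mgmt": [{"port": 22, "cidr": "192.0.2.0/24"}],  # corporate CIDR block
}

def validate_policies(policies, limit):
    """Return the names of any policies exceeding the provider's rule limit."""
    return [name for name, rules in policies.items() if len(rules) > limit]

oversized = validate_policies(POLICIES, RULE_LIMIT_PER_GROUP)
print("oversized policies:", oversized)
```

A check like this would have caught the client situation described above – a translated ruleset blowing past the provider’s rule count limit – before deployment rather than after.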

We hate to say it, but we will – this is a process of transition. We find customers who start on a project-by-project basis are more successful, because they can learn as they go, and build up a repository of knowledge and experience.

Automation and Immutable Network Security

Cloud security automation isn’t merely fodder for another paper – it’s an entirely new body of knowledge we are all only just beginning to build.

Any organization that moves to the cloud in any significant way learns quickly that automation is the only way to survive. How else can you manage multiple copies of a single project in different environments – never mind dozens or hundreds of different projects, each running in their own sets of cloud accounts across multiple providers?

Then, keep all those projects compliant with regulatory requirements and your internal security policies.

Yeah, it’s like that.

Fortunately this isn’t an insoluble problem. Every day we see more examples of companies successfully using the cloud at scale, and staying secure and compliant. Today they largely build their own libraries of tools and scripts to continually monitor and enforce changes. We also see some emerging tools to help with this management, and expect to see many more in the near future.

A core developing concept tied to automation is immutable security, and we have used it ourselves.

One of the core problems in security is managing change. We design something, build in security, deploy it, validate that security, and lock everything down. This inevitably drifts as it’s patched, updated, improved, and otherwise modified. Immutable security leverages automation, DevOps techniques, and inherent cloud characteristics to break this cycle. To be honest, it’s really just DevOps applied to security, and all the principles are in wide use already.

For example an immutable server is one that is never logged into or changed in production. Think back to auto scaling: we deploy servers based on standard images. Changing one of those servers after deployment doesn’t make sense, because those changes wouldn’t be in the image, so new versions launched by auto scaling wouldn’t include them. Instead DevOps creates a new image with all the changes, then alters the auto scale group rules to deploy new instances based on the new image, and then optionally prunes off the older versions.

In other words no more patching, and no more logging into servers. You take a new known-good state, and completely replace what is in production.
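That replacement cycle can be shown with a toy simulation, using illustrative image and instance names: instead of patching running servers, the auto scale group is pointed at a new image, replacements are launched, and the old instances are retired.

```python
def roll_out(group, new_image):
    """Replace every instance with one built from new_image; return the retired set."""
    group["image"] = new_image
    old = group["instances"]
    group["instances"] = [f"{new_image}-{i}" for i in range(len(old))]
    return old  # handed off for termination and log archival

# An auto scale group running two instances of the current image.
group = {"image": "ami-v1", "instances": ["ami-v1-0", "ami-v1-1"]}

retired = roll_out(group, "ami-v2")
print("running:", group["instances"])
print("retired:", retired)
```

The real work happens in the image build pipeline; the production change itself is just this swap, which is why nothing in production ever needs to be logged into or patched in place.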

Think about how this applies to network security. We can build templates to automatically deploy entire environments at our cloud providers. We can write network security policies, then override any changes automatically, even across multiple cloud accounts. This pushes the security effort earlier into design and development, and enables much more consistent enforcement in operations. And we use the exact same toolchain as Development and Operations to deploy our security controls, rather than trying to build our own on the side and overlay enforcement afterwards.

This might seem like an aside, but these automation principles are the cornerstone of real-world cloud security, especially at scale. This is a capability we never have in traditional infrastructure, where we cannot simply stamp out new environments automatically, and need to hand-configure everything.


Friday, September 25, 2015

Building Security Into DevOps: The Emergence of DevOps

By Adrian Lane

In this post we will outline some of the key characteristics of DevOps. For those of you new to the concept, this is the most valuable post in this series. We believe that DevOps is one of the most disruptive trends ever to hit application development, and it will drive organizational changes for the next decade. But it’s equally disruptive for application security, and in a good way: it enables security testing, validation and monitoring to be interwoven with application development and deployment. To illustrate why we believe this is disruptive – both for application development and for application security – we will first delve into what DevOps is and how it changes the entire development approach.

What is it?

We are not going to dive too deep into the geeky theoretical aspects of DevOps, as they are outside the focus of this research paper. But as you begin to practice DevOps you will need to delve into its foundational elements to guide your efforts, so we will reference several here. DevOps is born out of lean manufacturing, Kaizen, and Deming’s principles of quality control. The key idea is continuous elimination of waste, which results in improved efficiency, quality and cost savings. There are numerous approaches to waste reduction, but the ones key to software development are reducing work in progress, finding errors quickly to reduce rework costs, scheduling techniques, and instrumenting the process so progress can be measured. These ideas have been proven in practice for decades, but typically applied to manufacturing physical goods. DevOps applies these practices to software delivery, and advances in automation and orchestration make them practical.

So theory is great, but how does that help you understand DevOps in practice? In our introductory post we said:

DevOps is an operational framework that promotes software consistency and standardization through automation. Its focus is on using automation to do a lot of the heavy lifting of building, testing, and deployment. Scripts build organizational memory into automated processes to reduce human error and force consistency.

In essence, development, quality assurance and IT operations teams automate as much of their daily work as they can, investing time up front to make things easier and more consistent over the long haul. The focus is not just the applications, or even an application stack, but the entire supporting ecosystem. One commenter on the previous post called it ‘infrastructure as code’, a handy way to think about the configuration, creation and management of the underlying servers and services applications rely on. From code check-in, through validation, to deployment and run-time monitoring, anything used to get applications into the hands of users is part of the assembly line. Scripts and programs automate builds, functional testing, integration testing, security testing and even deployment, and that automation is a large part of the value: each subsequent release is a little faster, and a little more predictable, than the last. But automation is only half the story, and in terms of disruption, not the most important half.
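The check-in-to-deployment assembly described above can be sketched as a fail-fast pipeline: each stage is automated, and the first failure stops the release and reports exactly where it broke. The stage names below are invented for illustration; real pipelines would invoke build tools, test suites, and deployment scripts.

```python
# Illustrative sketch of an automated delivery pipeline. Each stage is a
# callable returning True (pass) or False (fail); a failure halts the
# release immediately and names the broken stage so the team can react.

def run_pipeline(stages):
    """Run build/test/deploy stages in order; stop at the first failure."""
    for name, stage in stages:
        if not stage():
            return ("failed", name)   # fail fast, alert on this stage
    return ("released", None)

stages = [
    ("build",          lambda: True),
    ("unit tests",     lambda: True),
    ("security tests", lambda: False),  # a failing scan gates the release
    ("deploy",         lambda: True),
]

assert run_pipeline(stages) == ("failed", "security tests")
```

Because every stage runs on every check-in, feedback arrives in minutes rather than at the end of a release cycle, which is where the "fail faster" benefit discussed later comes from.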

The Organizational Impact

DevOps represents a cultural change as well, and it’s the change in how the organization behaves that has the most profound impact. Today development teams focus on code development, quality assurance on testing, and operations on keeping things running. In practice these three activities are not aligned, and at many firms they become competitive to the point of being detrimental. Under DevOps, development, QA and operations work together to deliver stable applications; efficient teamwork is the job. This subtle change in focus has a profound effect on the team dynamic. It removes much of the friction between groups, as they no longer work on their pieces in isolation. It also minimizes many of the terrible behaviors that cause teams grief: incentives to push code before it’s ready, fire drills to fix code and deployment issues at release time, overburdening key people, ad hoc changes to production code and systems, and blaming ‘other’ groups for what amount to systemic failures. Yes, automation plays a key role in tackling repetitive tasks, both reducing human error and allowing people to focus on tougher problems. But DevOps’ effect is almost as if someone opened a pressure relief valve: teams, working together, identify and address the things that complicate the job of producing quality software. By performing simpler tasks, and doing them more often, releasing code becomes reflexive. Building, buying and integrating the tools needed for better quality and visibility, and to just make things easier, helps every future release. Success begets success.

Some of you reading this will say “That sounds like what Agile development promised”, and you would be right. But Agile techniques focused on the development team, and suffer in organizations where project management, testing and IT are not agile. In our experience this is why we see companies fail in their transition to Agile. DevOps focuses on getting your house in order first, targeting the internal roadblocks that introduce errors and slow the process down. Agile and DevOps are actually complementary, with Agile techniques like scrum meetings and sprints fitting perfectly within a DevOps program. And DevOps ideas on scheduling and use of Kanban boards have morphed into Agile Scrumban tools for task scheduling. These approaches are not mutually exclusive; they fit very well together!

Problems it solves

DevOps solves several problems, many of which I alluded to above. Here I will discuss the specifics in greater detail; the bullet items below have some intentional overlap. When you are knee deep in organizational dysfunction it is often hard to pinpoint the causes. In practice multiple issues both make things more complicated and mask the true nature of the problem, so I want to discuss the problems DevOps solves from multiple viewpoints.

  • Reduced errors: Automation reduces the errors that are common when performing basic – and repetitive – tasks. More to the point, automation is intended to stop ad hoc changes to systems; these commonly go unrecorded, so the same problem is forgotten over time and needs to be fixed repeatedly. By including configuration and code updates in the automation process, settings and distributions are applied consistently – every time. If there is an incorrect setting, the problem is addressed in the automation scripts and then pushed into production, not by altering systems ad hoc.
  • Speed and efficiency: Here at Securosis we talk a lot about ‘reacting faster and better’ and ‘doing more with less’. DevOps, like Agile, is geared toward doing less, doing it better, and doing it faster. Releases occur on a more regular basis, with a smaller set of code changes. Less work means better focus and more clarity of purpose with each release. Again, automation helps people get their jobs done with less hands-on work, but it also speeds things up: software builds can occur at programmatic speeds. If orchestration scripts can spin up build or test environments on demand, there is no waiting around for IT to provision systems; it’s part of the automated process. If an automated build fails, scripts can pull the new code and alert the development team to the issue. If automated functional or regression tests fail, the information is in QA’s or developers’ hands before they finish lunch. Essentially you fail faster, and the subsequent turnaround to identify and address issues is quicker as well.
  • Bottlenecks: There are several common bottlenecks in software development: developers waiting for specifications, overtasked key individuals, provisioning IT systems, testing, and even process itself (synchronous approaches like waterfall) can cause delays. Between the way DevOps tasks are scheduled, the reduction in work in progress, and the way expert knowledge is embedded in automation, once DevOps establishes itself the major bottlenecks common to most development teams are alleviated.
  • Cooperation and Communication: If you’ve ever managed software releases, you’ve witnessed the ping-pong match between development and QA. Code and insults fly back and forth between the two groups – at least when they are not complaining about how long IT is taking to patch things and stand up new servers for testing and deployment. The impact of having operations and development or QA work shoulder to shoulder is hard to articulate, but when the teams focus together on a smaller set of problems, friction around priorities and communication starts to evaporate. You may consider this a ‘fuzzy’ benefit until you’ve seen it firsthand; then you realize how many problems are addressed through clear communication and joint creative effort.
  • Technical Debt: Most firms consider the job of development to be producing new features for customers. Things developers want – or need – to produce more stable code are not features. Every software development project I’ve ever participated in ended with a long list of things we needed to do to improve the work environment (the ‘To Do’ list), separate and distinct from new features: new tools, integration, automation, updating core libraries, addressing code vulnerabilities, even bug fixes. Project managers ignored it because it was not their priority, and developers fixed issues at their own peril. This list is the essence of technical debt, and it piles up fast. DevOps reverses the priorities, targeting technical debt – or anything else that slows down work or reduces quality – before adding new capabilities. This ‘fix it first’ approach produces higher-quality, more reliable software.
  • Metrics and Measurement: Are you better or worse than you were last week? How do you know? The answer is metrics. DevOps is not just about automation; it’s also about continuous, iterative improvement. Collecting metrics is critical to knowing where to focus your attention. Captured data – from platforms and applications – forms the basis for measuring everything from tangible things like latency and resource utilization to more abstract concepts like code quality and test coverage. Metrics are key to knowing what is working and what could use improvement.
  • Security: Security testing, just like functional testing, regression testing, load testing and just about any other form of validation, can be embedded into the process. Security becomes not just the domain of experts with specialized knowledge, but part and parcel of the development and delivery process. Security controls can flag new features or gate releases within the same set of controls you use to ensure custom code, application stacks and server configurations are to specification. Security goes from being ‘Dr. No’ to just another set of tests that measure code quality.
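The metrics idea in the list above is simple enough to sketch in a few lines: capture one measurement per release and ask whether the trend is improving. The numbers and the metric name below are made up purely for illustration.

```python
# Small sketch of continuous measurement: record a metric per release
# (defect count, build time, test coverage gaps...) and check whether
# the latest release beats the historical average. Data is invented.

def improving(series):
    """True if the most recent value is better (lower) than the
    average of all prior releases."""
    *history, latest = series
    return latest < sum(history) / len(history)

defects_per_release = [14, 11, 12, 7]
assert improving(defects_per_release)      # the trend is heading down
assert not improving([5, 5, 5, 10])        # this one got worse
```

Trivial as it looks, this answers the "are you better than last week?" question with data instead of gut feel, which is the whole point of instrumenting the process.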

And that’s a good place to end this post, as the remainder of this series will focus on blending security with DevOps. Specifically our next discussion will be on the role security should play within a DevOps environment.

In the next post I will dig into the role of security in DevOps, but I hope to get a lot of comments before I publish it next week. I worked hard to capture the essence of DevOps from research calls and personal experience. Some of the advantages I mention are not all that clear unless you work in cloud and virtual environments, where the degree of automation changes what’s possible. That said, I know some of the ways I have phrased DevOps’ advantages will rub people the wrong way, so please comment where you disagree or think things are mischaracterized.

—Adrian Lane

Thursday, September 24, 2015

Incite 9/23/2015: Friday Night Lights

By Mike Rothman

I didn’t get the whole idea of high school football. When I was in high school, I went to a grand total of zero point zero (0.0) games. It would have interfered with the Strat-o-Matic and D&D parties I did with my friends on Friday listening to Rush. Yeah, I’m not kidding about that.

A few years ago one of the local high school football teams went to the state championship. I went to a few games with my buddy, who was a fan, even though his kids didn’t go to that school. I thought it was kind of weird, but it was a deep playoff run so I tagged along. It was fun going down to the GA Dome to see the state championship. But it was still weird without a kid in the school.

Friday Night Lights

Then XX1 entered high school this year. And the twins started middle school and XX2 is a cheerleader for the 6th grade football team and the Boy socializes with a lot of the players. Evidently the LAX team and the football team can get along. Then they asked if I would take them to the opener at another local school one Friday night a few weeks ago. We didn’t have plans that night, so I was game. It was a crazy environment. I waited for 20 minutes to get a ticket and squeezed into the visitor’s bleachers.

The kids were gone with their friends within a minute of entering the stadium. Evidently parents of tweens and high schoolers exist strictly to provide transportation. There will be no hanging out. Thankfully, due to the magic of smartphones, I knew where they were and could communicate when it was time to go.

The game was great. Our team pulled it out with a TD pass in the last minute. It would have been even better if we were there to see it. Turns out we had already left because I wanted to beat traffic. Bad move. The next week we went to the home opener and I didn’t make that mistake again. Our team pulled out the win in the last minute again and due to some savvy parking, I was able to exit the parking lot without much fuss.

It turns out it’s a social scene. I saw some buddies from my neighborhood and got to check in with them, since I don’t really hang out in the neighborhood much anymore. The kids socialized the entire game. And I finally got it. Sure it’s football (and that’s great), but it’s the community experience. Rooting for the high school team. It’s fun.

Do I want to spend every Friday night at a high school game? Uh no. But a couple of times a year it’s fun. And helps pass the time until NFL Sundays. But we’ll get to that in another Incite.


Photo credit: “Punt” originally uploaded by Gerry Dincher

Thanks to everyone who contributed to my Team in Training run to support the battle against blood cancers. We’ve raised almost $6000 so far, which is incredible. I am overwhelmed with gratitude. You can read my story in a recent Incite, and then hopefully contribute (tax-deductible) whatever you can afford. Thank you.

The fine folks at the RSA Conference posted the talk Jennifer Minella and I did on mindfulness at the 2014 conference. You can check it out on YouTube. Take an hour and check it out. Your emails, alerts and Twitter timeline will be there when you get back.

Securosis Firestarter

Have you checked out our new video podcast? Rich, Adrian, and Mike get into a Google Hangout and… hang out. We talk a bit about security as well. We try to keep these to 15 minutes or less, and usually fail.

Heavy Research

We are back at work on a variety of blog series, so here is a list of the research currently underway. Remember you can get our Heavy Feed via RSS, with our content in all its unabridged glory. And you can get all our research papers too.

Pragmatic Security for Cloud and Hybrid Networks

Building Security into DevOps

Building a Threat Intelligence Program

Network Security Gateway Evolution

Recently Published Papers

Incite 4 U

  1. Monty Python and the Security Grail: Reading Todd Bell’s CSO contribution “How to be a successful CISO without a ‘real’ cybersecurity budget” was enlightening. And by enlightening, I mean WTF? This quote made me shudder: “Over the years, I have learned a very important lesson about cybersecurity; most cybersecurity problems can be solved with architecture changes.” Really? Then he maps out said architecture changes, which involve segmenting every valuable server and using jump boxes for physical separation. And he suggests application-layer encryption to protect data at rest. The theory behind the architecture works, but very few organizations can actually implement it. I guess it could be done for very specific projects, but across the entire enterprise? Good luck with that. It’s kind of like searching for the Holy Grail. It’s only a flesh wound, I’m sure. There is some value in here, though. I do agree that fighting the malware game doesn’t make sense and that assuming devices are compromised is a good thing. But without a budget, the CISO is pissing into the wind. If the senior team isn’t willing to invest, the CISO can’t be successful. Period. – MR

  2. Everyone knows where you are: A peer review of metadata? Reporter Will Ockenden released his personal metadata into the wild and asked the general public to analyze his personal habits. It’s a fun read, showing the basics of what can be gleaned from cell phone data alone. But it gets far more interesting when you do what every marketing firm and government does: enrich it with additional data sources, like web sites and credit card purchases. Then you build a profile of the user. Marketing organizations look at what someone might be interested in buying, based on trends from similar user profiles. Governments look for behavior that denotes risk, create a risk score based on your behavior – or outliers in it – and match it against the profiles of your contacts. It’s the same thing we’ve been doing with security products for the last decade (you know, that security analytics thing), but turned on the general populace. As the reviewers of Ockenden’s data found, some of the findings are shockingly accurate. Most people, like Ockenden, get a little creeped out knowing that people are pointing something akin to invisible cameras at their lives. Once again, McNealy was right all those years ago: privacy is dead, get over it. – AL

  3. Own it. Learn. Move on.: I love Etsy’s approach of confessing mistakes to the entire company so everyone can learn. Without the stigma of screwing up, employees can try things and innovate. A culture of blamelessness is really cool. In security, sharing has always been frowned upon; practitioners think adversaries will learn how to break into their environments. It turns out the attackers are already in. Threat intelligence is providing a value-add for sharing information, and that’s a start. Increasingly detailed breach notifications give everyone a chance to learn. And that’s what we need as an industry: the ability to learn from each other and improve, without having to learn everything the hard way. – MR

  4. Targeted Compliance: Target says it’s ready for EMV, having completed its transition to EMV-enabled devices at the point of sale. What’s more, it took the more aggressive step of using chip and PIN, as opposed to chip and signature, which offers better security for the issuing banks. Yes, the issuing banks benefit, not the consumer. But Target is marketing the upgrade to consumers with videos showing how to use EMV ‘chipped’ cards – which need to stay in the card reader for a few seconds, unlike mag stripe cards. I think Target should be congratulated for going straight to chip and PIN, although it’s probably not going to yield much loss prevention, as most chip cards are being issued without a PIN code. But the real question customers and investors should be asking is “Is Target still passing PAN data from the terminal in the clear?” Yep: being EMV compliant does not mean credit card data is secured with Point to Point Encryption (P2PE). One step forward, one step back. Which leaves us in the same place we started. Sigh. – AL

  5. Lawyers FTW. Cyber-insurance FML.: You buy cyber-insurance to cover a breach, right? At least to pay the cost of the clean-up. And then your insurer rides a loophole to reject the claim, which basically protects them from having to pay in cases of social engineering. Yup, lawyers are involved and loopholes are found, because that’s what insurance companies do: they try to avoid liability and ultimately force the client into legal action (yes, that’s a pretty cynical view of insurers, but I’ll tell you my healthcare tale of woe sometime, as long as you are paying for the drinks…). At some point in the next 3-4 years some kind of legal precedent will establish whether the insurer is liable. Until then, you are basically rolling the dice. But you don’t have a lot of other options, now do you? – MR

—Mike Rothman

Pragmatic Security for Cloud and Hybrid Networks: Network Security Controls

By Rich

This is the third post in a new series I’m posting for public feedback, licensed by Algosec. Well, that is if they like it – we are sticking to our Totally Transparent Research policy. I’m also live-writing the content on GitHub if you want to provide any feedback or suggestions. Click here for the first post in the series, and here for post two.

Now that we’ve covered the basics of cloud networks, it’s time to focus on the available security controls. Keep in mind that all of this varies between providers and that cloud computing is rapidly evolving and new capabilities are constantly appearing. These fundamentals give you the background to get started, but you will still need to learn the ins and outs of whatever platforms you work with.

What Cloud Providers Give You

Not to sound like a broken record (those round things your parents listened to… no, not the small shiny ones with lasers), but all providers are different. The following options are relatively common across providers, but not necessarily ubiquitous.

  • Perimeter security is traditional network security that the provider totally manages, invisibly to the customers. Firewalls, IPS, etc. are used to protect the provider’s infrastructure. The customer doesn’t control any of it.

    PRO: It’s free, effective, and always there. CON: You don’t control any of it, and it’s only useful for stopping background attacks.

  • Security groups – Think of a security group as a tag you can apply to a network interface/instance (or certain other cloud objects, like a database or load balancer) that applies an associated set of network security rules. Security groups combine the best of network and host firewalls: you get policies that can follow individual servers (or even network interfaces) like a host firewall, but you manage them like a network firewall, and protection is applied no matter what is running inside. You get the granularity of a host firewall with the manageability of a network firewall. They are critical to auto scaling – since you are now spreading your assets all over your virtual network, and instances appear and disappear on demand, you can’t rely on IP addresses to build your security rules. Here’s an example: you can create a “database” security group that only allows access to one specific database port, and only from instances inside a “web server” security group; only web servers in that group can talk to the database servers in theirs. Unlike with a network firewall, the database servers can’t talk to each other, because they aren’t in the web server group (remember, rules are applied per server, not per subnet, although some providers support both). As new databases pop up, the right security is applied as long as they have the tag. Unlike with host firewalls, you don’t need to log into servers to make changes; everything is much easier to manage. Not all providers use this term, but the concept of security rules as a policy set you can apply to instances is relatively consistent.

    Security groups do vary between providers. Amazon, for example, is default deny and only supports allow rules. Microsoft Azure, however, supports rules that more closely resemble those of a traditional firewall, with both allow and block options.

    PRO: They’re free, they work hand in hand with auto scaling, and they are default deny. They’re very granular but also very easy to manage. They are the core of cloud network security. CON: They usually support allow rules only (you can’t explicitly deny), they provide only basic firewalling, and you can’t manage them with the tools you are already used to.

  • ACLs (Access Control Lists) – While security groups work at the instance (or object) level, ACLs restrict communications between subnets in your virtual network. Not all providers offer them, and they exist more to handle legacy network configurations (when you need a restriction that matches what you might have in your existing data center) than “modern” cloud architectures (which typically ignore or avoid them). In some cases you can use them to get around the limitations of security groups, depending on your provider.

    PRO: ACLs can isolate traffic between virtual network segments and can create both allow and deny rules. CON: They’re not great for auto scaling and don’t apply to specific instances. You also lose some powerful granularity.

    By default nearly all cloud providers launch your assets with default-deny on all inbound traffic. Some might automatically open a management port from your current location (based on IP address), but that’s about it. Some providers may use the term ACL to describe what we called a security group. Sorry, it’s confusing, but blame the vendors, not your friendly neighborhood analysts.
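The security group model described above can be sketched in a few lines. This is a hedged illustration of the default-deny, allow-rules-only behavior the text attributes to Amazon; the group names, port numbers, and rule table are all invented for the example.

```python
# Toy model of security-group evaluation: default deny, allow rules only.
# A rule says which source groups may reach a (destination group, port)
# pair. Group names and ports are illustrative, not any provider's API.

RULES = {
    # (destination group, port) -> set of source groups allowed in
    ("database", 5432): {"web server"},
}

def allowed(src_group, dst_group, port):
    """Default deny: permit traffic only if an explicit allow rule matches."""
    return src_group in RULES.get((dst_group, port), set())

assert allowed("web server", "database", 5432)    # web -> db: permitted
assert not allowed("database", "database", 5432)  # db -> db: denied
assert not allowed("web server", "database", 22)  # wrong port: denied
```

Note how the second assertion captures the point made earlier: two database servers in the same group still can't talk to each other, because rules are evaluated per instance against its group memberships, not per subnet.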

Commercial Options

There are a number of add-ons you can buy through your cloud provider, or buy and run yourself.

  • Physical security appliances: The provider will provision an old-school piece of hardware to protect your assets. These are mostly just seen in VLAN-based providers and are considered pretty antiquated. They may also be used in private (on premise) clouds where you control and run the network yourself, which is out of scope for this research.

    PRO: They’re expensive, but they’re something you are used to managing. They keep your existing vendor happy? Look, it’s really all cons on this one, unless you’re a cloud provider, in which case this paper isn’t for you.

  • Virtual appliances are a virtual machine version of your friendly neighborhood security appliance and must be configured and tuned for the cloud platform you are working on. They can provide more advanced security – such as IPS, WAF, NGFW – than the cloud providers typically offer. They’re also useful for capturing network traffic, which providers tend not to support.

    PRO: They enable more advanced network security, and you can manage them the same way as the on-premise versions of the same tools. CON: Cost can be a concern, since they consume resources like any other virtual server; they constrain your architectures; and they may not play well with auto scaling and other cloud-native features.

  • Host security agents are software agents you build into your images that run in your instances and provide network security. This could include IDS, IPS or other features that are beyond basic firewalling. We recommend lightweight agents with remote management. The agents (and management platform) need to be designed for use in cloud computing since auto scaling and portability will break traditional tools.

    PRO: Like virtual appliances, host security agents can offer features missing from your provider. With a good management system they can be extremely flexible, and they usually include capabilities beyond network security. They’re a great option for monitoring network traffic. CON: You need to make sure they are installed and running in all your instances, and they’re not free. They also won’t work well unless they are designed for the cloud.

A note on monitoring: none of the major providers offer packet-level network monitoring, and many don’t offer any network monitoring at all. If you need that, consider using host agents or virtual appliances.

To review, your network security controls, no matter what the provider calls them, nearly always fall into 5 buckets:

  • Perimeter security the provider puts in place, that you never see or control.
  • Software firewalls built into the cloud platform (security groups) that protect cloud assets (like instances), offer basic firewalling, and are designed for auto scaling and other cloud-specific uses.
  • Lower-level Access Control Lists for controlling access into, out of, and between the subnets in your virtual cloud network.
  • Virtual appliances to add the expanded features of your familiar network security tools, such as IDS/IPS, WAF, and NGFW.
  • Host security agents to embed in your instances.

Advanced Options on the Horizon

We know some niche vendors already offer more advanced network security built into their platforms, such as IPS, and we suspect the major providers will eventually offer similar options. We don’t recommend picking a cloud provider based on these, but it does mean you may get more options in the future.


Tuesday, September 22, 2015

Pragmatic Security for Cloud and Hybrid Networks: Cloud Networking 101

By Rich

This is the second post in a new series I’m posting for public feedback, licensed by Algosec. Well, that is if they like it – we are sticking to our Totally Transparent Research policy. I’m also live-writing the content on GitHub if you want to provide any feedback or suggestions. Click here for the first post in the series.

There isn’t one canonical cloud networking stack out there; each cloud service provider uses their own mix of technologies to wire everything up. Some of these might use known standards, tech, and frameworks, while others might be completely proprietary and so secret that you, as the customer, don’t ever know exactly what is going on under the hood.

Building cloud scale networks is insanely complex, and the different providers clearly see networking capabilities as a competitive differentiator.

So instead of trying to describe all the possible options, we’ll keep things at a relatively high level and focus on common building blocks we see relatively consistently on the different platforms.

Types of Cloud Networks

When you shop providers, cloud networks roughly fit into two buckets:

  • Software Defined Networks (SDN) that fully decouple the virtual network from the underlying physical networking and routing.
  • VLAN-based Networks that still rely on the underlying network for routing, lacking the full customization of an SDN.

Most providers today offer full SDNs of different flavors, so we’ll focus more on those, but we do still encounter some VLAN architectures and need to cover them at a high level.

Software Defined Networks

As we mentioned, Software Defined Networks are a form of virtual networking that (usually) takes advantage of special features in routing hardware to fully abstract the virtual network you see from the underlying physical network. To your instance (virtual server) everything looks like a normal network. But instead of connecting to a normal network interface it connects to a virtual network interface which handles everything in software.

SDNs don’t work the same as a physical network (or even an older virtual network). For example, in an SDN you can create two networks that use the same address spaces and run on the same physical hardware but never see each other. You can create an entirely new subnet not by adding hardware but with a single API call that “creates” the subnet in software.

How do they work? Ask your cloud provider. Amazon Web Services, for example, intercepts every packet, wraps it and tags it, and uses a custom mapping service to figure out where to actually send the packet over the physical network with multiple security checks to ensure no customer ever sees someone else’s packet. (You can watch a video with great details at this link). Your instance never sees the real network and AWS skips a lot of the normal networking (like ARP requests/caching) within the SDN itself.
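To make the mapping idea concrete, here's a toy sketch (our own illustration, not AWS's actual implementation) of why two tenants can reuse the same address space: every lookup is keyed on the tenant as well as the virtual IP, so identical ranges never collide.

```python
# Toy model of an SDN mapping service: virtual (tenant, IP) pairs map to
# physical hosts, so identical address ranges never collide across tenants.

class MappingService:
    def __init__(self):
        self._table = {}  # (tenant_id, virtual_ip) -> physical_host

    def register(self, tenant_id, virtual_ip, physical_host):
        self._table[(tenant_id, virtual_ip)] = physical_host

    def route(self, tenant_id, virtual_ip):
        # A real SDN would also run security checks to ensure the sender
        # is allowed to reach this destination.
        key = (tenant_id, virtual_ip)
        if key not in self._table:
            raise LookupError("no such destination for this tenant")
        return self._table[key]

svc = MappingService()
# Both tenants use 10.0.0.5, yet packets land on different physical hosts.
svc.register("tenant-a", "10.0.0.5", "host-17")
svc.register("tenant-b", "10.0.0.5", "host-42")
print(svc.route("tenant-a", "10.0.0.5"))  # host-17
print(svc.route("tenant-b", "10.0.0.5"))  # host-42
```

All the tenant and host names above are hypothetical; the point is simply that the physical network routes on the mapping, not on the virtual address itself.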

SDN allows you to take all your networking hardware, abstract it, pool it together, and then allocate it however you want. On some cloud providers, for example, you can allocate an entire class B network with multiple subnets, routed to the Internet behind NAT, in just a few minutes or less. Different cloud providers use different underlying technologies and further complicate things since they all offer different ways of managing the network.

Why make things so complicated? Actually, it makes management of your cloud network much easier, while allowing cloud providers to give customers a ton of flexibility to craft the virtual networks they need for different situations. The providers do the heavy lifting, and you, as the consumer, work in a simplified environment. Plus, it handles issues unique to cloud, like provisioning network resources faster than existing hardware can handle configuration changes (a very real problem), or multiple customers needing the same private IP address ranges to better integrate with their existing applications.

Virtual LANs (VLANs)

Although they do not offer the same flexibility as SDNs, a few providers still rely on VLANs. Evaluate your own needs, but VLAN-based cloud services should be considered outdated compared to their SDN-based counterparts.

VLANs let you segment the network and isolate and filter traffic – in effect just cutting off your own slice of the existing network rather than creating your own virtual environment. Because VLANs are built into standard networking hardware, they used to be where most providers started when creating cloud computing: no special software required. But they come with real limitations:

  • You can’t do SDN-level things like create two networks on the same hardware with the same address range.
  • Customers don’t get much control over their addresses and routing.
  • VLANs can’t be trusted for security segmentation.
  • They scale and perform terribly when you plop a cloud on top of them.

These limitations are why VLANs are mostly being phased out of cloud computing.

Defining and Managing Cloud Networks

While we like to think of one big cloud out there, there is more than one kind of cloud network, and several technologies that support them. Each provides different features and presents different customization options. Management also varies between vendors, but certain basic characteristics hold across them. Different providers use different terminology, so we’ve tried our best to pick terms that will make sense once you look at particular offerings.

Cloud Network Architectures

An understanding of the types of cloud network architectures and the different technologies that enable them is essential to fitting your needs with the right solution.

There are two basic types of cloud network architectures.

  • Public cloud networks are Internet-facing. You connect to your instances/servers via the public Internet, with no special routing needed; every instance has a public IP address.
  • Private cloud networks (sometimes called “virtual private cloud”) use private IP addresses like you would use on a LAN. You have to have a back-end connection — like a VPN — to connect to your instances. Most providers allow you to pick your address ranges so you can use these private networks as an extension of your existing network. If you need to bridge traffic to the Internet, you route it back through your data center or you use Network Address Translation to a public network segment, similarly to how home networks use NAT to bridge to the Internet.
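That NAT bridging is the same trick your home router performs: a translation table maps each private source address and port to a port on the shared public IP. A minimal illustrative sketch (all addresses hypothetical):

```python
import itertools

class Nat:
    """Toy NAT: maps private (ip, port) pairs to ports on one public IP."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self._ports = itertools.count(40000)  # next available public port
        self._out = {}  # (private_ip, private_port) -> public_port

    def translate(self, private_ip, private_port):
        # Reuse the existing mapping for a known flow; allocate otherwise.
        key = (private_ip, private_port)
        if key not in self._out:
            self._out[key] = next(self._ports)
        return (self.public_ip, self._out[key])

nat = Nat("203.0.113.9")
print(nat.translate("10.0.1.5", 51000))  # ('203.0.113.9', 40000)
print(nat.translate("10.0.1.6", 51000))  # ('203.0.113.9', 40001)
```

In a cloud provider this table lives in the virtual network layer rather than in a box on your network, but the effect for your instances is the same.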

These are enabled and supported by the following technologies.

  • Internet connectivity (Internet Gateway) which hooks your cloud network to the Internet. You don’t tend to directly manage it, your cloud provider does it for you.
  • Internal Gateways/connectivity connect your existing datacenter to your private network in the cloud. These are often VPN-based, but instead of managing the VPN server yourself, the cloud provider handles it (you just manage the configuration). Some providers also support direct connections through partner broadband network providers, routing traffic between your data center and the private cloud network over leased lines instead of a VPN.
  • Virtual Private Networks - Instead of using the cloud provider’s, you can always set up your own, assuming you can bridge the private and public networks in the cloud provider. This kind of setup is very common, especially if you don’t want to directly connect your data center and cloud, but still want a private segment and allow access to it for your users, developers and administrators.

Cloud providers all break up their physical infrastructure differently. Typically they host data centers (which might actually be collections of multiple data centers clumped together) in different regions. A region or location is the physical location of the data center(s), while a zone is a sub-section of a region used for designing availability. This structure matters for:

  • Performance - By allowing you to take advantage of physical proximity, you can improve performance of applications that conduct high levels of traffic.
  • Regulatory requirements - Flexibility in the geographic location of your data stores can help meet local legal and regulatory requirements around data residency.
  • Disaster recovery and maintaining availability - Most providers charge for some or all network traffic if you communicate across regions and locations, which would make disaster recovery expensive. That’s why they provide local “zones” that break out an individual region into isolated pieces with their own network, power, and so forth. A problem might take out one zone in a region, but shouldn’t take out any others, giving customers a way to build for resiliency without having to span continents or oceans. Plus, you don’t tend to pay for the local network traffic between zones.

Managing Cloud Networks

Managing these networks depends on all of the components listed above. Each vendor will have its own set of tools based on certain general principles.

  • Everything is managed via APIs, which are typically REST (Representational State Transfer) based.
  • You can fully define and change everything remotely via these APIs and it happens nearly instantly in most cases.
  • Cloud platforms also have web UIs, which are simply front ends for the same APIs you might code to but tend to automate a lot of the heavy lifting for you.
  • Key for security is protecting these management interfaces since someone can otherwise completely reconfigure your network while sitting at a hipster coffee shop, making them, by definition, evil (you can usually spot them by the ski masks, according to our clip art library).

Hybrid Cloud Architectures

As mentioned, your data center may be connected to the cloud. Why? Sometimes you need more resources and you don’t want them on the public Internet. This is a common practice for established companies that aren’t starting from scratch and need to mix and match resources.

There are two ways to accomplish this.

  • VPN connections - You connect to the cloud via a dedicated VPN, which is nearly always hardware-based and hooked into your local routers to span traffic to the cloud. The cloud provider, as mentioned, handles their side of the VPN, but you still have to configure some of it. All traffic goes over the Internet but is isolated.
  • Direct network connections - These are typically set up over leased lines. They aren’t necessarily more secure and are much more expensive but they can reduce latency, or make your router-hugging network manager feel happy.

Routing Challenges

While cloud services can provide remarkable flexibility, they also require plenty of customization and present their own challenges for security.

Nearly every Infrastructure as a Service provider supports auto scaling, one of the most important features at the core of cloud computing’s benefits. You define rules in your cloud for when to add or remove instances of a server. For example, you can set a rule to add servers when you hit 80 percent CPU load, and terminate those instances when load drops (clearly you need to architect appropriately for this kind of behavior).
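Stripped of the provider machinery, such a rule is just a threshold policy. A hypothetical sketch of the decision logic (real services layer cooldowns and health checks on top of this; all the thresholds here are illustrative):

```python
def scaling_decision(cpu_load, instance_count, *,
                     high=0.80, low=0.30, min_instances=2, max_instances=10):
    """Return how many instances to add (positive) or remove (negative).

    Thresholds are illustrative; a real auto scaling service also applies
    cooldown periods and health checks before acting.
    """
    if cpu_load > high and instance_count < max_instances:
        return 1   # busy: add a server
    if cpu_load < low and instance_count > min_instances:
        return -1  # idle: remove one
    return 0       # steady state: do nothing

assert scaling_decision(0.85, 4) == 1
assert scaling_decision(0.10, 4) == -1
assert scaling_decision(0.50, 4) == 0
```

The provider evaluates something like this continuously on your behalf, which is exactly why your server population (and its IP addresses) changes without anyone logging in.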

This creates application elasticity since your resources can automatically adapt based on demand instead of having to leave servers running all the time just in case demand increases. Your consumption now aligns with demand, instead of traditional architectures, which leave a lot of hardware sitting around, unused, until demand is high enough. This is the heart of IaaS. This is what you’re paying for.

Such flexibility creates complexity. If you think about it, you won’t necessarily know the exact IP address of all your servers since they may appear and disappear within minutes. You may even design in complexity when you design for availability — by creating rules to keep multiple instances in multiple subnets across multiple zones available in case one of them drops out. Within those virtual subnets, you might have multiple different types of instances with different security requirements. This is pretty common in cloud computing.

Fewer static routes, highly dynamic addressing and servers that might only “live” for less than an hour… all this challenges security. It requires new ways of thinking, which is what the rest of this paper will focus on.

Our goal here is to start getting you comfortable with how different cloud networks can be. On the surface, depending on your provider, you may still be managing subnets, routing tables, and ACLs. But underneath, these are now (probably) database entries implemented in software, not the hardware you might be used to.


Wednesday, September 16, 2015

Pragmatic Security for Cloud and Hybrid Networks: Introduction

By Rich

This is the start of a new series I’m posting for public feedback, licensed by Algosec. Well, that is if they like it – we are sticking to our Totally Transparent Research policy. I’m also live-writing the content on GitHub if you want to provide any feedback or suggestions. With that, here’s the content…

For a few decades we have been refining our approach to network security. Find the boxes, find the wires connecting them, drop a few security boxes between them in the right spots, and move on. Sure, we continue to advance the state of the art in exactly what those security boxes do, and we constantly improve how we design networks and plug everything together, but overall change has been incremental. How we think about network security doesn’t change – just some of the particulars.

Until you move to the cloud.

While many of the fundamentals still apply, cloud computing releases us from the physical limitations of those boxes and wires by fully abstracting the network from the underlying resources. We move into entirely virtual networks, controlled by software and APIs, with very different rules. Things may look the same on the surface, but dig a little deeper and you quickly realize that network security for cloud computing requires a different mindset, different tools, and new fundamentals.

Many of which change every time you switch cloud providers.

The challenge of cloud computing and network security

Cloud networks don’t run magically on pixie dust, rainbows, and unicorns – they rely on the same old physical network components we are used to. The key difference is that cloud customers never access the ‘real’ network or hardware. Instead they work inside virtual constructs – that’s the nature of the cloud.

Cloud computing uses virtual networks by default. The network your servers and resources see is abstracted from the underlying physical resources. When your server gets an IP address, that isn’t really an IP address on the routing hardware – it’s a virtual IP address on a virtual network. Everything is handled in software, and most of these virtual networks are Software Defined Networks (SDN). We will go over SDN in more depth in the next section.

These networks vary across cloud providers, but they are all fundamentally different from traditional networks in a few key ways:

  • Virtual networks don’t provide the same visibility as physical networks because packets don’t move around the same way. We can’t plug a wire into the network to grab all the traffic – there is no location all traffic traverses, and much of the traffic is wrapped and encrypted anyway.
  • Cloud networks are managed via Application Programming Interfaces – not by logging in and provisioning hardware the old-fashioned way. A developer has the power to stand up an entire class B network, completely destroy an entire subnet, or add a network interface to a server and bridge to an entirely different subnet on a different cloud account, all within minutes with a few API calls.
  • Cloud networks change faster than physical networks, and constantly. It isn’t unusual for a cloud application to launch and destroy dozens of servers in under an hour – faster than traditional security and network tools can track – or even build and destroy entire networks just for testing.
  • Cloud networks look like traditional networks, but aren’t. Cloud providers tend to give you things that look like routing tables and firewalls, but don’t work quite like your normal routing tables and firewalls. It is important to know the differences.

Don’t worry – the differences make a lot of sense once you start digging in, and most of them provide better security that’s more accessible than on a physical network, so long as you know how to manage them.
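To make the “looks like a firewall, but isn’t quite” point concrete, here’s a toy model of how most cloud security groups evaluate traffic – default deny, allow rules only, no rule ordering or explicit deny actions. Details vary by provider, so treat this purely as an illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    protocol: str  # 'tcp', 'udp', ...
    port: int
    source: str    # simplified here: '0.0.0.0' = anywhere, else an IP prefix

rules = [
    Rule("tcp", 443, "0.0.0.0"),  # HTTPS from anywhere
    Rule("tcp", 22, "10.0."),     # SSH only from the internal 10.0.x.x range
]

def allowed(rules, protocol, port, source_ip):
    # Default deny: traffic passes only if some rule matches. Unlike many
    # hardware firewalls there is no rule ordering and no 'deny' action --
    # anything unmatched is simply dropped.
    return any(
        r.protocol == protocol and r.port == port
        and (r.source == "0.0.0.0" or source_ip.startswith(r.source))
        for r in rules
    )

assert allowed(rules, "tcp", 443, "198.51.100.7")     # web open to all
assert allowed(rules, "tcp", 22, "10.0.3.4")          # internal SSH ok
assert not allowed(rules, "tcp", 22, "198.51.100.7")  # external SSH dropped
```

The key habit to unlearn is rule-order reasoning: with an allow-only model, “did I forget a deny?” becomes “did I grant more than I meant to?”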

The role of hybrid networks

A hybrid network bridges your existing network into your cloud provider. If, for example, you want to connect a cloud application to your existing database, you can connect your physical network to the virtual network in your cloud.

Hybrid networks are extremely common, especially as traditional enterprises begin migrating to cloud computing and need to mix and match resources instead of building everything from scratch. One popular example is setting up big data analytics in your cloud provider, where you only pay for processing and storage time, so you don’t need to buy a bunch of servers you will only use once a quarter.

But hybrid networks complicate management, both in your data center and in the cloud. Each side uses a different basic configuration and security controls, so the challenge is to maintain consistency across both, even though the tools you use – such as your nifty next generation firewall – might not work the same (if at all) in both environments.

This paper will explain how cloud network security is different, and how to pragmatically manage it for both pure cloud and hybrid cloud networks. We will start with some background material and cloud networking 101, then move into cloud network security controls, and specific recommendations on how to use them. It is written for readers with a basic background in networking, but if you made it this far you’ll be fine.


Monday, September 14, 2015

Building Security into DevOps [New Series]

By Adrian Lane

I have been in and around software development my entire professional career. As a new engineer, as an architect, and later as the guy responsible for the whole show. And I have seen as many failed software deliveries – late, low quality, off-target, etc. – as successes. Human dysfunction and miscommunication seem to creep in everywhere, and Murphy’s Law is in full effect. Getting engineers to deliver code on time was just one dimension of the problem – the interaction between development and QA was another, and how they could both barely contain their contempt for IT was yet another. Low-quality software and badly managed deployments make productivity go backwards. Worse, repeat failures and lack of reliability create tension and distrust between all the groups in a company, to the point where they become rival factions. Groups of otherwise happy, well-educated, and well-paid people can squabble like a group of dysfunctional family members during a holiday get-together.

Your own organizational dysfunction can have a paralytic effect, dropping productivity to nil. Most people are so entrenched in traditional software development approaches that it’s hard to see development ever getting better. And when firms talk about deploying code every day instead of every year, or being fully patched within hours, or detection and recovery from a bug within minutes, most developers scoff at these notions as pure utopian fantasy. That is, until they see these things in action – then their jaws drop.

With great interest I have been watching and participating in the DevOps approach to software delivery. So many organizational issues I’ve experienced can be addressed with DevOps approaches. So often it has seemed like IT infrastructure and tools worked against us, not for us, and now DevOps helps address these problems. And Security? It’s no longer the first casualty of the war for new features and functions – instead it becomes systemized in the delivery process. These are the reasons we expect DevOps to be significant for most software development teams in the future, and to advance security testing within application development teams far beyond where it’s stuck today. So we are kicking off a new series: Building Security into DevOps – focused not on implementation of DevOps – there are plenty of other places you can find those details – but instead on the security integration and automation aspects. To be clear, we will cover some basics, but our focus will be on security testing in the development and deployment cycle.

For readers new to the concept, what is DevOps? It is an operational framework that promotes software consistency and standardization through automation. Its focus is on using automation to do a lot of the heavy lifting of building, testing, and deployment. Scripts build organizational memory into automated processes to reduce human error and force consistency. DevOps helps address many of the nightmare development issues around integration, testing, patching, and deployment – by both breaking down the barriers between different development teams, and also prioritizing things that make software development faster and easier. Better still, DevOps offers many opportunities to integrate security tools and testing directly into processes, and enables security to have equal focus with new feature development.

That said, security integrates with DevOps only to the extent that development teams build it in. Automated security testing, just like automated application building and deployment, must be factored in along with the rest of the infrastructure.

And that’s the problem. Software developers traditionally do not embrace security. It’s not because they do not care about security – but historically they have been incentivized to focus on delivery of new features and functions. Security tools don’t easily integrate with classic development tools and processes, often flood development task queues with unintelligible findings, and lack development-centric filters to help developers prioritize. Worse, security platforms and the security professionals who recommended them have been difficult to work with – often failing to offer API-layer integration support.

The pain of security testing, and the problem of security controls being outside the domain of developers and IT staff, can be mitigated with DevOps. This paper will help Security integrate into DevOps to ensure applications are deployed only after security checks are in place and applications have been vetted. We will discuss how automation and DevOps concepts allow for faster development with integrated security testing, and enable security practitioners to participate in delivery of security controls. Speed and agility are available to both teams, helping to detect security issues earlier, with faster recovery times. This series will cover:

  • The Inexorable Emergence of DevOps: DevOps is one of the most disruptive trends to hit development and deployment of applications. This section will explain how and why. We will cover some of the problems it solves, how it impacts the organization as a whole, and its impact on SDLC.
  • The Role of Security in DevOps: Here we will discuss security’s role in the DevOps framework. We’ll cover how people and technology become part of the process, and how they can contribute to DevOps to improve the process.
  • Integrating Security into DevOps: Here we outline DevOps and show how to integrate security testing into the DevOps operational cycle. To provide a frame of reference we will walk through the facets of a secure software development lifecycle, show where security integrates with day-to-day operations, and discuss how DevOps opens up new opportunities to deliver more secure software than traditional models. We will cover the changes that enable security to blend into the framework, as well as Rugged Software concepts and how to design for failure.
  • Tools and Testing in Detail: As in our other secure software development papers, we will discuss the value of specific types of security tools which facilitate the creation of secure software and how they fit within the operational model. We will discuss some changes required to automate and integrate these tests within build and deployment processes.
  • The New Agile: DevOps in Action: We will close this research series with a look at DevOps in action, what to automate, a sample framework to illustrate continuous integration and validation, and the meaning of software defined security.

Once again, we encourage your input – perhaps more than for our other recent research series. We are still going through interviews, and we have not been surprised to hear that many firms we speak with are just now working on continuous integration. Continuous deployment and DevOps are the vision, but many organizations are not there yet. If you are on this journey and would like to comment, please let us know – we would love to speak with you about your experiences. Your input makes our research better, so reach out if you’d like to participate.

Next up: The Inexorable Emergence of DevOps

—Adrian Lane

Friday, September 04, 2015

EMV Migration and the Changing Payments Landscape [New Paper]

By Adrian Lane

With the upcoming EMV transition deadline for merchants fast approaching, we decided to take an in-depth look at what this migration is all about – and particularly whether it is really in merchants’ best interests to adopt EMV. We thought it would be a quick, straightforward set of conversations. We were wrong.

On occasion these research projects surprise us. None more so than this one. These conversations were some of the most frank and open we have had at Securosis. Each time we vetted a controversial opinion with other sources, we learned something else new along the way. It wasn’t just that we heard different perspectives – we got an earful on every gripe, complaint, irritant, and oxymoron in the industry. We also developed a real breakdown of how each stakeholder in the payment industry makes its money, and when EMV would change things. We got a deep education on what each of the various stakeholders in the industry really thinks this EMV shift means, and what they see behind the scenes – both good and bad. When you piece it all together, the pattern that emerges is pretty cool!

Only when you look beyond the terminal migration and examine the long-term implications does the value proposition become clear. During our research, as we dug into the less advertised systemic advances in the full EMV specification for terminals and tokenization, we realized this migration is more about meeting future customer needs than about a short-term fraud or liability problem. The migration is intended to bring payment into the future, and includes a wealth of advantages for merchants, delivered with minimal to no operational disruption.

And as we are airing a bit of dirty laundry – anonymously, but to underscore points in the research – we understand this research will be controversial. Most stakeholders will have problems with some of the content, which is why when we finished the project, we were fairly certain nobody in the industry would touch this research with a 20’ pole. We attempted to fairly represent all sides in the debates around the EMV rollout, and to objectively explain the benefits and deficits. When you put it all together, we think this paints a good picture of where the industry as a whole is going. And from our perspective, it’s all for the better!

Here’s a link directly to the paper, and to its landing page in our research library.

We hope you enjoy reading it as much as we enjoyed writing it!

—Adrian Lane

Wednesday, August 26, 2015

Incite 8/26/2015: Epic Weekend

By Mike Rothman

Sometimes I have a weekend when I am just amazed. Amazed at the fun I had. Amazed at the connections I developed. And I’m aware enough to be overcome with gratitude for how fortunate I am. A few weekends ago I had one of those experiences. It was awesome.

It started on a Thursday. After a whirlwind trip to the West Coast to help a client out with a short-term situation (I was out there for 18 hours), I grabbed a drink with a friend of a friend. We ended up talking for 5 hours and closing down the bar/restaurant. At one point we had to order some food because they were about to close the kitchen. It’s so cool to make new friends and learn about interesting people with diverse experiences.

The following day I got a ton of work done and then took XX1 to the first Falcons pre-season game. Even though it was only a pre-season game it was great to be back in the Georgia Dome. But it was even better to get a few hours with my big girl. She’s almost 15 now and she’ll be driving soon enough (Crap!), so I know she’ll prioritize spending time with her friends in the near term, and then she’ll be off to chase her own windmills. So I make sure to savor every minute I get with her.

On Saturday I took the twins to Six Flags. We rode roller coasters. All. Day. 7 rides on 6 different coasters (we did the Superman ride twice). XX2 has always been fearless and willing to ride any coaster at any time. I don’t think I’ve seen her happier than when she was tall enough to ride a big coaster for the first time. What’s new is the Boy. In April I forced him onto a big coaster up in New Jersey. He wasn’t a fan. But something shifted over the summer, and now he’s the first one to run up and get in line. Nothing makes me happier than to hear him screaming out F-bombs as we careen down the first drop. That’s truly my happy place.

If that wasn’t enough, I had to be on the West Coast (again) Tuesday of the following week, so I burned some miles and hotel points for a little detour to Denver to catch both Foo Fighters shows. I had a lot of work to do, so the only socializing I did was in the pit at the shows (sorry Denver peeps). But the concerts were incredible, I had good seats, and it was a great experience.


So my epic weekend was epic. And best of all, I was very conscious that not a lot of people get to do these kinds of things. I was so appreciative of where I am in life. That I have my health, my kids want to spend time with me, and they enjoy doing the same things I do. The fact that I have a job that affords me the ability to travel and see very cool parts of the world is not lost on me either. I guess when I bust out a favorite saying of mine, “Abundance begins with gratitude,” I’m trying to live that every day.

I realize how lucky I am. And I do not take it for granted. Not for one second.


Photo credit: In the pit picture by MSR, taken 8/17/2015

Thanks to everyone who contributed to my Team in Training run to support the battle against blood cancers. We’ve raised almost $6000 so far, which is incredible. I am overwhelmed with gratitude. You can read my story in a recent Incite, and then hopefully contribute (tax-deductible) whatever you can afford. Thank you.

The fine folks at the RSA Conference posted the talk Jennifer Minella and I did on mindfulness at the 2014 conference. Take an hour and check it out on YouTube. Your emails, alerts, and Twitter timeline will be there when you get back.

Securosis Firestarter

Have you checked out our new video podcast? Rich, Adrian, and Mike get into a Google Hangout and… hang out. We talk a bit about security as well. We try to keep these to 15 minutes or less, and usually fail.

Heavy Research

We are back at work on a variety of blog series, so here is a list of the research currently underway. Remember you can get our Heavy Feed via RSS, with our content in all its unabridged glory. And you can get all our research papers too.

Building a Threat Intelligence Program

EMV and the Changing Payment Space

Network Security Gateway Evolution

Recently Published Papers

Incite 4 U

  1. Can ‘em: If you want better software quality, fire your QA team – that’s what one of Forrester’s clients told Mike Gualtieri. That tracks with what we have been seeing from other firms, specifically when the QA team is mired in an old way of doing things and won’t work with developers to write test scripts and integrate them into the build process. This is one of the key points we learned earlier this year about the failure of documentation, where firms moving to Agile were failing as their QA teams insisted on hundreds of pages of specifications for how and what to test. That’s the opposite of Agile and no bueno! Steven Maguire hit on this topic back in January when he discussed documentation and communication making QA a major impediment to moving to more Agile – and more automated – testing processes. Software development is undergoing a radical transformation, with RESTful APIs, DevOps principles, and cloud & virtualization technologies enabling far greater agility and efficiency than ever before. And if you’re in IT or Operations, take note, because these disruptive changes will hit you as well. Upside the head. – AL

  2. Security technologies never really die… Sometimes you read an article and can’t tell if the writer is just trolling you. I got that distinct feeling reading Roger Grimes’ 10 security technologies destined for the dustbin. Some are pretty predictable (SSL being displaced by TLS, IPSec), which is to be expected. And obvious, like calling for AV scanners to go away, although claiming they will die in the wake of a whitelisting revolution is curious. Others are just wrong. He predicts the demise of firewalls because of an increasing amount of encrypted traffic. Uh, no. You’ll have to deal with the encrypted traffic, but access control on the network (which is what a firewall does) is here to stay. He says anti-spam will go away because high-assurance identities will allow us to blacklist spammers. Uh huh. Another good one is that you’ll no longer collect huge event logs. I don’t think his point is that you won’t collect any logs, but that vendors will make them more useful. What about compliance? And forensics? Those require more granular data collection. It’s interesting to read these thoughts, but if he bats .400 I’ll be surprised. – MR

  3. Don’t cross the streams: In a recent post on Where do PCI-DSS and PII Intersect?, Infosec Institute makes a case for dealing with PII under the same set of controls used for PCI-DSS V3. We take a bit of a different approach: Decide whether you need the data, and if not, use a surrogate like masking or tokenization – maybe even get rid of the data entirely. It’s hard to steal what you don’t have. Just because you’ve tokenized PAN data (CCs) does not mean you can do the same with PII – it depends on how the data is used. Including PII in PAN data reports is likely to confuse auditors and make things more complicated. And if you’re using encryption or dynamic masking, it will take work to apply it to different data sets. The good news is that if you are required to comply with PCI-DSS, you have likely already invested in security products and staff with experience in dealing with sensitive data. You still need to figure out how to handle data security, understanding that what you do for PII will likely differ from what you do for in-scope PCI data, because the use cases are different. – AL
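To make the surrogate idea concrete, here is a minimal Python sketch of tokenization and masking. The `TokenVault` class and `mask_pan` helper are hypothetical illustrations of the concepts, not any product's API – real deployments use a hardened, audited tokenization service rather than an in-memory map.

```python
import secrets

class TokenVault:
    """Illustrative in-memory tokenization vault (hypothetical sketch).

    A token is a random surrogate with no mathematical relationship to
    the original value, so stealing tokens reveals nothing without the
    vault itself.
    """

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Return the existing token for a value, or mint a new random one.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Reversible only through the vault, which stays in scope.
        return self._token_to_value[token]

def mask_pan(pan: str) -> str:
    """Masking: expose only the last four digits for display/reporting."""
    return "*" * (len(pan) - 4) + pan[-4:]

if __name__ == "__main__":
    vault = TokenVault()
    pan = "4111111111111111"
    token = vault.tokenize(pan)
    print(token, vault.detokenize(token) == pan)
    print(mask_pan(pan))  # ************1111
```

The point of the sketch is the scope reduction: systems that hold only tokens or masked values never touch the real PAN, which is exactly the "hard to steal what you don't have" argument above.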

  4. Applying DevOps to Security: Our pal Andrew Storms offers a good selection of ideas on ITProPortal for taking lessons learned in DevOps and applying them to security. His points about getting everyone on board and working in iterations hit home. Those are prominent topics as we work with clients to secure their newfangled continuous deployment environments. He also has a good list of principles we should be following anyway, such as encrypting everything (where feasible), planning for failure, and automating everything. These new development and operational models are going to take root sooner rather than later. If you want a head start on where your career is going, start reading stuff like this now. – MR

—Mike Rothman

Monday, August 17, 2015

Applied Threat Intelligence [New Paper]

By Mike Rothman


Threat Intelligence remains one of the hottest areas in security. It promises to help organizations take advantage of information sharing, and early results have been encouraging. We have researched Threat Intelligence deeply, focusing on where to get TI and the differences between gathering data from networks, endpoints, and general Internet sources. But we keep coming back to the fact that having data is not enough – not now and not in the future.

It is easy to buy data but hard to take full advantage of it. Knowing what attacks may be coming at you doesn’t help if your security operations functions cannot detect the patterns, block the attacks, or use the data to investigate possible compromise. Without those capabilities it’s all just more useless data, and you already have plenty of that.

Our Applied Threat Intelligence paper focuses on how to actually use intelligence to solve three common use cases: preventative controls, security monitoring, and incident response. We start with a discussion of what TI is and isn’t, where to get it, and what you need to deal with specific adversaries. Then we dive into use cases.


We would like to thank Intel Security for licensing the content in this paper. Our licensees enable us to provide our research at no cost to you, so we should all thank them. As always, we developed this paper using our objective Totally Transparent Research methodology.

Visit the Applied Threat Intelligence landing page in our research library, or download the paper directly (PDF).

—Mike Rothman