Building Security Into DevOps: Security Integration Points
By Adrian Lane
A couple housekeeping items before I begin today’s post - we’ve had a couple issues with the site so I apologize if you’ve tried to leave comments but could not. We think we have that fixed. Ping us if you have trouble.
Also, I am very happy to announce that Veracode has asked to license this research series on integrating security into DevOps! We are very happy to have them onboard for this one. And it’s support from the community and industry that allows us to bring you this type of research, and all for free and without registration.
For the sake of continuity I’ve decided to swap the order of posts from our original outline. Rather than discuss the role of security folks in a DevOps team, I am going to examine integration of security into code delivery processes. I think it will make more sense, especially for those new to DevOps, to understand the technical flow and how things fit together before getting a handle on their role.
Remember that DevOps is about joining Development and Operations to provide business value. The mechanics of this are incredibly important because they help explain how the two teams work together, and that is what I am going to cover today.
Most of you reading this will be familiar with the concept of ‘nightly builds’, where all code checked in the previous day would be compiled overnight. And you’re just as familiar with the morning ritual of sipping coffee while you read through the logs to see if the build failed, and why. Most development teams have been doing this for a decade or more. The automated build is the first of many steps that companies go through on their way towards full automation of the processes that support code development. The path to DevOps typically runs through two phases: first continuous integration, which manages the building and testing of code, and then continuous deployment, which assembles the entire application stack into an executable environment.
The essence of Continuous Integration (CI) is that developers check in small, iterative advancements to code on a regular basis. For most teams this will involve many updates to the shared source code repository, and one or more ‘builds’ each day. The core idea is smaller, simpler additions where we can more easily - and more often - find defects in the code. Essentially these are Agile concepts, but implemented in processes that drive code instead of processes that drive people (e.g., scrums, sprints). The definition of CI has morphed slightly over the last decade, but in the context of DevOps, CI also implies that code is not only built and integrated with supporting libraries, but also automatically dispatched for testing. Finally, CI in a DevOps context also implies that code modifications are applied not to a branch but to the main body of the code, reducing the complexity and integration nightmares that plague development teams.
Conceptually, this sounds simple, but in practice it requires a lot of supporting infrastructure. It means builds are fully scripted, and the build process occurs as code changes are made. It means upon a successful build, the application stack is bundled and passed along for testing. It means that test code is built prior to unit, functional, regression and security testing, and these tests commence automatically when a new bundle is available. It also means, before tests can be launched, that test systems are automatically provisioned, configured and seeded with the necessary data. These automation scripts must provide monitoring for each part of the process, and must communicate success or failure back to Dev and Operations teams as events occur. Creating the scripts and tools that make all this possible requires operations, testing and development teams to work closely together. And this orchestration does not happen overnight; it’s commonly an evolutionary process that takes months to get the basics in place, and years to mature.
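To make that flow a little more concrete, here is a minimal Python sketch of a scripted pipeline that runs stages in order, reports each result as it happens, and stops at the first failure. It is an illustration only, not how any particular CI server works: the `Pipeline` and `StageResult` names, the stage names, and the `notify` callback are all hypothetical, and the stage bodies are placeholders for real build, provisioning and test scripts.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class StageResult:
    """Outcome of one pipeline stage (hypothetical structure)."""
    name: str
    ok: bool
    detail: str = ""


@dataclass
class Pipeline:
    """Runs stages in order and reports every result through `notify`."""
    notify: Callable[[StageResult], None]
    stages: List[Tuple[str, Callable[[], None]]] = field(default_factory=list)

    def stage(self, name: str, action: Callable[[], None]) -> None:
        # Register a stage; an action signals failure by raising an exception.
        self.stages.append((name, action))

    def run(self) -> bool:
        for name, action in self.stages:
            try:
                action()
                result = StageResult(name, True)
            except Exception as exc:
                result = StageResult(name, False, str(exc))
            # Success or failure is communicated as each event occurs.
            self.notify(result)
            if not result.ok:
                return False  # later stages never see a broken bundle
        return True


def example_pipeline(notify: Callable[[StageResult], None]) -> Pipeline:
    """Assemble the stages named in the text; bodies are placeholders."""
    pipeline = Pipeline(notify=notify)
    for name in ("build", "bundle", "provision-test-systems",
                 "seed-test-data", "unit-tests", "security-tests"):
        pipeline.stage(name, lambda: None)
    return pipeline
```

Calling `example_pipeline(print).run()` would print one `StageResult` per stage. In a real environment the `notify` callback would more likely post to a chat channel or dashboard so that both Dev and Operations see the same events.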
Continuous Deployment looks very similar to CI, but is focused on the release – as opposed to build – of software to end users. It involves a similar set of packaging, testing, and monitoring, but with some additional wrinkles. The following graphic was created by Rich Mogull to show both the flow of code, from check-in to deployment, and many of the tools that provide automation support.
Upon successful completion of a CI cycle, the results feed the Continuous Deployment (CD) process. And CD takes another giant step forward in terms of automation and resiliency. CD continues the theme of building in tools and infrastructure that make development better first, and functions second. CD addresses dozens of issues that plague code deployments, specifically error-prone manual changes and differences in revisions of supporting libraries between production and dev. But perhaps most important is the use of code and infrastructure to control deployments and rollback in the event of errors. We’ll go into more detail in the following sections.
This is far from a complete description, but hopefully it gives you enough of the basic idea of how it works. With the basic mechanics of DevOps in mind, let’s now map security in. What you do today should stand in stark contrast to what you do with DevOps.
Security Integration From An SDLC Perspective
Secure Development Lifecycles (SDLC), sometimes called Secure Software Development Lifecycles, describe different functions within software development. Most people look at the different phases in an SDLC and think ‘Waterfall Development process’, which makes discussing SDLC in conjunction with DevOps seem convoluted. But there are good reasons for doing this: the architecture, design, development, testing and deployment phases of an SDLC map well to roles in the development organization regardless of development process, and they provide a jump-off point for people to take what they know today and morph that into a DevOps framework.
- Operational standards: Typically in the early phases of software development, you’re focused on the big picture of application architecture and how large functional pieces will work. With DevOps, you’re also weaving in the operational standards for the underlying environment. Just as with the code you deploy, you want to make small iterative improvements every day with your operational environment. This will include updates to the infrastructure (e.g., build automation tools, CI tools), but also policies for application stack security, including how patches are incorporated, version synchronization over the entire build chain, leveraging of tools and metrics, configuration management and testing. These standards will form the stories which are sent to the operations team for scripting during the development phase discussed below.
- Security functional requirements: What security tests will you run, which need to be run prior to deployment, and what tools are you going to use to get there? At a minimum, you’ll want to set security requirements for all new code, and decide what the development team will need to test prior to certification. This could mean a battery of unit tests your team must write to check for specific threats, such as the OWASP Top Ten list of vulnerabilities. Or you may choose commercial products: you have a myriad of security tools at your disposal, but not all of them have APIs or the capability to be fully integrated into DevOps. Similarly, many tests do not run as fast as your deployment cycle, so you have some difficult decisions to make - more on parallel security testing below.
- Monitoring and metrics: If you’re going to make small iterative improvements with each release, what needs fixing? What’s going too slow? What is working and how do you prove it? Metrics are key to answering these questions. You will need to think about what data you want to collect, and build this into the CI and CD environment to measure how your scripts and testing perform. You’ll continually evolve the collection and use of metrics, but basic collection and dissemination of data should be in your plan from the get-go.
- Secure design/architecture: DevOps provides a means for significant advancements in security design and architecture. Most notably, since your goal is to automate patching and configurations for deployment, it’s possible to entirely disable administrative connections to production servers. Errors and misconfigurations are fixed in build and automation scripts, not through manual logins. Configuration, automated injection of certificates, automated patching and even pre-deployment validation are all possible. It’s also possible to completely disable network ports and access points commonly used for administration purposes, which are a common attack vector. Leveraging deployment APIs from PaaS and IaaS cloud services, you have even more automation choices, which we will discuss later in this paper. DevOps offers a huge improvement to basic system security, but you specifically must design – or re-design – your deployments to leverage the advantages that automated CI and CD provide.
- Secure the deployment pipeline: With greater control over both development and production environments, development and test servers become a more attractive target. Traditionally these environments are run with little or no security. But there is a greater need for security of source code management, build servers and the deployment pipeline, given they can feed directly into production with minimal human intervention. You’ll need to employ stricter controls over access to these systems, specifically build servers and code management. And given less human oversight of scripts running continuously in the background, you’ll need to add monitoring so errors and misuse can be detected and corrected.
- Threat model: We maintain that threat modeling is one of the most productive exercises in security. DevOps does not change that. It does however open up opportunities for security team members both to instruct dev team members on common threat types and to help them plan unit tests for these types of attacks.
- Infrastructure and Automation First: You need tools before you can build a house, and you need a road before you drive a car somewhere. With DevOps, and specifically security within DevOps, integrating tools and building the tests are done before you begin developing the next set of features. We stress this point both because it makes planning more important, and because it helps development plan for the tools and tests it needs to deploy before it can deliver new code. The bad news is that there is up-front cost and work to be done; the good news is that each and every build now leverages the infrastructure and tools you’ve built.
- Automated and Validated: Remember, it’s not just development that is writing code and building scripts; operations is now up to their elbows in it as well. This is how DevOps helps take patching and hardening to a new level. IT’s role in DevOps is to provide build scripts that build out the infrastructure needed for development, testing and production servers. The good news is that what works in testing should be identical to production. And automation helps eliminate the problem traditional IT has faced for years: ad-hoc and undocumented work that runs months, or even years, behind on patching. Again, there is a lot of work to get this fully automated: servers, network configuration, applications and so on. Most teams we spoke with build new machine images every week, and update the scripts which apply patches, update configurations and build out different environments. But the work ensures consistency and a secure baseline from which to start.
- Security Tasks: A core tenet of Continuous Integration is to never check in broken or untested code. What constitutes broken or untested is up to you. Keep in mind that what’s happening here is, rather than write giant specification documents for code quality or security – like you used to do for waterfall – you’re documenting policies in functional scripts and programs. Unit tests and functional tests not only define, but enforce, security requirements.
- Security in the Scrum: As we mentioned in the last section, DevOps is process neutral. You can use spiral, or Agile, or surgical-team approaches as you wish. That said, Agile Scrum and Kanban techniques are ideally suited for use with DevOps. Their focus on smaller, quickly demonstrable tasks is in natural alignment. Security tasks are no less important than any other structural or feature improvement. We recommend training at least one person on each team on security basics, and determining which team members have an interest in security topics to build in-house expertise. In this way security tasks can easily be distributed to those members who have the interest and skill to tackle security-related problems.
- Strive for failure: In many ways DevOps turns long-held principles - in both IT and software development - upside down. Durability used to mean ‘uptime’, and now it’s the speed of replacement. Detailed specifications were used to coordinate dev teams; now it’s Post-it notes. Quality assurance focused on getting code to pass functional requirements; now it looks for ways to break an application before someone else can. It’s this last change in approach which really helps raise the bar on security. Stealing a line from James Wickett’s Gauntlt page: “Be Mean To Your Code - And Like It” embodies the ideal. Not only is the goal to build security tests into the automated delivery process, but to greatly raise the bar on what is acceptable to release. We harden an application by intentionally pummeling it with all sorts of functional, stress and security tests before code goes live, reducing the time security experts spend on hands-on testing. If you can figure out some way to break your application, odds are an attacker can too, so build the test – and the remedy – before code goes live.
- Parallelize Security Testing: A problem with all agile development approaches is what to do about tests that take longer than the development cycle. For example, we know that fuzz testing critical pieces of code takes longer than your average sprint within an Agile development model. DevOps is no different in this regard; with CI and CD, as code may be delivered to users within hours of being created, it is simply not possible to perform white-box or dynamic code scans during this window of time. To help address this issue, DevOps teams run multiple security tests in parallel. Validations against known critical issues are written as unit tests to perform a quick spot check, with failures kicking code back to the development team. Code scanners are commonly run in parallel, against periodic releases rather than every release. The results are also sent back to development, and similarly identify the changes that created the vulnerability, but these tests commonly do not gate a release. How to deal with these issues caused headaches in every dev team we spoke with. Focusing scans on specific areas of the code helps find issues faster, and minimizes the disruption from lagging tests, but this remains an area security and development team members struggle with.
- Manual vs. Automated deployment: It’s easy enough to push new code into production. Vetting that code, or rolling back in the event of errors, is always tricky. Most teams we spoke with are not yet at the stage where they are comfortable with fully automated deployments. In fact many still only release new code to their customers every few weeks, often in conjunction with the end of a recent sprint. For these companies most actions are executed through scripts, but the scripts are run manually, when IT and development resources can be on hand to fully monitor the code push. A handful of organizations are comfortable with fully automated pushes to production, and release code several times a day. There is no right answer here, but in either case, automation performs the bulk of the work, freeing others up to test and monitor.
- Deployment and Rollback: To double-check that code which worked in pre-deployment tests still works in the development environment, the teams we spoke with still do ‘smoke’ tests, but they have evolved these tests to incorporate automation and more granular control over the rollouts. In fact, we typically saw three tricks used to augment deployment. The first, and most powerful, of these techniques is called Blue-Green – or Red-Black – deployment. Simply put, old code and new code run side by side, each on its own set of servers. Rollout is done with a simple redirection from load balancers, and in the event errors are discovered, load balancers are redirected back to the old code. The second, canary testing, is where a select number of individual sessions are directed towards the new code; first employee testers, then a subset of real customers. If the canary dies (i.e., any errors are encountered), the new code is retired until the issue can be fixed, and the process is repeated. And finally, there is feature tagging, where new code elements are enabled or disabled through configuration files. In the event errors are discovered in a new section of code, the feature can be toggled off, and the code replaced when fixed. The degree of automation and human intervention varies greatly, but overall, these deployments are far more automated than traditional web services environments.
- Product Security Tests: With the above-mentioned deployment models built into release management scripts, it’s fairly easy to have the ‘canaries’ be dynamic code scanners, pen testers or other security-oriented testers. When coupled with test accounts specifically used for somewhat invasive security tests, the risk of data corruption is lowered, while still allowing security tests to be performed in the production environment.
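The three rollout techniques from the Deployment and Rollback item can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any team’s actual tooling: `LoadBalancer`, `blue_green_release`, `in_canary` and `FeatureFlags` are hypothetical names, the load balancer is a stand-in object rather than a real API, and the canary split is just one plausible way to pick a stable subset of sessions.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class LoadBalancer:
    """Stand-in for a real load balancer; `live` names the pool taking traffic."""
    live: str = "blue"


def blue_green_release(lb: LoadBalancer, new_pool: str,
                       smoke_test: Callable[[str], bool]) -> bool:
    """Redirect traffic to new_pool; redirect back if the smoke test fails."""
    previous = lb.live
    lb.live = new_pool
    if smoke_test(new_pool):
        return True
    lb.live = previous  # rollback is only a redirect to the old servers
    return False


def in_canary(session_id: str, percent: int) -> bool:
    """Deterministically place roughly `percent`% of sessions on the new code."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = digest[0] * 100 // 256  # stable bucket in the range 0..99
    return bucket < percent


@dataclass
class FeatureFlags:
    """Feature toggles, as they might be loaded from a configuration file."""
    flags: Dict[str, bool] = field(default_factory=dict)

    def enabled(self, name: str) -> bool:
        # Unknown features default to off, so new code stays dark until enabled.
        return self.flags.get(name, False)
```

Because `in_canary` hashes the session identifier, a given session lands on the same side of the split on every request. The same routing idea could be used to steer dedicated security test accounts onto the new code, in the spirit of the Product Security Tests item, though how accounts are identified would depend on your environment.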
I’ve probably missed a few in this discussion, so please feel free to contribute any ideas you feel should be discussed.
In the next post, I am again going to shake up the order in this series, and talk about tools and testing in greater detail. Specifically I will construct a security tool chain for addressing different types of threats, and show how these fit within DevOps processes.