For those who skip the intro, the biggest security news this week was the passage of CISA, Oracle’s… interesting… security claims, more discussion on encryption weirdness from the NSA, and security research getting a DMCA exemption. All these stories are linked below.
Yesterday I hopped in the car, drove over to the kid’s school, and participated in the time-honored tradition of the parent-teacher conference.
I’m still new to this entire “kids in school” thing, with one in first grade and another in kindergarten. Before our kids ever started school I assumed the education system would fail to prepare them for their technological future. That’s an acceptance of demographic realities, not any particular criticism. Look at your non-IT friends and ask how many of them really understand technology and its fundamental underpinnings. Why should teachers be any different?
As large a role as technology plays in every aspect of business and society, we still haven’t crossed the threshold to a majority of the population knowing the fundamentals, beyond surface consumption. That is changing, and will continue to change, but it is a multigenerational shift. And even then I don’t think everyone will (or needs to) understand the full depths of technology like many of us do, but there are entire categories of fundamentals which society will eventually fully integrate – just as we do now with reading, writing, and basic science.
Back to the parent-teacher conference.
During the meeting one teacher handed us a paper with ‘recommended’ iPad apps, because they now assume most students have access to an iPad or iPhone. When she handed it over she said, “Here’s what our teachers recommend instead of Minecraft.”
This was a full-stop moment for me. Minecraft is one of the best screen-based tools for teaching kids logical thinking and creativity. And yet the school system is actively discouraging Minecraft. Which is a particularly mixed message, because I think Minecraft is integrated into other STEM activities (they are in a STEM school), but I need to check. The apps on the list aren’t terrible. Some are quite good. The vast majority are reading and math focused, but there are also a few science and social studies/atlas style apps and games, and everything is grade-appropriate. There are even some creativity apps, like video makers.
On the upside, I think providing a list like this is an exceptionally good idea. Not every parent spends all day reading and writing about technology. On the other hand, nearly all the apps are, well, traditional. There’s only one coding app on the list. Most of the apps are consumption focused, rather than creation focused.
I’m not worried about my kids. They have been immersed in technology since before birth, with an emphasis on building and creating (and sure, they also consume a ton). They also have two parents who work(ed) in IT, and a ridiculously geeky dad who builds Halloween decorations with microcontrollers. As for everyone else? Teachers will catch up. Parents will catch up. Probably not for most of my kids’ peers, but certainly by the time they have children themselves. It takes time for such massive change, and it’s already better than what I saw my 20-year-old niece experience when she went through the same school district.
I still can’t help but think of some major missed opportunities. For example, I was… volunteered… to help teach Junior Achievement in the school. It’s a well-structured program to introduce kids to the underpinnings of a capitalist society. From participating in HacKid, it looks like there is huge potential to develop a similar program for technology. Some schools, especially in places like Silicon Valley, already have active parents bringing real-world experience into classrooms. It sure would be nice to have something like this on a national scale – beyond ‘events’ like the annual Hour of Code week.
And while we’re at it, we should probably have a program so kids can teach their parents online safety. Because I’m pretty sure most kids intuitively understand it better than most parents I meet.
On to the Summary:
Webcasts, Podcasts, Outside Writing, and Conferences
Other Securosis Posts
Favorite Outside Posts
Research Reports and Presentations
Top News and Posts
Posted at Friday 30th October 2015 2:38 am
I have talked a lot about this, but I don’t think I’ve ever posted it here on the blog.
I am consistently amused by people who fear moving to the cloud (and by people who take random potshots at the cloud) because they are worried about a lack of security.
The reality is that cloud providers have massive financial incentives to be more secure than you: to provide a rock-solid foundation to build on – and as always, you are free to screw up whatever you want from there. Why? Because if they have a major security failure, it will lose them business, and could become an existential event (an asteroid-vs.-dinosaur kind of event).
Look at it this way:
- In your own organization, who bears the cost of a security breach? It is almost never the business unit responsible for the breach, but instead almost always paid for out of some central budget. So other priorities nearly always take precedence over security, forcing security teams to block and tackle as best they can. Even the organization itself (depending a bit on the nature of the business) almost never places IT security above priorities such as responding to competitors, meeting product cycle requirements, etc.
- At a public cloud provider, security is typically one of the top 3 obstacles for obtaining customers and growing the business. If they can’t prove security, they cannot win customers. If they can’t maintain security, they most certainly can’t keep customers. Providers have a strong and direct financial motivation to place security at the top of their priorities.
I am not naive enough to think this plays out evenly across the cloud market. I see the most direct correlation with IaaS, largely because those providers are fighting primarily for the enterprise market, where security and compliance are deeper requirements. PaaS is the same way at major IaaS vendors (which is incredibly common), and then prioritization drops off based on:
- Is it a developer-centric tool, or a larger platform?
- Does it target smaller or larger shops?
SaaS is pretty much the Wild West. Major vendors who push hard for enterprise business are typically stronger, but I see plenty of smaller, under-resourced SaaS providers where the economics haven’t caught up yet. For example, Dropbox had a string of public failures, but eventually prioritized security in response – and then grew, targeting the business market. Box and Microsoft targeted business from the start, and largely avoided Dropbox’s missteps, because their customers and economics required them to be hardened up front.
Once you understand these economics, they can help you evaluate providers. Are they big and aimed at enterprises? Do they have a laundry list of certifications and audit/assessment results? Or are they selling more of a point tool, less mature, still trying to grab market share, and targeting developers or smaller organizations? You cannot quantify this beyond a list of certifications, but it can most certainly feed your Spidey sense.
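Those evaluation questions lend themselves to a quick mental checklist. Here is a minimal sketch of that “Spidey sense” as code – the field names and thresholds are entirely hypothetical, and the point is surfacing warning signals from the criteria above, not a real scoring methodology:

```python
# Hypothetical sketch: turn the provider questions above into warning signals.
# Field names and the certification threshold are illustrative assumptions.

def provider_risk_signals(provider):
    """Return a list of warning signals based on the evaluation questions."""
    signals = []
    if not provider.get("enterprise_focus", False):
        signals.append("targets developers or smaller shops, not enterprises")
    if len(provider.get("certifications", [])) < 2:
        signals.append("thin list of certifications and audit/assessment results")
    if provider.get("point_tool", False):
        signals.append("point tool, less mature, still grabbing market share")
    return signals

# A large enterprise-focused IaaS provider vs. a young point-tool SaaS shop.
big_iaas = {
    "enterprise_focus": True,
    "certifications": ["SOC 2", "ISO 27001", "PCI DSS"],
    "point_tool": False,
}
small_saas = {
    "enterprise_focus": False,
    "certifications": [],
    "point_tool": True,
}

print(provider_risk_signals(big_iaas))   # no warning signals
print(provider_risk_signals(small_saas))  # all three signals
```

As the post says, this is nothing you can truly quantify – the output is a gut-check, not a verdict.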
Posted at Wednesday 28th October 2015 6:05 pm
In my recent paper on cloud network security I came down pretty hard on hybrid networks. I have been saying similar things in many presentations, including my most recent RSA session. Enough that I got a request for clarification. Here is some additional detail I will add to the paper; feedback or criticism is appreciated.
Hybrid deployments often play an essential, yet complex, role in an organization’s transition to cloud computing. On the one hand they allow an organization to extend its existing resources directly into the cloud, with fully compatible network addressing and routing. They allow the cloud to access internal assets directly, and internal assets to access cloud assets, without having to reconfigure everything from scratch.
But that also means hybrid deployments bridge risks across environments. Internal problems can extend to the cloud provider, and compromise of something on the cloud side extends to the data center. It’s a situation ripe for error, especially in organizations which already struggle with network compartmentalization. Also, you are bridging two completely different environments – one software defined, the other still managed with boxes and wires.
That’s why we recommend trying to avoid hybrid deployments, to retain the single greatest security advantage of cloud computing: compartmentalization. Modern cloud deployments typically use multiple cloud provider accounts for a single project. If anything goes wrong you can blow away the entire account, and start over. Control failures in any account are isolated to that account, and attacks at the network and management levels are also isolated. Those are typically impossible to replicate with hybrid.
All that said, nearly every large enterprise we work with still needs some hybrid deployments. There are too many existing internal resources and requirements to drop ship them all to a cloud provider. Applications, assets, and services designed for traditional infrastructure would all need to be completely re-architected to operate correctly, with acceptable resilience, in the cloud.
Yes, someday hybrid clouds will be rare. And for any new project we highly recommend designing to work in an isolated, dedicated set of cloud accounts. But until we all finish this massive 20-year project of moving nearly everything into the public cloud, hybrid is a practical reality.
Thinking about the associated risks – bridging networks and reducing compartmentalization – focuses your security requirements. You need to understand those connections, and the network security controls across them. They are two different systems using a common vocabulary, with important implementation differences. Management planes for non-network functions won’t integrate (traditional environments don’t have one). Host, application, and data security are specific to the assets involved and where they are hosted; risks extend whenever they are connected, regardless of deployment model. A hybrid cloud doesn’t change SQL injection detection or file integrity monitoring – you implement them as needed in each environment.
The definition of hybrid is connection and extension via networking; the focus is understanding those connections, how the security rules are set up on each side, and how to make the security of two totally different environments work together.
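As a sketch of what “a common vocabulary with important implementation differences” means in practice, here is a hypothetical normalization of one rule from each side of a hybrid connection. Both rule formats are invented simplifications – real security groups and firewall configs carry far more detail – but the exercise of mapping both sides into one representation is exactly the kind of connection auditing hybrid demands:

```python
# Hypothetical sketch: audit "the same" rule across two very different systems.
# Both input formats below are simplified assumptions, not real vendor syntax.

def normalize_cloud_rule(sg_rule):
    """Normalize a (simplified) cloud security group rule: stateful, allow-only."""
    return (sg_rule["source"], sg_rule["port"], "allow")

def normalize_onprem_rule(fw_line):
    """Normalize a (simplified) traditional firewall line like 'permit tcp <cidr> port <n>'."""
    action, _proto, source, _kw, port = fw_line.split()
    return (source, int(port), "allow" if action == "permit" else "deny")

# The bridged connection is only as coherent as the match between both sides.
cloud = normalize_cloud_rule({"source": "10.0.0.0/8", "port": 443})
onprem = normalize_onprem_rule("permit tcp 10.0.0.0/8 port 443")

print(cloud == onprem)  # True when both sides agree on the same traffic
```

In a real hybrid deployment the comparison spans far more than source and port, but the principle holds: you cannot ensure two environments enforce the same policy until you can express both in one form.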
Posted at Monday 26th October 2015 9:39 pm
About two years ago I was up in Toronto having dinner with James Arlen and Dave Lewis (@myrcurial and @gattaca). Since Dave was serving on the (ISC)2 Board of Directors, and James and I were not CISSPs, the conversation inevitably landed on our feelings as to the relative value of the organization and the certifications.
I have been mildly critical of the CISSP for years. Not rampant hatred, but more an opinion that the cert didn’t achieve its stated goals. It had become less an educational tool, and more something to satisfy HR departments. Not that there is anything inherently wrong with looking for certifications. As an EMT, and a former paramedic, I’ve held a dozen or more medical, firefighting, and rescue certifications in my career, some of them legally required for the job.
(No, I don’t think we can or should do the same for security, but that’s fodder for another day).
While I hadn’t taken the CISSP test, I did once, over a decade earlier, take a week-long class and look into becoming certified. I was at Gartner at the time, and the security team only had one CISSP. So I was familiar with the CBK, which quickly disillusioned me. It barely seemed to reflect the skill base that current, operational security professionals needed. It wasn’t all bad, it just wasn’t on target.
Then I looked at the ethics requirements, which asked if you ever “associated with hackers”. Now I know they meant “criminals”, but that isn’t what was on paper, and to me that is the kind of mistake that reflects a lack of understanding of the power of words – or even the meaning of that one – especially from an organization that represents the very profession most directly tied to the hacker community. Out-of-touch content and a poorly written code of ethics weren’t something I felt I needed to support, and thanks to where I was in my career I didn’t need to.
To be honest, James and I teamed up a bit on Dave that night, asking him why he would devote so much time to an organization he, as a hacker, technically couldn’t even be a part of. That’s right about the time he told us to put up or shut up.
You see, Dave helped get the code of ethics updated and had that provision removed. And he, and other board members, had launched a major initiative to update the exam and the CBK. He challenged us to take the test, THEN tell him what we thought. (He had us issued tokens, so we didn’t pay for the exam.) He saw the (ISC)2 not merely as a certification entity, but as a professional organization with the membership and position to actually advance the state of the profession, given the right leadership (and the support of the members).
James and I each later took the exam (nearly a year later in my case), and we approached it differently – he studied, I went in cold. Then we sent feedback on our experiences to Dave to pass on to the organization. We wanted to see if the content was representative of what security pros really need to know to get their jobs done. While I can’t discuss the content, it was better than I expected, but still not where I thought it needed to be. (This was one version back from the current exam.)
Over that time additional friends and people I respect joined the Board, and continued to steer the (ISC)2 in interesting directions.
I never planned on actually getting my CISSP. It really isn’t something I needed at this point in my career. But the (ISC)2 and the Cloud Security Alliance had recently teamed up on a new certification that was directly tied to the CCSK we (Securosis) manage for the CSA, and I was gently pressured to become more involved in the relationship and course content. Plus, my friends in the (ISC)2 made a really important, personally impactful point.
As a profession we face the greatest social, political, and operational challenges since our inception. Every day we are in the headlines, called before lawmakers, and fighting bad guys and, at times, our own internal political battles. But our only representation, speaking in our name, comes from lone individuals and profit-oriented companies. The (ISC)2 is potentially positioned to play a very different role. It’s not for profit, and run by directors chosen in open elections. The people I knew who were active in the organization saw, and still see, the chance for it to continue to evolve into something more than a certification shop.
I submitted my paperwork. Then, the same day I was issued my certification, I found out I was nominated for the Board. Sorta didn’t really expect that.
Accepting wasn’t a simple decision. I already travel a lot, and had to talk it over with my wife and coworkers (both of whom advised me not to do it, due to the time commitment). But something kept nagging at me.
We really do need a voice: an organization with the clout and backing to represent the profession. Now, I fundamentally don’t believe any third party can ever represent all the opinions of any constituency, and I sure as hell have no right to assume I speak for everyone with ‘security’ in their title. But without some mutual agreement, all that will happen is that those with essentially no understanding of what we do will make many of the decisions that shape our future.
That’s why I’m running for the Board of the (ISC)2.
Because to play that role, the organization needs to continue to change. It needs to become more inclusive, with a wider range of certification and membership options, which better reflect operational security needs. It should also reach out more to a wider range of the community, particularly researchers, offensive security professionals, and newer, less experienced security pros. It needs to actually offer them something; something more than a piece of paper that will help their resume get through an HR department.
We also need to update the code of ethics and drop the provision to “protect the profession”, since that can easily be seen as defensiveness in a profession that, I feel, should be publicly self-critical. And the code of ethics should account for the inherent conflicts when you discover serious security issues that pit public trust against the desires of your employers or principals. And no, there is no easy answer.
And in terms of certification, I’d like to see more inclusion of hands-on requirements and opportunities. All my paramedic and firefighter training included both didactic and practical requirements.
For those of you who are eligible, I’m not going to ask for your vote. You need to decide for yourself whether everything I just shared represents your views. For those of you wondering why the heck I got a CISSP and decided to run for the Board, now you know. I can blog and speak and write as many papers as I want, but none of those will actually advance the profession in any meaningful way. Maybe joining the (ISC)2 won’t either, but I won’t know until I try.
Posted at Friday 23rd October 2015 4:52 pm
Every week, we here at Securosis like to highlight the security industry’s most important news in our Friday Summary. Those events that not only made the press, but are likely to significantly impact your professional lives and, potentially, the well-being of the organization you work for.
Ah, who am I kidding, let’s talk Star Wars.
If you didn’t know a new trailer for The Force Awakens was released this week, you can’t be reading this statement, because you are either deceased (like a parrot) or currently imprisoned in an underground bunker by a religious fanatic who is feeding you nutritional supplements so he/she can harvest your organs and live for eternity. I can’t imagine any other legitimate options.
Stick with me for a minute – I really do have a point or two.
Like many of you, Star Wars played an incredibly influential role in my life. The first film hit when I was six, and it helped form the person I would eventually become. I know, cheesy and maybe weird or nerdy, but as children we all grab onto stories and metaphor to develop our own worldview. For some of you it was religion (that is pretty much the purpose of the Bible), or a book series, or a blend of influences. For me, Star Wars always stood far above and beyond anything else outside the direct guidance of my parents.
Martial arts, public service, a love of aviation and space, and a fundamental recognition of the importance of helping and protecting others all trace back, to some degree, to the film series. Perhaps I would have grabbed onto these principles anyway, but at this point that experiment’s control group vaporized decades ago.
I have, perhaps, an overconfidence in the new film. I’ve already bought tickets for opening night and the following day, and could only stop tearing up at the trailer through intense immersion therapy. Unlimited bandwidth FTW.
There was a fascinating article in The New Yorker this week. The author admitted a love for the original trilogy, but claimed that now that we are adults, there is no chance a new entry can create the same wonder the originals did for thousands (millions?) of children in theaters. That the new films must, of necessity, be for children, because adults are no longer capable of generating such emotions.
You know, pretty much what you would expect The New Yorker to publish.
The day I no longer believe a story can make me feel wonder is the day I ask Reverend Billy to finally remove my dead heart and implant it in that goat that makes our cheese (in the bunker, keep up people). Maybe the new film won’t hit that lofty goal (although the trailer sure did), but you can’t close your mind to the possibility. Okay, maybe Star Wars isn’t your thing, but if you no longer believe stories even have the potential to engender childlike joy, that’s a loss of hope with profound personal implications.
I’m also fascinated to see how Star Wars changes for my children. Already the expanded universe is creating a different relationship with the canon. Growing up I only had Artoo and Threepio, but they now have Chopper (from Rebels, a really great show) and BB-8. My two-year-old is already obsessed with BB-8 and insists my Sphero toy sit next to him when he watches TV. When the battery runs out he likes to tell me “BB-8 sad”.
They will never experience things the way I did. Maybe they’ll love it, maybe they won’t, that’s up for them to decide (after my meddling influence). But there is one aspect of the new films that, as a parent, endlessly excites me.
The prequels weren’t merely bad films; they did nearly nothing to advance the story. They gave us the visuals of the history of Vader, and a few poorly retconned story beats, but they didn’t tell us anything material we didn’t already know. There was no anticipation between the films, not like when Empire came out and my friends and I spent three years debating whether Darth was Luke’s father, or it was merely another Sith lie.
In two months we get to see an entirely new Star Wars that continues the story that started nearly 40 years ago. And, though I’m really just guessing here, I’m pretty sure Episode VII is going to end in a cliffhanger that won’t be resolved for another two to three years, if not the full six years to finish this next trilogy.
My children will get a new story that will play out over a third of their childhood. Not some movies based on some existing books, however well written and popular. Not a television series they see every week or can marathon on Netflix. Three films. Six years. So popular (just a guess) that they extend Star Wars’ already deep influence in our global consciousness. The ending unknown until my entire family, the youngest now eight or nine (not two), the oldest bordering on a teenager, sits together in the theater as the lights dim, the curtain peels back, and the familiar fanfare blasts from the speakers.
No, maybe I won’t ever feel the same as that day in 1977 when I sat next to my father and that first Star Destroyer loomed above our heads. I’m older, capable of far more emotional depth, with an ever greater need to escape the responsibilities of adulthood and the painful irrationality of the real world. And I’ll know that my children sitting next to me are building their own memories, and experiencing their own wonder.
It’s going to be so much better.
On to the Summary:
Webcasts, Podcasts, Outside Writing, and Conferences
Favorite Outside Posts
Research Reports and Presentations
Top News and Posts
Posted at Friday 23rd October 2015 5:55 am
By Mike Rothman
It has been a while since I’ve mentioned my gang of kids. XX1, XX2 and the Boy are alive and well, despite the best efforts of their Dad. All of them started new schools this year, with XX1 starting high school (holy crap!) and the twins starting middle school. So there has been a lot of adjustment. They are growing up and it’s great to see. It’s also fun because I can start to pollute them with the stuff that I find entertaining.
Like classic comedies. I’ve always been a big fan of Monty Python, but that wasn’t really something I could show an 8-year-old. Not without getting a visit from Social Services. I knew they were ready when I pulled up a YouTube of the classic Mr. Creosote sketch from The Meaning of Life, and they were howling. Even better was when we went to the FroYo (which evidently is the abbreviation for frozen yogurt) place and they reminded me it was only a wafer-thin mint.
I decided to press my luck, so one Saturday night we watched Monty Python and the Holy Grail. They liked it, especially the skit with the Black Knight (It’s merely a flesh wound!). And the ending really threw them for a loop. Which made me laugh. A lot. Inspired by that, I bought the Mel Brooks box set, and the kids and I watched History of the World, Part 1, and laughed. A lot. Starting with the gorilla scene, we were howling through the entire movie. Now at random times I’ll be told that “it’s good to be the king!” – and it is.
My other parenting win was when XX1 had to do a project at school to come up with a family shield. She was surprised that the Rothman clan didn’t already have one. I guess I missed that project in high school. She decided that our family animal would be the Honey Badger. Mostly because the honey badger doesn’t give a s**t. Yes, I do love that girl. Even better, she sent me a Dubsmash, which is evidently a thing, of her talking over the famous Honey Badger clip on YouTube. I was cracking up.
I have been doing that a lot lately. Laughing, that is. And it’s great. Sometimes I get a little too intense (yes, really!) and it’s nice to have some foils in the house now, who can help me see the humor in things. Even better, they understand my sarcasm and routinely give it right back to me. So I am training the next generation to function in the world, by not taking themselves so seriously, and that may be the biggest win of all.
Photo credit: “Horse Laugh” originally uploaded by Bill Gracey
Thanks to everyone who contributed to my Team in Training run to battle blood cancers. We’ve raised almost $6,000 so far, which is incredible. I am overwhelmed with gratitude. You can read my story in a recent Incite, and then hopefully contribute (tax-deductible) whatever you can afford. Thank you.
The fine folks at the RSA Conference posted the talk Jennifer Minella and I did on mindfulness at the 2014 conference. You can check it out on YouTube. Take an hour. Your emails, alerts, and Twitter timeline will be there when you get back.
Have you checked out our new video podcast? Rich, Adrian, and Mike get into a Google Hangout and… hang out. We talk a bit about security as well. We try to keep these to 15 minutes or less, and usually fail.
We are back at work on a variety of blog series, so here is a list of the research currently underway. Remember you can get our Heavy Feed via RSS, with our content in all its unabridged glory. And you can get all our research papers too.
Building Security into DevOps
Building a Threat Intelligence Program
Network Security Gateway Evolution
Recently Published Papers
Incite 4 U
The cloud poster child: As discussed in this week’s FireStarter, the cloud is happening faster than we expected. And that means security folks need to think about things differently. As if you needed more confirmation, check out this VentureBeat profile of Netflix and their movement towards shutting down their data centers to go all Amazon Web Services. The author of the article calls this the future of enterprise tech and we agree. Does that mean existing compute, networking, and storage vendors go away? Not overnight, but in 10-15 years infrastructure will look radically different. Radically. But in the meantime, things are happening fast, and folks like Netflix are leading the way. – MR
Future – in the past tense: TechCrunch recently posted The Future of Coding Is Here, outlining how the arrival of APIs (Application Programming Interfaces) has ushered in a new era of application development. The fact is that RESTful APIs have pretty much been the lingua franca of software development since 2013, with thousands of APIs available for common services. By the end of 2013 every major API gateway vendor had been acquired by a big IT company. That was because APIs are an enabling technology, speeding integration and deployment, and making it easy to leverage everything from mobile to the Internet of Things. And don’t even bother trying to use cloud services without leveraging vendor APIs. But the OWASP Top Ten will not change any time soon, as traditional web-facing apps and browsers still provide too many attractive targets for attackers to forsake them. – AL
Cheaters gonna cheat: Crowdstrike published some interesting research recently, discussing how they detected Chinese attackers hacking US commercial entities, even after the landmark September 25 agreement not to. Now, of course, there could have been a lag between when the agreement was signed and when new marching orders made it to the front lines. Especially when you send the message by Pony Express. Turns out there are things like email, phones, and maybe even these newfangled things called “web sites” to make sure everyone knows about changes in policy. But did you really expect a political agreement to change anything? Me neither. So just like cheaters are gonna cheat, nation states are gonna hack. – MR
Stealing from spies: Hackers have figured out how to uncloak advertising links embedded in iFrames by exploiting the relationship between two frames. For those of us who think iFrames are an attack vector themselves, it’s no surprise that this dodgy means of tracking users and supporting ad networks was cracked by bad (worse?) guys. The good news is that it does not expose any additional user information, but it does allow attackers to manipulate ad clicks. Most tricks, hacks, and sneaky methods of scraping data or forcing user browsers to take action were pioneered by some marketing firm to game the system. The problem is that dodgy habits are endemic to how many very large companies make money, so we get hacked solutions to compensate for the hacks these firms leverage to satisfy their own profit motive. Until the economics change, hackers will have plenty of ‘features’ from ad, social, and analytics networks to exploit for profit. – AL
A cyberinsurance buffet: Warren Buffett has done pretty well by sticking to things he can understand. OK, maybe that’s the understatement of the millennium. So his Specialty Insurance business getting into underwriting cyber policies seems to run counter to that philosophy. He wouldn’t even invest in tech companies, but now he’s willing to value something that you pretty much can’t value (cyber-exposure). Of course it’s not Warren himself writing the policies, but all the same – and maybe it’s just me – it is not clear how to write these policies when even the best defenses can be breached at any time by sophisticated attackers. I’m happy to hear explanations, because I still don’t get this. – MR
Posted at Wednesday 21st October 2015 9:21 am
A bit over a week ago we were all out at Amazon’s big cloud conference, which is now up to 19,000 attendees. Once again it got us thinking as to how quickly the world is changing, and the impact it will have on our profession. Now that big companies are rapidly adopting public cloud (and they are), that change is going to hit even faster than ever before. In this episode the Securosis team lays out some of what that means, and how now is the time to get on board.
Watch or listen:
Posted at Tuesday 20th October 2015 12:29 pm
Last week Mike, Adrian, and I were out at the Amazon re:Invent conference. It’s the third year I’ve attended, and it has become one of the core events of the year for me – even more important than most of the security events. To put things in perspective, there were over 19,000 attendees, and this is only the fourth year of the conference.
While there I tweeted that all security professionals need to get their asses to some non-security conferences. Specifically, to cloud or DevOps events. It doesn’t need to be Amazon’s show, but certainly needs to be one from either a major public cloud provider (and really, only Microsoft and Google are on that list right now), or something like the DevOps Enterprise Summit next week (which I have to miss).
I always thought cloud and automation in general, and public cloud and DevOps (once I learned the name) in particular, would become the dominant operational model and framework for IT. What I absolutely underestimated is how friggen fast the change would happen. We are, flat out, three years ahead of my expectations, in terms of adoption.
Nearly all my hallway conversations at re:Invent this year were with large enterprises, not the startups and mid-market of the first year. And we had plenty of time for those conversations, since Amazon needs to seriously improve their con traffic management.
With cloud, our infrastructure is now software defined. With DevOps (defined as a collection of things beyond the scope of this post), our operations also become software defined (since automation is essential to operating in the cloud). Which means, well, you know what this means…
We live in a developer’s world.
This shouldn’t be any sort of big surprise. IT always runs through phases where one particular group is relatively “dominant” in defining our enterprise use of technology. From mainframe admins, to network admins, to database admins, we’ve circled around based on which pieces of our guts became most-essential to running the business.
I’m on record as saying cloud computing is far more disruptive than our adoption of the Internet. The biggest impact on security and operations is this transition to software defined everything. Yes, somewhere someone still needs to wire the boxes together, but it won’t be most of the technology workforce.
Which means we need to internalize this change, and start understanding the world of those we will rely on to enable our operations. If you aren’t a programmer, you need to get to know them, especially since the tools we typically rely on are moving much more slowly than the platforms we run everything on. One of the best ways to do this is to start going to some outside (of security) events.
And I’m dead serious that you shouldn’t merely go to a cloud or DevOps track at a security conference, but immerse yourself at a dedicated cloud or DevOps show. It’s important to understand the culture and priorities, not merely the technology or our profession’s interpretation of it. Consider it an intelligence gathering exercise to learn where the rest of your organization is headed.
I’m sure there’s an appropriate Sun Tzu quote out there, but if I used it I’d have to nuke this entire site and move to a security commune in the South Bay. Or Austin. I hear Austin’s security scene is pretty hot.
Oh, and being Friday, I suppose I should insert the Friday Summary below and save myself a post.
On to the Summary:
Webcasts, Podcasts, Outside Writing, and Conferences
A bunch of stuff this week, but the first item, Mike’s keynote, is really the one to take a look at.
Recent Securosis Posts
Favorite Outside Posts
Research Reports and Presentations
Top News and Posts
Posted at Friday 16th October 2015 5:56 am
By Adrian Lane
In today’s post I am going to talk about the role of security folks in DevOps. A while back we published a research paper on Putting Security Into Agile Development; the feedback we received was that the most helpful part of the report was its guidance on how security people can best work with development. Positioning security to help development teams be more Agile worked well, so in this portion of our DevOps research we will strive to provide similar examples of security’s role in DevOps.
There is another important aspect that frames today’s discussion: there really is no such thing as SecDevOps. The beauty of DevOps is that security becomes part of the operational process of integrating and delivering code. We don’t call security out as a separate thing because it is not actually separate, but (can be) intrinsic to the DevOps framework. We want security professionals to keep this in mind when considering how they fit within this new development framework. You will need to play one or more roles in the DevOps model of software delivery, and look at how you can improve the delivery of secure code without introducing waste or bottlenecks. The good news is that security fits within this framework nicely, but you’ll need to tailor which security tests and tools fit within the overall model your firm employs.
The CISO’s responsibilities
- Learn the DevOps process: If you’re going to work in a DevOps environment, you need to understand what it is and how it works. You need to understand what build servers do, how test environments are built up, the concept of fully automated work environments, and what gates each step in the process. Find someone on the team and have them walk you through the process and introduce the tools. Once you understand the process, the security integration points become clear. And once you understand the mechanics of the development team, the best ways to introduce different types of security testing also become evident.
- Learn how to be agile: Your participation in a DevOps team means you need to fit into DevOps, not the other way around. The goal of DevOps is fast, faster, fastest: small iterative changes that offer quick feedback. You need to adjust requirements and recommendations so they can be part of the process, and be as hands-off and automated as possible. If you’re going to recommend manual code reviews or fuzz testing, that’s fine, but you need to understand where those tests fit within the process, and what can - or cannot - gate a release.
How CISOs Support DevOps
Training and Awareness
- Educate: Our experience shows that one of the best ways to bring a development team up to speed on security is training: in-house explanations or demonstrations, third-party experts brought in to help threat model an application, eLearning, or courses offered by various commercial firms. The downside has historically been cost, with many classes costing thousands of dollars. You’ll need to evaluate how best to use your resources, which usually means some eLearning available to all employees, and having select people attend a class and then teach their peers. On-site experts can also be expensive, but an entire group can participate in the training.
- Grow your own support: Security teams are typically small, and often lack budget. What’s more, security people are not present in many development meetings, so they lack visibility into day-to-day DevOps activities. To extend the security team’s reach, see if you can get someone on each development team to act as an advocate for security. This not only extends the reach of the security team, but also helps grow awareness throughout the development process.
- Help them understand threats: Most developers don’t fully grasp how attackers approach compromising a system, or what it means when a SQL injection attack is possible. The depth and breadth of security threats is outside their experience, and most firms do not teach threat modeling. The OWASP Top Ten is a good guide to the types of code deficiencies that plague development teams, but map these threats back to real-world examples: show the extent of the damage a SQL injection attack can cause, or how a Heartbleed-type vulnerability can completely expose customer credentials. Real-world use cases go a long way toward helping developers and IT understand why protection against certain threats is critical to application functions.
- Have a plan: The entirety of your security program should not be ‘encrypt data’ or ‘install WAF’. All too often developers and IT have a single idea of what constitutes security, centered on one tool they want to set and forget. Help build out the elements of the security program, including both in-code enhancements and supporting tools, and show how each effort addresses specific threats.
- Help evaluate security tools: It’s common for people outside security not to understand what security tools do, or how they work. Misconceptions are rampant, and not just because security vendors over-promise capabilities: it’s uncommon for developers to evaluate code scanners, activity monitors, or even patch management systems. In your role as advisor, it’s up to you to help DevOps understand what the tools can provide, and what fits within your testing framework. Sure, you may not be able to evaluate the quality of the API, but you can certainly tell when a product does not actually deliver meaningful results.
- Help with priorities: Not every vulnerability is a risk. Worse, security folks have a long history of sounding like the terrorism threat scale, issuing vague warnings about ‘severe risk’ or ‘high threat levels’. None of these warnings are valuable without mapping the threat to possible exploitations, and to what you can do to address – or reduce – the risk. For example, an application may have a critical vulnerability, but you have options: fix it in the code, patch supporting systems, disable the feature if it’s not critical, block with IDS or firewalls, or even filter with WAF or RASP technologies. Rather than the knee-jerk ‘OMG! Fix it NOW’ reaction we’ve historically seen, there are commonly several ways to address a vulnerability, so present the tradeoffs to the DevOps team and allow them to select what fits within their systems.
- Help test: DevOps has placed some operations and release management personnel in the uncomfortable position of having to learn to script, code, and have their work open for public review. It places people outside their comfort zone in the short term, but it’s part of building a cohesive team over the mid-term. It’s perfectly acceptable for security folks to contribute tests to the team as well: scans that validate certificates, known SQL injection attacks, open source tools for locating vulnerabilities, and so on. If you’re worried about fitting in, help out, and integrate unit and regression tests. Integrate and ingratiate! You may need to learn a bit of scripting before your tests can be integrated into the build and deployment servers, but you’ll do more than preach security: you’ll contribute!
In this way you can also act as the liaison between compliance and DevOps team, shielding them from some of the politics and bureaucracy.
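As one concrete example of a test a security team member might contribute, here is a minimal Python sketch that fails when a certificate is close to expiry. The cert dict shape mirrors what Python’s `ssl.SSLSocket.getpeercert()` returns, but the helper names, threshold, and sample date are invented for illustration:

```python
import ssl
import time

def days_until_expiry(cert, now=None):
    """Days remaining before the certificate's notAfter date."""
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    now = time.time() if now is None else now
    return (expires - now) / 86400

def assert_cert_not_expiring(cert, min_days=30):
    """Raise (failing the build) when a cert is within min_days of expiry."""
    remaining = days_until_expiry(cert)
    if remaining < min_days:
        raise AssertionError(
            f"certificate expires in {remaining:.1f} days (minimum {min_days})"
        )

# A cert expiring far in the future passes the check.
assert_cert_not_expiring({"notAfter": "Jan  5 09:34:43 2031 GMT"})
print("certificate check passed")
```

Drop a check like this into the test suite the build server already runs, and certificate expiry becomes just another failing test rather than a 2am outage.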
Posted at Monday 12th October 2015 11:54 am
By Mike Rothman
As we dive back into the Threat Intelligence Program, we have summarized why a TI program is important and how to gather intelligence. Now we need a programmatic approach for using TI to improve your security posture and accelerate your response and investigation functions.
To reiterate (because it has been a few weeks since the last post), TI allows you to benefit from the misfortune of others, meaning it’s likely that other organizations will get hit with attacks before you, so you should learn from their experience. Like the old quote, “Wise men learn from their mistakes, but wiser men learn from the mistakes of others.” But knowing what’s happened to others isn’t enough. You must be able to use TI in your security program to gain any benefit.
First things first. We have plenty of security data available today. So the first step in your program is to gather the appropriate security data to address your use case. That means taking a strategic view of your data collection process, both internally (collecting your data) and externally (aggregating threat intelligence). As described in our last post, you need to define your requirements (use cases, adversaries, alerting or blocking, integrating with monitors/controls, automation, etc.), select the best sources, and then budget for access to the data.
This post will focus on using threat intelligence. First we will discuss how to aggregate TI, then on using it to solve key use cases, and finally on tuning your ongoing TI gathering process to get maximum value from the TI you collect.
When aggregating threat intelligence the first decision is where to put the data. You need it somewhere it can be integrated with your key controls and monitors, and provide some level of security and reliability. Even better if you can gather metrics regarding which data sources are the most useful, so you can optimize your spending. Start by asking some key questions:
- To platform or not to platform? Do you need a standalone platform or can you leverage an existing tool like a SIEM? Of course it depends on your use cases, and the amount of manipulation & analysis you need to perform on your TI to make it useful.
- Should you use your provider’s portal? Each TI provider offers a portal you can use to get alerts, manipulate data, etc. Will it be good enough to solve your problems? Do you have an issue with some of your data residing in a TI vendor’s cloud? Or do you need the data to be pumped into your own systems, and how will that happen?
- How will you integrate the data into your systems? If you do need to leverage your own systems, how will the TI get there? Are you depending on a standard format like STIX/TAXII? Do you expect out-of-the-box integrations?
Obviously these questions are pretty high-level, and you’ll probably need a couple dozen follow-ups to fully understand the situation.
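To make the integration question concrete, here is a minimal Python sketch of normalizing indicators from two hypothetical feeds into one common structure before pushing them to your controls. The field names, addresses, and confidence scales are all invented for illustration; a real deployment would be parsing STIX objects or a vendor-specific schema:

```python
import json

# Two hypothetical feeds with different schemas and confidence scales.
feed_a = json.loads('[{"ip": "203.0.113.7", "confidence": 85}]')
feed_b = json.loads('[{"indicator": "203.0.113.7", "type": "ipv4", "score": 0.6}]')

def normalize(feed_a_entries, feed_b_entries):
    """Merge both feeds into {indicator: confidence} on a 0-100 scale,
    keeping the highest confidence seen for each indicator."""
    merged = {}
    for e in feed_a_entries:
        merged[e["ip"]] = max(merged.get(e["ip"], 0), e["confidence"])
    for e in feed_b_entries:
        score = round(e["score"] * 100)  # rescale 0.0-1.0 to 0-100
        merged[e["indicator"]] = max(merged.get(e["indicator"], 0), score)
    return merged

indicators = normalize(feed_a, feed_b)
print(indicators)  # {'203.0.113.7': 85}
```

Whether this normalization happens in a dedicated TI platform, your SIEM, or a homegrown script is exactly the platform decision described above.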
Selecting the Platform
In a nutshell, if you have a dedicated team to evaluate and leverage TI, have multiple monitoring and/or enforcement points, or want more flexibility in how broadly you use TI, you should probably consider a separate intelligence platform or ‘clearinghouse’ to manage TI feeds. Assuming that’s the case, here are a few key criteria to consider when selecting a stand-alone threat intelligence platform:
- Open: The TI platform’s task is to aggregate information, so it must be easy to get information into it. Intelligence feeds are typically just data (often XML), and increasingly distributed in industry-standard formats such as STIX, which make integration relatively straightforward. But make sure any platform you select will support the data feeds you need. Be sure you can use the data that’s important to you, and not be restricted by your platform.
- Scalable: You will use a lot of data in your threat intelligence process, so scalability is essential. But computational scalability is likely more important than storage scalability – you will be intensively searching and mining aggregated data, so you need robust indexing. Unfortunately scalability is hard to test in a lab, so ensure your proof of concept testbed is a close match for your production environment, and that you can extrapolate how the platform will scale in your production environment.
- Search: Threat intelligence, like the rest of security, doesn’t lend itself to absolute answers. So make TI the beginning of your process of figuring out what happened in your environment, and leverage the data for your key use cases as we described earlier. One clear requirement for all use cases is search. Be sure your platform makes searching all your TI data sources easy.
- Scoring: Using Threat Intelligence is all about betting on which attackers, attacks, and assets are most important to worry about, so a flexible scoring mechanism offers considerable value. Scoring factors should include assets, intelligence sources, and attacks, so you can calculate a useful urgency score. It might be as simple as red/yellow/green, depending on the sophistication of your security program.
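To make the scoring idea concrete, here is a toy Python sketch combining the three factors above into a red/yellow/green urgency rating. The weights, feed names, and thresholds are invented for illustration, not a recommendation:

```python
# Hypothetical reliability and criticality tables a team might maintain.
SOURCE_RELIABILITY = {"vendor_feed": 0.9, "open_source": 0.5}
ASSET_CRITICALITY = {"payment_db": 1.0, "marketing_site": 0.3}

def urgency(asset, source, attack_severity):
    """attack_severity is 0.0-1.0; returns 'red', 'yellow', or 'green'."""
    score = ASSET_CRITICALITY[asset] * SOURCE_RELIABILITY[source] * attack_severity
    if score >= 0.6:
        return "red"
    if score >= 0.3:
        return "yellow"
    return "green"

print(urgency("payment_db", "vendor_feed", 0.8))      # red   (0.72)
print(urgency("payment_db", "open_source", 0.7))      # yellow (0.35)
print(urgency("marketing_site", "vendor_feed", 0.8))  # green (0.216)
```

Even a scheme this simple forces the useful conversation: which assets and sources actually deserve the higher weights.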
Key Use Cases
Our previous research has focused on how to address these key use cases, including preventative controls (FW/IPS), security monitoring, and incident response. But a programmatic view requires expanding the general concepts around use cases into a repeatable structure, to ensure ongoing efficiency and effectiveness.
The general process to integrate TI into your use cases is consistent, with some variations we will discuss below under specific use cases.
- Integrate: The first step is to integrate the TI into the tools for each use case, which could be security devices or monitors. That may involve leveraging the tools’ management consoles to pull in the data and apply the controls. For simple TI sources such as IP reputation, this direct approach works well. For more complicated data sources you’ll want to perform some aggregation and analysis on the TI before updating the rules running on the tools; in that case you’ll expect your TI platform to integrate with the tools.
- Test and Trust: The key concept here is trustable automation. You want to make sure any rule changes driven by TI go through a testing process before being deployed for real. That involves monitor mode on your devices, and ensuring changes won’t cause excessive false positives or take down any networks (in the case of preventative controls). Given the general resistance of many network operational folks to automation, it may be a while before everyone trusts automatic changes, so factor that into your project planning.
- Tuning via Feedback: In our dynamic world the rules that work today and the TI that is useful now will need to evolve. So you’ll constantly be tuning your TI and rulesets to optimize for effectiveness and efficiency. You are never done, and will constantly need to tune and assess new TI sources to ensure your defenses stay current.
From a programmatic standpoint, you can look back to our Applied Threat Intelligence research for granular process maps for integrating threat intelligence with each use case in more detail.
The idea when using TI within a preventative control is to use external data to identify what to look for before it impacts your environment. By ‘preventative’ we mean any control that is inline and can prevent attacks, not just alert. These include:
- Network security devices: This category encompasses firewalls (including next-generation models) and Intrusion Prevention Systems. But you might also include devices such as web application firewalls, which operate at different levels in the stack but are inline and can block attacks.
- Content security devices/services: Web and email filters can also function as preventative controls because they inspect traffic as it passes through, and can enforce policies to block attacks.
- Endpoint security technologies: Protecting an endpoint is a broad mandate, and can include traditional endpoint protection (anti-malware) and newfangled advanced endpoint protection technologies such as isolation and advanced heuristics.
We want to use TI to block recognized attacks, but not crater your environment with false positives, or adversely impact availability.
So the greatest sensitivity, and the longest period of test and trust, will be for preventative controls. You only get one opportunity to take down your network with an automated TI-driven rule set, so make sure you are ready before you deploy blocking rules operationally.
Our next case uses Threat Intelligence to make security monitoring more effective. As we’ve written countless times, security monitoring is necessary because you simply cannot prevent everything, so you need to get better and faster at responding. Improving detection is critical to effectively shortening the window between compromise and discovery.
Why is this better than just looking for well-established attack patterns like privilege escalation or reconnaissance, as we learned in SIEM school? The simple answer is that TI data represents attacks happening right now on other networks. Attacks you otherwise wouldn’t see or know to look for until too late. In a security monitoring context leveraging TI enables you to focus your validation/triage efforts, detect faster and more effectively, and ultimately make better use of scarce resources which need to be directed at the most important current risk.
- Aggregate Security Data: The foundation for any security monitoring process is internal security data. So before you can worry about external threat intel, you need to enumerate devices to monitor in your environment, scope out the kinds of data you will get from them, and define collection policies and correlation rules. Once this data is available in a repository for flexible, fast, and efficient search and analysis you are ready to start integrating external data.
- Security Analytics: Once the TI is integrated, you let the advanced math of your analytics engine do its magic, correlating and alerting on situations that warrant triage and possibly deeper investigation.
- Action/Escalation: Once you have an alert, and have gathered data about the device and attack, you need to determine whether the device was actually compromised or the alert was a false positive. Once you verify an attack you’ll have a lot of data to send to the next level of escalation – typically an incident response process.
The margin for error is a bit larger when integrating TI into a monitoring context than into a preventative control, but you still don’t want to generate a ton of false positives and have operational folks running around chasing them. Testing and tuning processes remain critical (are you getting the point yet?) to ensure TI provides sustainable benefit instead of just creating more work.
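As a simple illustration of what integrating TI into monitoring means at the data level, here is a Python sketch that matches a set of indicators against internal connection logs to produce triage candidates. The log fields and addresses are invented; a real SIEM or analytics engine does this at far larger scale, with correlation rather than simple matching:

```python
# Hypothetical indicators from external TI feeds (malicious IPs).
ti_indicators = {"203.0.113.7", "198.51.100.23"}

# Hypothetical internal connection log records.
connection_log = [
    {"src": "10.0.0.5", "dst": "203.0.113.7", "bytes": 48213},
    {"src": "10.0.0.9", "dst": "192.0.2.44", "bytes": 980},
]

# Flag internal hosts that contacted a known-bad destination.
alerts = [row for row in connection_log if row["dst"] in ti_indicators]
for alert in alerts:
    print(f"triage: internal host {alert['src']} contacted {alert['dst']}")
```

The payoff is focus: instead of reviewing every connection, analysts start with the handful of hosts that touched indicators seen attacking other networks right now.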
Similar to the way threat intelligence helps with security monitoring, you can use TI to focus investigations on the devices most likely to be impacted, help identify adversaries, and lay out their tactics to streamline your response. Just to revisit the general steps of an investigation, here’s a high-level view of incident response:
- Phase 1: Current Assessment: This involves triggering your process and escalating to the response team, then triaging the situation to figure out what’s really at risk. A deeper analysis follows to prove or disprove your initial assessment and figure out whether it’s a small issue or a raging fire.
- Phase 2: Investigate: Once the response process is fully engaged you need to get the impacted devices out of harm’s way by quarantining them and taking forensically clean images for chain of custody. Then you can start to investigate the attack more deeply to understand your adversary’s tactics, build a timeline of the attack, and figure out what happened and what was lost.
- Phase 3: Mitigation and Clean-up: Once you have completed your investigation you can determine the appropriate mitigations to eradicate the adversary from your environment and clean up the impacted parts of the network. The goal is to return to normal business operations as quickly as possible. Finally you’ll want a post-mortem after the incident is taken care of, to learn from the issues and make sure they don’t happen again.
The same concepts apply as in the other use cases. You’ll want to integrate the TI into your response process, typically looking to match indicators and tactics against specific adversaries to understand their motives, profile their activities, and get a feel for what is likely to come next. This helps to understand the level of mitigation necessary, and determine whether you need to involve law enforcement.
Optimizing TI Spending
The final aspect of the program for today’s discussion is the need to optimize which data sources you use – especially the ones you pay for. Your system should be tuned to normalize and reduce redundant events, so you’ll need a process to evaluate the usefulness of your TI feeds. Obviously you should avoid overlap when buying feeds, so understand how each intelligence vendor gets their data. Do they use honeypots? Do they mine DNS traffic and track new domain registrations? Have they built a cloud-based malware analysis/sandboxing capability? Categorize vendors by their tactics to help pick the best fit for your requirements.
Once the initial data sources are integrated into your platform and/or controls you’ll want to start tracking effectiveness. How many alerts are generated by each source? Are they legitimate? The key here is the ability to track this data, and if these capabilities are not built into the platform you are using, you’ll need to manually instrument the system to extract this kind of data. Sizable organizations invest substantially in TI data, and you want to make sure you get a suitable return on that investment.
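The instrumentation can start very simply. Here is a Python sketch of per-feed effectiveness tracking, where each alert records which feed sourced the indicator and whether triage confirmed it; the feed names and records are invented for illustration:

```python
from collections import Counter

# Hypothetical alert history, tagged with source feed and triage outcome.
alerts = [
    {"feed": "vendor_a", "confirmed": True},
    {"feed": "vendor_a", "confirmed": False},
    {"feed": "open_source", "confirmed": False},
    {"feed": "vendor_a", "confirmed": True},
]

totals = Counter(a["feed"] for a in alerts)
confirmed = Counter(a["feed"] for a in alerts if a["confirmed"])

for feed in totals:
    rate = confirmed[feed] / totals[feed]
    print(f"{feed}: {totals[feed]} alerts, {rate:.0%} confirmed")
```

A report like this, run quarterly, gives you a defensible basis for renewing, renegotiating, or dropping each paid feed.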
At this point you have a systematic program in place to address your key use cases with threat intelligence. But taking your TI program to the next level requires you to think outside your contained world. That means becoming part of a community to increase the velocity of your feedback loop, and be a contributor to the TI ecosystem rather than just a taker. So our next post will focus on how you can securely share what you’ve learned through your program to help others.
Posted at Wednesday 7th October 2015 7:21 pm
By Adrian Lane
Thus far I’ve been making the claim that security can be woven into the very fabric of your DevOps framework; now it’s time to show exactly how. DevOps encourages testing at all phases of the process, and the earlier the better. From the developer’s desktop prior to check-in, to module testing, to tests against a full application stack, both pre- and post-deployment, it’s all available to you.
Where to test
- Unit testing: Unit testing is nothing more than running tests against small sub-components or fragments of an application. These tests are written by programmers as they develop new functions, and are commonly run by the developer prior to code check-in. However, they are intended to be long-lived, checked into the source repository along with the new code, and run by any subsequent developers who contribute to that code module. For security, these range from straightforward tests, such as SQL injection against a web form, to more complex attacks specific to the function, such as logic attacks to ensure the new bit of code correctly reacts to a user’s intent. Regardless of intent, unit tests are focused on specific pieces of code, not systemic or transactional behavior, and they are intended to catch errors very early in the process, following the Deming ideal that the earlier flaws are identified, the less expensive they are to fix. In building out your unit tests, you’ll need to support developer infrastructure to harness the tests, but also culturally encourage the team to take them seriously enough to build good ones. Having multiple team members contribute to the same code, each writing unit tests, helps identify weaknesses the others did not consider.
- Security regression tests: A regression test validates that recently changed code still functions as intended. In a security context it is particularly important to ensure that previously fixed vulnerabilities remain fixed. For DevOps, regression tests are commonly run in parallel to functional tests – which means after the code stack is built out – but in a dedicated environment, as security testing can be destructive and cause unwanted side effects. Virtualization and cloud infrastructure are leveraged for quick startup of new test environments. The tests themselves are a combination of home-built test cases, created to exploit previously discovered vulnerabilities, supplemented by commercial testing tools available via API for easy integration. Automated vulnerability scanners and dynamic code scanners are a couple of examples.
- Production runtime testing: As we mentioned in the Deployment section of the last post, many organizations are taking advantage of blue-green deployments to run tests of all types against new production code. While the old code continues to serve user requests, the new code is available only to select users or test harnesses. The idea is that the tests run against a real production environment, but the automated environment makes this far easier to set up, and easier to roll back in the event of errors.
- Other: Balancing thoroughness and timeliness is a battle for most organizations. The goal is to test and deploy quickly, with many organizations that embrace CD releasing new code a minimum of 10 times a day. The quality and depth of testing become more important issues: if you’ve massaged your CD pipeline to deliver every hour, but it takes a week to run static or dynamic scans, how do you incorporate those tests? For this reason some organizations do not do automated releases, instead wrapping releases into a ‘sprint’ and running a complete testing cycle against the results of the last development sprint. Still others take periodic snapshots of the code and run white-box tests in parallel, but do not gate releases on the results, choosing instead to address findings with new task cards. Another way to look at this problem: just as all your Dev and Ops processes go through iterative and continual improvement, what constitutes ‘done’ for security testing prior to release will need continual adjustment as well. You may add more unit and regression tests over time, shifting more of the load onto developers before they check code in.
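To make the unit and regression test discussion concrete, here is a minimal Python sketch of a security regression test: it pins a previously fixed flaw (a username field that once passed raw input into a SQL string) so it stays fixed. The function, payloads, and allowed character policy are invented stand-ins for real application code:

```python
import re

def sanitize_username(value):
    """Allow only word characters, 1-32 long.
    (Hypothetical fix: this used to pass input straight through.)"""
    if not re.fullmatch(r"\w{1,32}", value):
        raise ValueError("invalid username")
    return value

def test_rejects_sql_injection():
    # Payloads from the original bug report; must keep failing forever.
    for payload in ["' OR '1'='1", "admin'; DROP TABLE users; --"]:
        try:
            sanitize_username(payload)
        except ValueError:
            continue
        raise AssertionError(f"payload accepted: {payload}")

def test_accepts_normal_username():
    assert sanitize_username("alice_01") == "alice_01"

test_rejects_sql_injection()
test_accepts_normal_username()
print("security regression tests passed")
```

Checked into the repository alongside the fix, a test like this runs on every build, so a later refactor that reintroduces the flaw fails immediately instead of resurfacing in production.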
Building a Tool Chain
The following is a list of commonly used security testing techniques, the value they provide, and where they fit into a DevOps process. Many of you reading this will already understand the value of tools, but perhaps not how they fit within a DevOps framework, so we will contrast traditional vs. DevOps deployments. Odds are you will use many, if not all, of these approaches; breadth of testing helps thoroughly identify weaknesses in the code, and better understand if the issues are genuine threats to application security.
- Static analysis: Static Application Security Testing (SAST) tools examine all code – or runtime binaries – providing a thorough examination for common vulnerabilities. These tools are highly effective at finding flaws, often within code that has already been reviewed manually. Most of the platforms have gotten much better at providing analysis that is meaningful to developers, not just security geeks, and many are updating their products to offer full functionality via APIs or build scripts. If you can, select tools that don’t require ‘code complete’ and that offer APIs for integration into the DevOps process. Also note we’ve seen a slight reduction in use, as these tests often take hours or days to run; in a DevOps environment that can eliminate them as inline gates to certification or deployment. As we mentioned in the ‘Other’ section above, most teams are adjusting by running static analysis scans out of band. We highly recommend keeping SAST testing as part of the process and, if possible, focusing scans on new sections of code to reduce their duration.
- Dynamic analysis: Dynamic Application Security Testing (DAST) tools, rather than scanning code or binaries like the SAST tools above, dynamically ‘crawl’ through an application’s interface, testing how the application reacts to inputs. While these scanners do not see what’s going on behind the scenes, they offer a very real look at how code behaves, and can flush out errors in dynamic code paths that other tests may not see. These tests are typically run against fully built applications, and as they can be destructive, the tools often have settings that allow more aggressive tests to be run in test environments.
- Fuzzing: In the simplest definition, fuzz testing is essentially throwing lots of random garbage at an application to see whether any particular type of garbage causes it to error. Go to any security conference – Black Hat, DEF CON, RSA, or B-Sides – and fuzzing is the approach most security researchers use to find vulnerable areas of code. Make no mistake, it is key to identifying misbehaving code that may offer exploitable weaknesses. Over the last 10 years, with Agile development processes and even more with DevOps, we have seen a steady decline in the use of fuzz testing by development and QA teams, because running through a large body of possible malicious inputs takes a lot of time. That is less of an issue for web applications, as attackers don’t have copies of the code, but much more problematic for applications delivered to users (e.g. mobile apps, desktop applications, automobiles). This trend is alarming, and like pen testing, fuzz testing should be a periodic part of your security testing efforts. It can even be performed as unit tests, or as component testing, in parallel to your normal QA efforts.
- Manual code review: Sure, some organizations find it more than a little scary to fully automate deployments, and they want a human to review changes before new code goes live; that's understandable. But there are very good security reasons for manual review as well. In an environment as automation-centric as DevOps it may seem antithetical to endorse manual code reviews or security inspection, but they remain a highly desirable addition. Manual reviews often catch obvious issues that automated tests miss, or that a developer missed on first pass. What's more, not all developers are created equal in their ability to write security unit tests; whether through error or lack of skill, the people writing the tests miss things that manual inspections catch. Manual code inspections, at least periodic spot checks of new code, are something you'll want to add to your repertoire.
- Vulnerability analysis: Some people equate vulnerability testing with DAST, but they can be different. Things like Heartbleed, misconfigured databases, or Struts vulnerabilities may not be part of your application testing at all, yet represent critical vulnerabilities within your application stack. Some organizations scan application servers for vulnerabilities, typically as a credentialed user, looking for unpatched software. Others have pen testers probe their applications, looking for weaknesses in configuration and places where security controls were not applied.
- Version controls: One of the nice side benefits of having build scripts serve both QA and production infrastructure is that Dev, Ops and QA all stay in sync on the versions of code they use. Still, someone on your team needs to monitor and control versions and updates across all parts of the application stack. For example, are those gem files up to date? As with the vulnerability scanning above, the open source and commercial software you use should be monitored for new vulnerabilities, with task cards created to introduce patches into the build process. But many vulnerability analysis products don't cover all the bits and pieces that compose an application. This can be fully automated in house, with build scripts adjusted to pull the latest versions, or you can integrate third party tools to do the monitoring and alerting. Either way, version control should now be part of your overall security monitoring program, with or without the vulnerability analysis mentioned above.
- Runtime Protection: This is a new segment of the application security market. While the technical approaches are not new, over the past couple of years we've seen greater adoption of runtime security tools that embed into applications for threat protection. The names vary (Runtime Application Self-Protection (RASP), execution path monitoring, embedded application white listing), as do the deployment models (embedded runtime libraries, in-memory execution monitoring, virtualized execution paths), but they share the common goal of protecting applications by looking for attacks in runtime behavior. All of these platforms can be embedded into the build or runtime environment, all can monitor or block, and all adjust enforcement based on the specifics of the application.
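The fuzz testing described above doesn't have to wait for a dedicated tool; it can run as a unit test. Below is a minimal sketch in Python: `parse_record` is a hypothetical stand-in for whatever input-handling code you want to harden, and the harness only reports exceptions outside the parser's documented failure modes.

```python
import random

def parse_record(data: bytes):
    """Toy parser under test: expects 'key=value' encoded in UTF-8."""
    text = data.decode("utf-8")          # may raise UnicodeDecodeError
    key, _, value = text.partition("=")
    if not key:
        raise ValueError("missing key")
    return key, value

def fuzz(target, iterations=1000, seed=42):
    """Throw random garbage at `target`; collect inputs causing unexpected crashes."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        blob = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 64)))
        try:
            target(blob)
        except (UnicodeDecodeError, ValueError):
            pass                          # expected, documented failure modes
        except Exception as exc:          # anything else is a finding
            crashes.append((blob, exc))
    return crashes
```

A fixed seed keeps runs reproducible, so any crashing input can be replayed; in practice you would persist crashing inputs as regression tests.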
Integrating security findings from application scans into bug tracking systems is not technically difficult; most products offer it as a built-in feature. Figuring out what to do with that data once you have it is the hard part. For any security vulnerability discovered: is it really a risk? If it is a risk and not a false positive, what is its priority relative to everything else going on? How is the information distributed? With DevOps, you'll need to close the loop on issues within the infrastructure as well as the code. And since Dev and Ops both offer potential solutions to most vulnerabilities, the people who manage security tasks need to include operations teams as well. Patching, code changes, blocking, and functional white listing are all potential methods to close security gaps, so you'll need both Dev and Ops to weigh the tradeoffs.
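To make the triage questions above concrete, here is a small sketch of the kind of routing logic a team might script between scanner output and a bug tracker. The field names and thresholds are illustrative assumptions, not any product's schema.

```python
def triage(finding):
    """Route a scanner finding to the right backlog with a simple rule set.

    `finding` is a dict like {"severity": 9.1, "false_positive": False,
    "layer": "stack"}. Stack-level issues go to Ops, code issues to Dev;
    the 7.0 severity cutoff is only an example policy.
    """
    if finding.get("false_positive"):
        return ("closed", None)           # confirmed noise: close, don't file
    owner = "ops" if finding.get("layer") == "stack" else "dev"
    priority = "urgent" if finding.get("severity", 0) >= 7.0 else "normal"
    return (priority, owner)
```

The point is that once findings are structured data, prioritization and ownership decisions become code you can review and tune, instead of ad hoc email threads.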
In the next post I am going back to the role of security within DevOps. And I will also be going back to pretty much all of the initial posts in this series as I have noted omissions that need to be rectified, and areas that I’ve failed to explain clearly. As always, comments and critique welcome!
Posted at Tuesday 6th October 2015 12:21 pm
This is one of those papers I’ve been wanting to write for a while. When I’m out working with clients, or teaching classes, we end up spending a ton of time on just how different networking is in the cloud, and how to manage it. On the surface we still see things like subnets and routing tables, but now everything is wired together in software, with layers of abstraction meant to look the same, but not really work the same.
This paper covers the basics and even includes some sample diagrams for Microsoft Azure and Amazon Web Services, although the bulk of the paper is cloud-agnostic.
From the report:
Over the last few decades we have been refining our approach to network security. Find the boxes, find the wires connecting them, drop a few security boxes between them in the right spots, and move on. Sure, we continue to advance the state of the art in exactly what those security boxes do, and we constantly improve how we design networks and plug everything together, but overall change has been incremental. How we think about network security doesn’t change – just some of the particulars.
Until you move to the cloud.
While many of the fundamentals still apply, cloud computing releases us from the physical limitations of those boxes and wires by fully abstracting the network from the underlying resources. We move into entirely virtual networks, controlled by software and APIs, with very different rules. Things may look the same on the surface, but dig a little deeper and you quickly realize that network security for cloud computing requires a different mindset, different tools, and new fundamentals.
Many of which change every time you switch cloud providers.
Special thanks to Algosec for licensing the research. As usual everything was written completely independently using our Totally Transparent Research process. It’s only due to these licenses that we are able to give this research away for free.
The landing page for the paper is here.
Direct download: Pragmatic Security for Cloud and Hybrid Networks (pdf)
Posted at Monday 5th October 2015 5:43 pm
By Adrian Lane
A couple housekeeping items before I begin today’s post - we’ve had a couple issues with the site so I apologize if you’ve tried to leave comments but could not. We think we have that fixed. Ping us if you have trouble.
Also, I am very happy to announce that Veracode has asked to license this research series on integrating security into DevOps! We are very happy to have them onboard for this one. It's support from the community and industry that allows us to bring you this type of research, all for free and without registration.
For the sake of continuity I’ve decided to swap the order of posts from our original outline. Rather than discuss the role of security folks in a DevOps team, I am going to examine integration of security into code delivery processes. I think it will make more sense, especially for those new to DevOps, to understand the technical flow and how things fit together before getting a handle on their role.
Remember that DevOps is about joining Development and Operations to provide business value. The mechanics of this are incredibly important as it helps explain how the two teams work together, and that is what I am going to cover today.
Most of you reading this will be familiar with the concept of 'nightly builds', where all code checked in the previous day is compiled overnight. And you're just as familiar with the morning ritual of sipping coffee while you read through the logs to see if the build failed, and why. Most development teams have been doing this for a decade or more. The automated build is the first of many steps companies go through on their way toward full automation of the processes that support code development. The path to DevOps typically happens in two phases: first continuous integration, which manages the building and testing of code; then continuous deployment, which assembles the entire application stack into an executable environment.
The essence of Continuous Integration (CI) is that developers check in small, iterative advancements to code on a regular basis. For most teams this means many updates to the shared source code repository, and one or more 'builds' each day. The core idea is smaller, simpler additions, so we can more easily – and more often – find defects in the code. Essentially these are Agile concepts, but implemented in processes that drive code instead of processes that drive people (e.g. scrums, sprints). The definition of CI has morphed slightly over the last decade, but in the context of DevOps, CI implies that code is not only built and integrated with supporting libraries, but automatically dispatched for testing as well. CI in a DevOps context also implies that code modifications are not applied to a branch, but to the main body of the code, reducing the complexity and integration nightmares that plague development teams.
Conceptually this sounds simple, but in practice it requires a lot of supporting infrastructure. It means builds are fully scripted, and the build process runs as code changes are made. It means that upon a successful build, the application stack is bundled and passed along for testing. It means test code is built prior to unit, functional, regression and security testing, and these tests commence automatically when a new bundle is available. It also means that before tests can be launched, test systems are automatically provisioned, configured and seeded with the necessary data. And these automation scripts must provide monitoring for each part of the process, with success or failure communicated back to Dev and Ops teams as events occur. Creating the scripts and tools to make all this possible requires operations, testing and development teams to work closely together. This orchestration does not happen overnight; it's commonly an evolutionary process that takes months to get the basics in place, and years to mature.
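The chain of events above – build, bundle, test, report back – is ultimately just ordered automation with notifications. Here is a toy sketch of that control flow; the stage names and the pass/fail event format are our own invention, not any CI product's API.

```python
def run_pipeline(stages):
    """Run build/test stages in order, recording an event after each one.

    `stages` is a list of (name, callable) pairs, each callable returning
    True on success. The first failure stops the pipeline, mirroring a
    CI gate, and the event list is what gets sent back to Dev and Ops.
    """
    events = []
    for name, stage in stages:
        ok = stage()
        events.append((name, "pass" if ok else "fail"))
        if not ok:
            break                      # failed gate: stop, notify, fix
    return events
```

Real CI servers add provisioning, artifact handling and notification channels, but the gate-per-stage structure is the same.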
Continuous Deployment looks very similar to CI, but is focused on the release – as opposed to build – of software to end users. It involves a similar set of packaging, testing, and monitoring, but with some additional wrinkles. The following graphic was created by Rich Mogull to show both the flow of code, from check-in to deployment, and many of the tools that provide automation support.
Upon successful completion of a CI cycle, the results feed the Continuous Deployment (CD) process. CD takes another giant step forward in terms of automation and resiliency, continuing the theme of building tools and infrastructure that make development better first, and deliver new features second. CD addresses dozens of issues that plague code deployments, specifically error-prone manual changes and differences in revisions of supporting libraries between production and dev. But perhaps most important is the use of code and infrastructure to control deployments and roll back in the event of errors. We'll go into more detail in the following sections.
This is far from a complete description, but hopefully you get the basic idea of how it works. With the mechanics of DevOps in mind, let's now map security in. What you do today should stand in stark contrast to what you will do with DevOps.
Security Integration From An SDLC Perspective
Secure Development Lifecycles (SDLCs), sometimes called Secure Software Development Lifecycles, describe the different functions within software development. Most people look at the phases of an SDLC and think 'Waterfall development process', which makes discussing the SDLC in conjunction with DevOps seem convoluted. But there are good reasons for doing so: the architecture, design, development, testing and deployment phases of an SDLC map well to roles in the development organization regardless of development process, and they provide a jumping-off point for people to take what they know today and morph it into a DevOps framework.
- Operational standards: Typically in the early phases of software development you're focused on the big picture of application architecture and how large functional pieces will work. With DevOps, you're also weaving in operational standards for the underlying environment. Just as with the code you deploy, you want to make small iterative improvements to your operational environment every day. This includes updates to the infrastructure (e.g. build automation tools, CI tools), but also policies for application stack security: how patches are incorporated, version synchronization across the entire build chain, leveraging of tools and metrics, configuration management and testing. These standards form the stories sent to the operations team for scripting during the development phase discussed below.
- Security functional requirements: What security tests will you run, which must run prior to deployment, and what tools will you use to get there? At a minimum, you'll want to set security requirements for all new code, and define what the development team needs to test prior to certification. This could mean a battery of unit tests for specific threats your team must write to check for – as an example – the OWASP Top Ten vulnerabilities. Or you may choose commercial products: you have a myriad of security tools at your disposal, and not all of them have APIs or the capability to be fully integrated into DevOps. Similarly, many tests do not run as fast as your deployment cycle, so you have some difficult decisions to make – more on parallel security testing below.
- Monitoring and metrics: If you're going to make small iterative improvements with each release, what needs fixing? What's going too slow? What is working, and how do you prove it? Metrics are key to answering these questions. You will need to think about what data you want to collect, and build collection into the CI and CD environment to measure how your scripts and tests perform. You'll continually evolve the collection and use of metrics, but basic collection and dissemination of data should be in your plan from the get-go.
- Secure design/architecture: DevOps enables significant advancements in security design and architecture. Most notably, since your goal is to automate patching and configuration for deployment, it's possible to entirely disable administrative connections to production servers. Errors and misconfigurations are fixed in build and automation scripts, not through manual logins. Configuration, automated injection of certificates, automated patching and even pre-deployment validation are all possible. It's also possible to completely disable network ports and access points commonly used for administration, which are a common attack vector. Leveraging deployment APIs from PaaS and IaaS cloud services gives you even more automation choices, which we will discuss later in this paper. DevOps offers a huge improvement to basic system security, but you must specifically design – or re-design – your deployments to leverage the advantages automated CI and CD provide.
- Secure the deployment pipeline: With greater control over both the development and production environments, development and test servers become a more attractive target. Traditionally these environments run with little or no security, but there is a greater need to secure source code management, build servers and the deployment pipeline, given they can feed directly – with minimal human intervention – into production. You'll need stricter controls over access to these systems, specifically build servers and code management. And given less human oversight of scripts running continuously in the background, you'll need added monitoring so errors and misuse can be detected and corrected.
- Threat model: We maintain that threat modeling is one of the most productive exercises in security, and DevOps does not change that. It does, however, open up opportunities for security team members both to instruct dev team members on common threat types, and to help them plan unit tests for these types of attack.
- Infrastructure and Automation First: You need tools before you can build a house, and a road before you can drive somewhere. With DevOps, and specifically security within DevOps, you integrate tools and build tests before you begin developing the next set of features. We stress this point both because it makes planning more important, and because it helps the development team plan for the tools and tests it needs before it can deliver new code. The bad news is that there is up-front cost and work to be done; the good news is that each and every build then leverages the infrastructure and tools you've built.
- Automated and Validated: Remember, it's not just development writing code and building scripts; operations is now up to its elbows in it as well. This is how DevOps takes patching and hardening to a new level. IT's role in DevOps is to provide build scripts that construct the infrastructure needed for development, testing and production servers. The good news is that what works in testing should be identical in production. And automation helps eliminate a problem traditional IT has faced for years: ad hoc, undocumented work that runs months, or even years, behind on patching. Again, there is a lot of work to get this fully automated – servers, network configuration, applications and so on. Most teams we spoke with build new machine images every week, updating the scripts which apply patches and adjusting configurations and build scripts for different environments. But the work ensures consistency and a secure baseline from which to start.
- Security Tasks: A core tenet of Continuous Integration is to never check in broken or untested code. What constitutes broken or untested is up to you. Keep in mind that rather than write giant specification documents for code quality or security – like you used to for waterfall – you're documenting policies in functional scripts and programs. Unit tests and functional tests not only define, but enforce, security requirements.
- Security in the Scrum: As we mentioned in the last section, DevOps is process neutral. You can use spiral, Agile, or surgical-team approaches as you wish. That said, Agile scrum and Kanban techniques are ideally suited for use with DevOps: the focus on smaller, quickly demonstrable tasks is a natural alignment. Security tasks are no less important than any other structural or feature improvement. We recommend training at least one person on each team in security basics, and determining which team members have an interest in security topics, to build in-house expertise. That way security tasks can easily be distributed to members with the interest and skill to tackle security-related problems.
- Strive for failure: In many ways DevOps turns long-held principles – in both IT and software development – upside down. Durability used to mean 'uptime'; now it's the speed of replacement. Detailed specifications were used to coordinate dev teams; now it's post-it notes. Quality assurance focused on getting code to pass functional requirements; now it looks for ways to break an application before someone else can. It's this last change in approach which really helps raise the bar on security. Stealing a line from James Wickett's Gauntlt page, "Be Mean To Your Code - And Like It" embodies the ideal. The goal is not only to build security tests into the automated delivery process, but to greatly raise the bar on what is acceptable to release. We harden an application by intentionally pummeling it with all sorts of functional, stress and security tests before code goes live, reducing the time hands-on security experts spend testing code. If you can figure out some way to break your application, odds are an attacker can too, so build the test – and the remedy – before code goes live.
- Parallelize Security Testing: A problem with all agile development approaches is what to do about tests that take longer than the development cycle. For example, we know that fuzz testing critical pieces of code takes longer than your average sprint in an Agile model. DevOps is no different in this regard; with CI and CD, code may be delivered to users within hours of being created, and it is simply not possible to perform white-box or dynamic code scans in that window. To address this, DevOps teams run multiple security tests in parallel. Validation against known critical issues is written as unit tests to perform a quick spot check, with failures kicking code back to the development team. Code scanners are commonly run in parallel, against periodic – instead of every – releases. Their results are also sent back to development, similarly identifying the changes that created each vulnerability, but these tests commonly do not gate a release. How to deal with these issues caused headaches in every dev team we spoke with. Focusing scans on specific areas of the code helps find issues faster and minimizes the disruption from lagging tests, but this remains an area security and development team members struggle with.
- Manual vs. Automated deployment: It's easy enough to push new code into production. Vetting that code, or rolling back in the event of errors, is always tricky. Most teams we spoke with are not yet fully comfortable with automated deployments. In fact many still only release new code to customers every few weeks, often in conjunction with the end of a sprint. For these companies most actions are executed through scripts, but the scripts are run manually, when IT and development resources can be on hand to fully monitor the code push. A handful of organizations are fully comfortable with automated pushes to production, and release code several times a day. There is no right answer here, but in either case automation performs the bulk of the work, freeing people up to test and monitor.
- Deployment and Rollback: To double-check that code which worked in pre-deployment tests still works in the production environment, the teams we spoke with still run 'smoke' tests, but they have evolved them to incorporate automation and more granular control over rollouts. We typically saw three tricks used to augment deployment. The first, and most powerful, is Blue-Green – or Red-Black – deployment. Simply put, old code and new code run side by side, each on its own set of servers. Rollout is a simple redirection at the load balancer, and if errors are discovered, the load balancer is pointed back at the old code. The second, canary testing, directs a select number of individual sessions to the new code: first employee testers, then a subset of real customers. If the canary dies (i.e. any errors are encountered), the new code is retired until the issues can be fixed, and the process repeats. And finally, feature tagging, where new code elements are enabled or disabled through configuration files. If errors are discovered in a new section of code, the feature is toggled off and the code replaced when fixed. The degree of automation and human intervention varies greatly, but overall these deployments are far more automated than traditional web services environments.
- Production Security Tests: With the deployment models above built into release management scripts, it's fairly easy to make the 'canaries' dynamic code scanners, pen testers or other security-oriented testers. Coupled with test accounts reserved for more invasive security tests, this lowers the risk of data corruption while still allowing security tests to run in the production environment.
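The parallelized security testing pattern described above, where fast checks gate the release while long-running scans proceed out of band, can be sketched with nothing more than a thread pool. The callables here are placeholders for real unit tests and scanner invocations.

```python
from concurrent.futures import ThreadPoolExecutor

def gated_release(quick_checks, long_scans):
    """Gate the release on fast checks; run slow scans out of band.

    quick_checks: callables returning True/False; any failure blocks release.
    long_scans:   callables whose findings feed the backlog, not the gate.
    """
    executor = ThreadPoolExecutor(max_workers=4)
    pending = [executor.submit(scan) for scan in long_scans]  # fire and forget
    gate_ok = all(check() for check in quick_checks)
    return gate_ok, pending        # caller harvests scan findings later
```

The important design choice is that the slow scans never block the pipeline; their results come back as futures that feed task cards rather than release gates.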
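Feature tagging, the third rollout trick above, is also simple to sketch: new code paths hide behind configuration until explicitly enabled. The flag and function names below are hypothetical.

```python
# In practice these flags would be loaded from a configuration file.
FEATURE_FLAGS = {"new_checkout": False, "search_v2": True}

def feature_enabled(name, flags=FEATURE_FLAGS):
    """Feature tagging: new code paths stay dark until the flag flips."""
    return flags.get(name, False)

def legacy_checkout(cart):
    return ("legacy", len(cart))

def new_checkout(cart):
    return ("v2", len(cart))

def checkout(cart, flags=FEATURE_FLAGS):
    """Route to the new code path only when its flag is on."""
    if feature_enabled("new_checkout", flags):
        return new_checkout(cart)      # new section of code under test
    return legacy_checkout(cart)       # safe fallback if the flag is off
```

If errors surface in `new_checkout`, flipping the flag in configuration retires the code path without a redeploy.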
I’ve probably missed a few in this discussion, so please feel free to contribute any ideas you feel should be discussed.
In the next post, I am again going to shake up the order of this series and talk about tools and testing in greater detail. Specifically, I will construct a security tool chain for addressing different types of threats, and show how these fit within DevOps processes.
Posted at Saturday 3rd October 2015 5:20 pm
This is the fourth post in a new series I’m posting for public feedback, licensed by Algosec. Well, that is if they like it – we are sticking to our Totally Transparent Research policy. I’m also live-writing the content on GitHub if you want to provide any feedback or suggestions. Click here for the first post in the series, [here for post two](https://securosis.com/blog/pragmatic-security-for-cloud-and-hybrid-networks-cloud-networking-101), post 3, post 4.
To finish off this research it’s time to show what some of this looks like. Here are some practical design patterns based on projects we have worked on. The examples are specific to Amazon Web Services and Microsoft Azure, rather than generic templates. Generic patterns are less detailed and harder to explain, and we would rather you understand what these look like in the real world.
Basic Public Network on Microsoft Azure
This is a simplified example of a public network on Azure. All the components run on Azure, with nothing in the enterprise data center, and no VPN connections. Management of all assets is over the Internet. We can’t show all the pieces and configuration settings in this diagram, so here are some specifics:
- The Internet Gateway is set in Azure by default (you don’t need to do anything). Azure also sets up default service endpoints for the management ports to manage your instances. These connections are direct to each instance and don’t run through the load balancer. They will (should) be limited to only your current IP address, and the ports are closed to the rest of the world. In this example we have a single public facing subnet.
- Each instance gets a public IP address and domain name, but you can’t access anything that isn’t opened up with a defined service endpoint. Think of the endpoint as port forwarding, which it pretty much is.
- The service endpoint can point to the load balancer, which in turn is tied to the auto scale group. You set rules on instance health, performance, and availability; the load balancer and auto scale group provision and deprovision servers as needed, and handle routing. The IP addresses of the instances change as these updates take place.
- Network Security Groups (NSGs) restrict access to each instance. In Azure you can also apply them to subnets, but in this case we would apply them on a per-server basis. Traffic would be restricted to whatever services the application provides, and would deny traffic between instances on the same subnet. Azure allows such internal traffic by default, unlike Amazon.
- NSGs can also restrict traffic to the instances, locking it down to only from the load balancer and thus disabling direct Internet access. Ideally you never need to log into the servers because they are in an auto scale group, so you can also disable all the management/administration ports.
There is more, but this pattern produces a hardened server, with no administrative traffic, protected with both Azure’s default protections and Network Security Groups. Note that on Azure you are often much better off using their PaaS offerings such as web servers, instead of manually building infrastructure like this.
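To make the default-deny posture above concrete, here is a simplified model of rule evaluation in the spirit of NSG processing. Real NSGs use numeric priorities and richer match criteria (protocols, CIDR ranges); this sketch, with invented rule and packet fields, only illustrates the logic of explicit allows over a deny-by-default baseline.

```python
def evaluate_nsg(rules, packet):
    """First-match evaluation of simplified NSG-style rules.

    rules:  ordered list of dicts like
            {"port": 443, "source": "load_balancer", "action": "allow"}
    packet: dict like {"port": 443, "source": "internet"}
    Anything unmatched falls through to deny, the hardened posture
    described above.
    """
    for rule in rules:
        if rule["port"] == packet["port"] and rule["source"] == packet["source"]:
            return rule["action"]
    return "deny"
```

With only a load-balancer allow rule in place, direct Internet traffic and management ports fall through to the default deny, which is exactly the hardened-server pattern described in this example.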
Basic Private Network on Amazon Web Services
Amazon works a bit differently than Azure (okay – much differently). This example is a Virtual Private Cloud (VPC, their name for a virtual network) that is completely private, without any Internet routing, connected to a data center through a VPN connection.
- This shows a class B network with two smaller subnets. In AWS you would place each subnet in a different Availability Zone (what we called a ‘zone’) for resilience in case one goes down – they are separate physical data centers.
- You configure the VPN gateway through the AWS console or API, and then configure the client side of the VPN connection on your own hardware. Amazon maintains the VPN gateway in AWS; you don’t directly touch or maintain it, but you do need to maintain everything on your side of the connection (and it needs to be a hardware VPN).
- You adjust the routing table on your internal network to send all traffic for the 10.0.0.0/16 network over the VPN connection to AWS. This is why it’s called a ‘virtual’ private cloud. Instances can’t see the Internet, but you have that gateway that’s Internet accessible.
- You also need to set your virtual routing table in AWS to send Internet traffic back through your corporate network if you want any of your assets to access the Internet for things like software updates. Sometimes you do, sometimes you don’t – we don’t judge.
- By default instances are protected by a Security Group that denies all inbound traffic and allows all outbound traffic. Unlike in Azure, instances on the same subnet can't talk to each other, and you cannot connect to them through the corporate network until you open them up. AWS Security Groups offer allow rules only; you cannot explicitly deny traffic, only open up allowed traffic. In Azure you create Service Endpoints to explicitly route traffic, then use Network Security Groups to allow or deny on top of that (within the virtual network). AWS uses security groups for both functions – opening a security group allows traffic through the private IP (or the public IP if the instance is public facing).
- Our example uses no ACLs, but you could put an ACL in place to block the two subnets from talking to each other. ACLs in AWS exist by default but allow all traffic. An ACL in AWS is not stateful, so you need to create rules for traffic in both directions. ACLs in AWS work better as a coarse deny mechanism backing up your security groups than as a primary access control.
- A public network on AWS looks relatively similar to our Azure sample (which we designed to look similar). The key differences are how security groups and service endpoints function.
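Because security groups are allow-only, automating them means building allow rules. Below is a sketch of the `IpPermissions` structure boto3 expects for an ingress rule, say HTTPS from the corporate network in the VPN example above; the group ID and CIDR are placeholders, not values from this paper.

```python
def https_from_corp(cidr="10.0.0.0/16"):
    """Build the IpPermissions structure for one allow rule (SGs have no deny)."""
    return [{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": cidr}],
    }]

# Applying it with boto3 (not run here) would look roughly like:
#   import boto3
#   ec2 = boto3.client("ec2")
#   ec2.authorize_security_group_ingress(GroupId="sg-PLACEHOLDER",
#                                        IpPermissions=https_from_corp())
```

Keeping rule construction in a function like this makes the allowed ports and sources reviewable in version control, in line with the scripted-infrastructure theme of the rest of the paper.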
Hybrid Cloud on Azure
This builds on our previous examples. In this case the web servers and app servers are separated, with app servers on a private subnet. We already explained the components in our other examples, so there is only a little to add:
- The key security control here is a Network Security Group to restrict access to the app servers from ONLY the web servers, and only to the specific port and protocol required.
- The NSG should be applied to each instance, not to the subnets, to prevent a “flat network” and block peer traffic that could be used in an attack.
- The app servers can connect to your datacenter, and that is where you route all Internet traffic. That gives you just as much control over Internet traffic as with virtual machines in your own data center.
- You will want to restrict traffic from your organization’s network to the instances (via the NSGs) so you don’t become the weak link for an attack.
A Cloud Native Data Analytics Architecture
Our last example shows how to use some of the latest features of Amazon Web Services to create a new cloud-native design for big data transfers and analytics.
- In this example there is a private subnet in AWS, with neither Internet access nor a connection to the enterprise data center. Server images are maintained in another account or VPC, and no one ever logs into the analysis servers manually.
- When an analytics job is triggered, a server in the data center takes the data and sends it to Amazon S3, their object storage service, using command line tools or custom code. This is an encrypted connection by default, but you could also encrypt the data using the AWS Key Management Service (or any encryption tool you want). We have clients using both options.
- The S3 bucket in AWS is tightly restricted to either only the IP address of the sending server, or a set of AWS IAM credentials – or both. AWS manages S3 security so you don’t need to worry about network attacks, merely about enabling access correctly. S3 isn’t like a public FTP server – if you lock it down (easy to do) it isn’t visible except from authorized sources.
- A service called AWS Lambda monitors the S3 bucket. Lambda is a container for event-driven code running inside Amazon that can trigger based on internal things, including a new file appearing in an S3 bucket. You only pay for Lambda when your code is executing, so there is no cost to have it wait for events.
- When a new file appears the Lambda function triggers and launches analysis instances based on a standard image. The analysis instances run in a private subnet, with security group settings that block all inbound access.
- When the analysis instances launch the Lambda code sends them the location of the data in S3 to analyze. The instances connect to S3 through something known as a VPC Endpoint, which is totally different from an Azure service endpoint. A VPC endpoint allows instances in a totally private subnet to talk to S3 without Internet access (which was required until recently). As of this writing only S3 has a VPC endpoint, but we know Amazon is working on endpoints for additional services such as their Simple Queue Service (although AWS hasn’t confirmed exactly which services are next on the list).
- The instances boot, grab the data, then do their work. When they are done they go through the S3 VPC Endpoint to drop their results into a second S3 bucket.
- The first bucket only allows writes from the data center, and reads from the private subnet. The second bucket reverses that and only allows reads from the data center and writes from the subnet. Everything is a one-way closed loop.
- The instance can then trigger another Lambda function to send a notification back to your on-premise data center or application that the job is complete, and code in the data center can grab the results. There are several ways to do this – for example the results could go into a database, instead.
- Once everything is complete Lambda moves the original data into Glacier, Amazon’s super-cheap long-term archival storage. In this scenario it is of course encrypted. (For this network-focused research we are skipping over most of the encryption options for this architecture, but they aren’t overly difficult).
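The Lambda trigger in this flow is just event-driven code. Here is a minimal sketch of the function that fires on a new S3 object and hands the data's location to freshly launched analysis instances — the instance-launch call is stubbed out, and the bucket, key, image ID, and subnet names are hypothetical:

```python
# Minimal sketch of the Lambda function described above. It receives an
# S3 ObjectCreated event, extracts the data location, and would launch
# analysis instances with that location passed in.

def handler(event, context=None):
    """Triggered by S3 ObjectCreated event notifications."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # With boto3 you would launch the analysis instance here,
        # passing the location in user data, e.g.:
        # ec2.run_instances(ImageId="ami-12345678",       # hypothetical
        #                   SubnetId="subnet-abcd1234",   # private subnet
        #                   UserData=f"s3://{bucket}/{key}", ...)
        jobs.append(f"s3://{bucket}/{key}")
    return jobs

# Trimmed shape of an S3 event notification:
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "analytics-inbox"},
                "object": {"key": "batch-001.csv"}}}
    ]
}
print(handler(sample_event))
```

Because the function only runs when an event arrives, there is no idle cost while waiting for jobs, exactly as described above.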
Think about what we have described: the analysis servers have no Internet access, spin up only as needed, and can only read in new data and write out results. They automatically terminate when finished, so there is no persistent data sitting unused on a server or in memory. All Internet-facing components are native Amazon services, so we don’t need to maintain their network security. Everything is extremely cost-effective, even for very large data sets, because we only process when we need it; big data sets are always stored in the cheapest option possible, and automatically shifted around to minimize storage costs. The system is event-driven so if you load 5 jobs at once, it runs all 5 at the same time without any waiting or slowdown, and if there are no jobs the components are just programmatic templates, in the absolute most cost-effective state.
This example does skip some options that would improve resiliency in exchange for better network security. For example we would normally recommend using Simple Queue Service to manage the jobs (Lambda would send them over), because SQS handles situations such as an instance failing partway through processing. But this is security research, not availability focused.
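The one-way closed loop on the buckets can be sketched as an S3 bucket policy. This is an illustrative shape only — the IP address, account ID, and role name are hypothetical, and the results bucket would simply reverse the read/write directions:

```python
# Sketch of the inbound bucket's policy: the data center's server may
# only write new jobs, and only the analysis instances' IAM role may
# read them. Shaped like an AWS IAM/S3 policy document.

INBOX_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {   # Data center server may write new jobs (IP-restricted)
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::analytics-inbox/*",
            "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.10/32"}},
        },
        {   # Only the analysis role may read the jobs back out
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:role/analysis"},
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::analytics-inbox/*",
        },
    ],
}

actions = {s["Action"] for s in INBOX_POLICY["Statement"]}
print(sorted(actions))
```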
This research isn’t the tip of the iceberg; it’s more like the first itty bitty little ice crystal on top of an iceberg, which stretches to the depths of the deepest ocean trench. But if you remember the following principles you will be fine as you dig into securing your own cloud and hybrid deployments:
- The biggest difference between cloud and traditional networks is the combination of abstraction (virtualization) and automation. Things look the same but don’t function the same.
- Everything is managed by software, providing tremendous flexibility, and enabling you to manage network security using the exact same tools that Development and Operations use to manage their pieces of the puzzle.
- You can achieve tremendous security through architecture. Virtual networks (and multiple cloud accounts) support incredible degrees of compartmentalization, where every project has its own dedicated network or networks.
- Security groups enhance that by providing the granularity of host firewalls, without the risks of relying on operating systems. They provide better manageability than even most network firewalls.
- Platform as a Service and cloud-provider-specific services open up entirely new architectural options. Don’t try to build things the way you always have. Actually, if you find yourself doing that, you should probably rethink your decision to use the cloud.
Don’t be intimidated by cloud computing, but don’t think you can or should implement network security the way you always have. Your skills and experiences are still important, and provide a base to build on as you learn all the new options available within the cloud.
Posted at Tuesday 29th September 2015 2:03 pm
This is the fourth post in a new series I’m posting for public feedback, licensed by Algosec. Well, that is if they like it – we are sticking to our Totally Transparent Research policy. I’m also live-writing the content on GitHub if you want to provide any feedback or suggestions. Click here for the first post in the series, [here for post two](https://securosis.com/blog/pragmatic-security-for-cloud-and-hybrid-networks-cloud-networking-101), and here for post three.
There is no single ‘best’ way to secure a cloud or hybrid network. Cloud computing is moving faster than any other technology in decades, with providers constantly struggling to out-innovate each other with new capabilities. You cannot lock yourself into any single architecture, but instead need to build out a program capable of handling diverse and dynamic needs.
There are four major focus areas when building out this program.
- Start by understanding the key considerations for the cloud platform and application you are working with.
- Design the network and application architecture for security.
- Design your network security architecture including additional security tools (if needed) and management components.
- Manage security operations for your cloud deployments – including everything from staffing to automation.
Understand Key Considerations
Building applications in the cloud is decidedly not the same as building them on traditional infrastructure. Sure, you can do it, but the odds are high something will break. Badly. As in “update that resume” breakage. To really see the benefits of cloud computing, applications must be designed specifically for the cloud – including security controls.
For network security this means you need to keep a few key things in mind before you start mapping out security controls.
- Provider-specific limitations or advantages: All providers are different. Nothing is standard, and don’t expect it to ever become standard. One provider’s security group is another’s ACL. Some allow more granular management. There may be limits on the number of security rules available. A provider might offer both allow and deny rules, or allow only. Take the time to learn the ins and outs of your provider’s capabilities. They all offer plenty of documentation and training, and in our experience most organizations limit themselves to no more than one to three infrastructure providers, keeping the problem manageable.
- Application needs: Applications, especially those using the newer architectures we will mention in a moment, often have different needs than applications deployed on traditional infrastructure. For example application components in your private network segment may still need Internet access to connect to a cloud component – such as storage, a message bus, or a database. These needs directly affect architectural decisions – both security and otherwise.
- New architectures: Cloud applications use different design patterns than apps on traditional infrastructure. For example, as previously mentioned, components are typically distributed across diverse network locations for resiliency, and tied tightly to cloud-based load balancers. Early cloud applications often emulated traditional architectures but modern cloud applications make extensive use of advanced cloud features, particularly Platform as a Service, which may be deeply integrated into a particular cloud provider. Cloud-based databases, message queues, notification systems, storage, containers, and application platforms are all now common due to cost, performance, and agility benefits. You often cannot even control the network security of these services, which are instead fully managed by the cloud provider. Continuous deployment, DevOps, and immutable servers are the norm rather than exceptions. On the upside, used properly these architectures and patterns are far more secure, cost effective, resilient, and agile than building everything yourself, but you do need to understand how they work.
Data Analytics Design Pattern Example
A common data analytics design pattern highlights these differences (see the last section for a detailed example). Instead of keeping a running analytics pool and sending it data via SFTP, you start by loading data into cloud storage directly using an (encrypted) API call. This, using a feature of the cloud, triggers the launch of a pool of analytics servers and passes the job on to a message queue in the cloud. The message queue distributes the jobs to the analytics servers, which use a cloud-based notification service to signal when they are done, and the queue automatically redistributes failed jobs. Once it’s all done the results are stored in a cloud-based NoSQL database and the source files are archived. It’s similar to ‘normal’ data analytics except everything is event-driven, using features and components of the cloud service. This model can handle as many concurrent jobs as you need, but you don’t have anything running or racking up charges until a job enters the system.
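The queue behavior in that pattern — workers pull jobs, and a failed job goes back on the queue instead of being lost — is roughly what SQS visibility timeouts give you. Here is a pure-Python simulation of that redistribution logic, not the SQS API itself:

```python
# Pure-Python sketch of queue-based job distribution with automatic
# redistribution of failed jobs (the behavior SQS provides natively).

from collections import deque

def run_jobs(jobs, worker, max_attempts=3):
    queue = deque((job, 0) for job in jobs)
    results = {}
    while queue:
        job, attempts = queue.popleft()
        try:
            results[job] = worker(job)
        except Exception:
            if attempts + 1 < max_attempts:
                queue.append((job, attempts + 1))  # redistribute the job
            else:
                results[job] = None  # give up: dead-letter territory
    return results

# A worker that fails the first time it sees job "b", simulating an
# analytics instance dying partway through processing:
seen = set()
def flaky(job):
    if job == "b" and job not in seen:
        seen.add(job)
        raise RuntimeError("transient failure")
    return job.upper()

results = run_jobs(["a", "b"], flaky)
print(results)  # job "b" still completes, on its second attempt
```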
- Elasticity and a high rate of change are standard in the cloud: Beyond auto scaling, cloud applications tend to alter the infrastructure around them to maximize the benefits of cloud computing. For example one of the best ways to update a cloud application is not to patch servers, but instead to create an entirely new installation of the app, based on a template, running in parallel; and then to switch traffic over from the current version. This breaks familiar security approaches, including relying on IP addresses for server identification, vulnerability scanning, and logging. Server names and addresses are largely meaningless, and controls that aren’t adapted for cloud are liable to be useless.
- Managing and monitoring security changes: You either need to learn how to manage cloud security using the provider’s console and APIs, or choose security tools that integrate directly. This may become especially complex if you need to normalize security between your data center and cloud provider when building a hybrid cloud. Additionally, few cloud providers offer good tools to track security changes over time, so you will need to track them yourself or use a third-party tool.
Design the Network Architecture
Unlike traditional networks, security is built into cloud networks by default. Go to any major cloud provider, spin up a virtual network, launch a server, and the odds are very high it is already well-defended – with most or all access blocked by default.
Because security and core networking are so intertwined, and every cloud application has its own virtual network (or networks), the first step toward security is to work with the application team and design it into the architecture.
Here are some specific guidelines and recommendations:
- Accounts provide your first layer of segregation. With each cloud provider you establish multiple accounts, each for a different environment (e.g., dev, test, production, logging). This enables you to tailor cloud security controls and minimize administrator access. This isn’t a purely network security feature, but will affect network security because you can, for example, have tighter controls for environments closer to production data. The rule of thumb for accounts is to consider separate accounts for separate applications, and then separate accounts for a given application when you want to restrict how many people have administrator access. For example a dev account is more open with more administrators, while production is a different account with a much smaller group of admins. Within accounts, don’t forget about the physical architecture:
- Regions/locations are often used for resiliency, but may also be incorporated into the architecture for data residency requirements, or to reduce network latency to customers. Unlike accounts, we don’t normally use locations for security, but you do need to build network security within each location.
- Zones are the cornerstone of cloud application resiliency, especially when tied to auto scaling. You won’t use them as a security control, but again they affect security, as they often map directly to subnets. An auto scale group might keep multiple instances of a server in different zones, which are different subnets, so you cannot necessarily rely on subnets and addresses when designing your security.
- Virtual Networks (Virtual Private Clouds) are your next layer of security segregation. You can (and will) create and dedicate separate virtual networks for each application (potentially in different accounts), each with its own set of network security controls. This compartmentalization offers tremendous security advantages, but seriously complicates security management. It forces you to rely much more heavily on automation, because manually replicating security controls across accounts and virtual networks within each account takes tremendous discipline and effort. In our experience the security benefits of compartmentalization outweigh the risks created by management complexity – especially because development and operations teams already tend to rely on automation to create, manage, and update environments and applications in the first place. There are a few additional non-security-specific aspects to keep in mind when you design the architecture:
- Within a given virtual network, you can include public and private facing subnets, and connect them together. This is similar to DMZ topologies, except public-facing assets can still be fully restricted from the Internet, and private network assets are all by default walled off from each other. Even more interesting, you can spin up totally isolated private network segments that only connect to other application components through an internal cloud service such as a message queue, and prohibit all server-to-server traffic over the network.
- There is no additional cost to spin up new virtual networks (or at least if your provider charges for this, it’s time to move on), and you can create another with a few clicks or API calls. Some providers even allow you to bridge across virtual networks, assuming they aren’t running on the same IP address range. Instead of trying to lump everything into one account and one virtual network, it makes far more sense to use multiple networks for different applications, and even within a given application architecture.
- Within a virtual network you also have complete control over subnets. While they may play a role in your security design, especially as you map out public and private network segments, make sure you also design them to support zones for availability.
- Flat networks aren’t flat in the cloud. Everything you deploy in the virtual network is surrounded by its own policy-based firewall which blocks all connections by default, so you don’t need to rely on subnets themselves as much for segregation between application components. Public vs. private subnets are one thing, but creating a bunch of smaller subnets to isolate application components quickly leads to diminishing returns.
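The account/network/subnet layering described above lends itself to templating. Here is a small sketch that carves one application's virtual network into a public and a private subnet per availability zone, using Python's standard `ipaddress` module — the CIDR and zone names are hypothetical:

```python
# Sketch of a per-application virtual network layout: one /16 per VPC,
# carved into a public and a private /24 per availability zone. Because
# there's no cost to new virtual networks, every application (and
# environment) can get its own copy of this plan.

import ipaddress

def plan_subnets(vpc_cidr, zones):
    subnets = list(ipaddress.ip_network(vpc_cidr).subnets(new_prefix=24))
    plan = {}
    for i, zone in enumerate(zones):
        plan[zone] = {
            "public": str(subnets[2 * i]),
            "private": str(subnets[2 * i + 1]),
        }
    return plan

plan = plan_subnets("10.1.0.0/16", ["us-east-1a", "us-east-1b"])
print(plan)
```

Note how few subnets there are: as the last bullet says, the per-instance firewalls do the fine-grained segregation, so subnets only need to capture public vs. private and zone placement.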
You may need enterprise datacenter connections for hybrid clouds. These VPN or direct connections route traffic directly from your data center to the cloud, and vice-versa. You simply set your routing tables to send traffic to the appropriate destination, and SDN-based virtual networks allow you to set distinct subnet ranges to avoid address conflicts with existing assets.
Whenever possible, we actually recommend avoiding hybrid cloud deployments. It isn’t that there is anything wrong with them, but they make it much more difficult to support account and virtual network segregation. For example if you use separate accounts or virtual networks for your different dev/test/prod environments, you will tend to do so using templates to automatically build out your architecture, and they will perfectly mimic each other – down to individual IP addresses. But if you connect them directly to your data center you need to shift to non-overlapping address ranges to avoid conflicts, and they can’t be as automated or consistent. (This consistency is a cornerstone of continuous deployment and DevOps).
Additionally, hybrid clouds complicate security. We have actually seen them, not infrequently, reduce the overall security level of the cloud, because assets in the datacenter aren’t as segregated as on the cloud network, and cloud providers tend to be more secure than most organizations can achieve in their own infrastructure. Instead of cracking your cloud provider, someone only needs to crack a system on your corporate network, and use that to directly bridge to the cloud.
So when should you consider a hybrid deployment? Any time your application architecture requires direct address-based access to an internal asset that isn’t Internet-accessible. Alternatively, sometimes you need a cloud asset on a static, non-Internet-routable address – such as an email server or other service that isn’t designed to work with auto scaling – which internal things need to connect to. (We strongly recommend you minimize these – they don’t benefit from cloud computing, so there usually isn’t a good reason to deploy them there). And yes, this means hybrid deployments are extremely common unless you are building everything from scratch. We try to minimize their use – but that doesn’t mean they don’t play a very important role.
For security there are a few things to keep in mind when building a hybrid deployment:
- VPN traffic will traverse the Internet. VPNs are very secure, but you do need to keep them up-to-date with the latest patches and make sure you use strong, up-to-date certificates.
- Direct connections may reduce latency, but decide whether you trust your network provider, and whether you need to encrypt traffic.
- Don’t let your infrastructure reduce the security of your cloud. If you mandate multi-factor authentication in the cloud but not on your LAN, that’s a loophole. Is your entire LAN connected to the cloud? Could someone compromise a single workstation and then start attacking your cloud through your direct connection? Do you have security group or other firewall rules to keep your cloud assets as segregated from datacenter assets as they are from each other? Remember, cloud providers tend to be exceptionally good at security, and everything you deploy in the cloud is isolated by default. Don’t allow hybrid connection to become the weak link and reduce this compartmentalization.
- You may still be able to use multiple accounts and virtual networks for segregation, by routing different datacenter traffic to different accounts and/or virtual networks. But your on-premise VPN hardware or your cloud provider might not support this, so check before building it into your architecture.
- Cloud and on-premise network security controls may look similar on the surface, but they have deep implementation differences. If you want unified management you need to understand these differences, and be able to harmonize based on security goals – not by trying to force a standard implementation across very different technologies.
- Cloud computing offers many more ways to integrate into your existing operations than you might think. For example instead of using SFTP and setting up public servers to receive data dumps, consider installing your cloud provider’s command-line tools and directly transferring data to their object storage service (fully locked down, of course). Now you don’t need to maintain the burden of either an Internet-accessible FTP server or a hybrid cloud connection.
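The direct-to-object-storage pattern from the last bullet looks something like this in practice. The bucket, key, and KMS key alias are hypothetical, and the actual boto3 upload is shown but not executed:

```python
# Sketch of pushing data straight to S3 with server-side encryption,
# instead of maintaining an Internet-facing SFTP server or a hybrid
# connection.

def put_params(bucket, key, kms_key_id):
    """Parameters for an encrypted S3 upload (SSE-KMS)."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
    }

params = put_params("dropoff-bucket", "exports/nightly.csv.gz",
                    "alias/dropoff")  # all names hypothetical

# With credentials configured on the data center server:
# import boto3
# with open("nightly.csv.gz", "rb") as f:
#     boto3.client("s3").put_object(Body=f, **params)
print(params)
```

Lock the bucket down to the sending server's credentials (and optionally its IP), and there is no server of yours listening on the Internet at all.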
It’s hard to fully convey the breadth and depth of options for building security into your architectures, even without additional security tools. This isn’t mere theory – we have a lot of real-world experience with different architectures creating much higher security levels than can be achieved on traditional infrastructure at any reasonable cost.
Design the Network Security Architecture
At this point you should have a well-segregated environment where effectively every application, and every environment (e.g., dev/test) for every application, is running on its own virtual network. These assets are mostly either in auto scale groups which spread them around zones and subnets for resiliency; or connect to secure cloud services such as databases, message queues, and storage. These architectures alone, in our experience, are materially more secure than your typical starting point on traditional infrastructure.
Now it’s time to layer on the additional security controls we covered earlier under Cloud Networking 101. Instead of repeating the pros and cons, here are some direct recommendations about when to use each option:
- Security groups: These should be used by default, and set to deny by default. Only open up the absolute minimum access needed. Cloud services allow you to right-size resources far more easily than on your own hardware, so we find most organizations tend to deploy far fewer services on each instance, which directly translates to opening fewer network ports per instance. A large number of cloud deployments we have evaluated use only a good base architecture and security groups for network security.
- ACLs: These mostly make sense in hybrid deployments, where you need to closely match or restrict communications between the data center and the cloud. Security groups are usually a better choice, and we only recommend falling back to ACLs or subnet-level firewalling when you cannot achieve your security objectives otherwise.
- Virtual Appliances: Whenever you need capabilities beyond basic firewalls, this is where you are likely to end up. But we find host agents often make more sense when they offer the same capabilities, because virtual appliances become costly bottlenecks which restrict your cloud architecture options. Don’t deploy one merely because you have a checkbox requirement for a particular tool – ensure it makes sense first. Over time we do see them becoming more “cloud friendly”, but when we rip into requirements on projects, we often find there are better, more cloud-appropriate ways to meet the same security objectives.
- Host security agents are often a better option than a virtual appliance because they don’t restrict virtual networking architectural options. But you need to ensure you have a way to deploy them consistently. Also, make sure you pick cloud-specific tools designed to work with features such as auto scaling. These tools are particularly useful to cover network monitoring gaps, meet IDS/IPS requirements, and satisfy all your normal host security needs.
Of course you will need some way of managing these controls, even if you stick to only capabilities and features offered by your cloud provider.
Security groups and ACLs are managed via API or your cloud provider’s console. They use the same management plane as the rest of the cloud, but this won’t necessarily integrate out of the box with the way you manage things internally. You can’t track these across multiple accounts and virtual networks unless you use a purpose-built tool or write your own code. We will talk about specific techniques for management in the next section, but make sure you plan out how to manage these controls when you design your architecture.
Platform as a Service introduces its own set of security differences. For example in some situations you still define security groups and/or ACLs for the platform (as with a cloud load balancer); but in other cases access to the platform is only via API, and may require an outbound public Internet connection, even from a private network segment. PaaS also tends to rely more on DNS rather than IP addresses, to help the cloud provider maintain flexibility. We can’t give you any hard and fast rules here. Understand what’s required to connect to the platform, and then ensure your architecture allows those connections. When you can manage security treat it like any other cluster of servers, and stick with the minimum privileges possible.
We cannot cover anything near every option for every cloud in a relatively short (believe it or not) paper like this, but for the most part once you understand these fundamentals and the core differences of working in software-defined environments, it gets much easier to adapt to new tools and technologies.
Especially once you realize that you start by integrating security into the architecture, instead of trying to layer it on after the fact.
Manage Cloud (and Hybrid) Network Security Operations
Building in security is one thing, but keeping it up to date over time is an entirely different – and harder – problem. Not only do applications and deployments change over time, but cloud providers have this pesky habit of “innovating” for “competitive advantage”. Someday things might slow down, but it definitely won’t be within the lifespan of this particular research.
Here are some suggestions on managing cloud network security for the long haul.
Organization and Staffing
It’s a good idea to make sure you have cloud experts on your network security team, people trained for the platforms you support. They don’t need to be new people, and depending on your scale this doesn’t need to be their full-time focus, but you definitely need the skills. We suggest you build your team with both security architects (to help in design) and operators (to implement and fix).
Cloud projects occur outside the constraints of your data center, including normal operations, which means you might need to make some organizational changes so security is engaged in projects. A security representative should be assigned and integrated into each cloud project. Think about how things normally work – someone starts a new project and security gets called when they need access or firewall rule changes. With cloud computing network security isn’t blocking anything (unless they need access to an on-premise resource) and entire projects can happen without security or ops ever being directly involved. You need to adapt policies and organizational structure to minimize this risk. For example, work with procurement to require a security evaluation and consultation before any new cloud account is opened.
Because so much of cloud network security relies on architecture, it isn’t just important to have a security architect on the team – it is essential they be engaged in projects early. It goes without saying that this needs to be a collaborative role. Don’t merely write up some pre-approved architectures, and then try to force everyone to work within those constraints. You’ll lose that fight before you even know it started.
We hinted at this in the section above: one of the first challenges is to find all the cloud projects, and then keep finding new ones as they pop up over time. You also need to enumerate the existing cloud network security controls. Here are a couple ways we have seen clients successfully keep tabs on cloud computing:
- If your critical assets (such as the customer database) are well locked down, you can use this to control cloud projects. If they want access to the data/application/whatever, they need to meet your security requirements.
- Procurement and Accounting are your next best options. At some point someone needs to pay the (cloud) piper, and you can work with Accounting to identify payments to cloud providers and tie them back to the teams involved. Just make sure you differentiate between those credit card charges to Amazon for office supplies, and the one to replicate your entire datacenter up into AWS.
- Hybrid connections to your data center are pretty easy to track using established process. Unless you let random employees plug in VPN routers.
- Lastly, we suppose you could try setting a policy that says “don’t cloud without telling us”. I mean, if you trust your people and all. It could work. Maybe. It’s probably good to have one anyway, to keep the auditors happy.
The next discovery challenge is to figure out how the cloud networks are architected and secured:
- First, always start with the project team. Sit down with them and perform an architecture and implementation review.
- It’s a young market, but there are some assessment tools that can help. Especially to analyze security groups and network security, and compare against best practices.
- You can use your cloud provider’s console in many cases, but most of them don’t provide a good overall network view. If you don’t have a tool to help, you can use scripts and API calls to pull down the raw configuration and manually analyze it.
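If you go the script route, the analysis side is straightforward once you have the raw configuration. Here is a sketch of a check that flags any rule open to the world on a non-web port — the sample data mimics a trimmed `describe_security_groups` response, with hypothetical group IDs:

```python
# Sketch of script-based security group analysis: feed in the raw
# configuration pulled via API (e.g. EC2 describe_security_groups) and
# flag anything exposed to 0.0.0.0/0 on a port that shouldn't be public.

def flag_open_rules(security_groups, allowed_public_ports=frozenset({80, 443})):
    findings = []
    for sg in security_groups:
        for perm in sg.get("IpPermissions", []):
            port = perm.get("FromPort")
            for rng in perm.get("IpRanges", []):
                if rng.get("CidrIp") == "0.0.0.0/0" and port not in allowed_public_ports:
                    findings.append((sg["GroupId"], port))
    return findings

# Trimmed sample shaped like the EC2 API response:
sample = [
    {"GroupId": "sg-web", "IpPermissions": [
        {"FromPort": 443, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupId": "sg-db", "IpPermissions": [
        {"FromPort": 5432, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
]
print(flag_open_rules(sample))  # only the database group is flagged
```

The same pattern — pull the configuration, analyze it offline — works across accounts and virtual networks, which is exactly the view most provider consoles don't give you.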
Integrating with Development
In the broadest sense, there are two kinds of cloud deployments: applications you build and run in the cloud (or hybrid), and core infrastructure (like file and mail servers) you transition to the cloud. Developers play the central role in the former, but they are also often involved in the latter.
The cloud is essentially software defined everything. We build and manage all kinds of cloud deployments using code. Even if you start by merely transitioning a few servers into virtual machines at a cloud provider, you will always end up defining and managing much of your environment in code.
This is an incredible opportunity for security. Instead of sitting outside the organization and trying to protect things by building external walls, we gain much greater ability to manage security using the exact same tools development and operations use to define, build, and run the infrastructure and services. Here are a few key ways to integrate with development and ensure security is integrated:
- Create a handbook of design patterns for the cloud providers you support, including security controls and general requirements. Keep adding new patterns as you work on new projects. Then make this library available to business units and development teams so they know which architectures already have general approval from security.
- A cloud security architect is essential, and this person or team should engage early with development teams to help build security into every initial design. We hate to have to say it, but their role really needs to be collaborative. Lay down the law with a bunch of requirements that interfere with the project’s execution, and you definitely won’t be invited back to the table.
- A lot of security can be automated and templated by working with development. For example, monitoring and automation code can be deployed on projects without the project team having to develop it from scratch. Even integrating third-party tools can often be managed programmatically.
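To make the pattern-handbook idea concrete: a library of approved patterns can be as simple as code that emits pre-approved configurations for project teams to consume. A minimal sketch (the names, ports, and CIDR are ours, purely illustrative):

```python
def web_tier_pattern(app_name, corporate_cidr="10.0.0.0/8"):
    """Return a pre-approved security group definition for a public web tier.

    Teams reuse this instead of inventing their own rules, so security only
    has to review the pattern once. The defaults here are illustrative only.
    """
    return {
        "GroupName": f"{app_name}-web",
        "Ingress": [
            {"Protocol": "tcp", "Port": 443, "Cidr": "0.0.0.0/0"},     # public HTTPS
            {"Protocol": "tcp", "Port": 22,  "Cidr": corporate_cidr},  # ssh from corp only
        ],
    }

print(web_tier_pattern("shop")["GroupName"])  # shop-web
```

In practice these patterns usually live as provider templates or infrastructure-as-code modules, but the principle is the same: a vetted definition teams pull from, not a policy document they interpret.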
Change is constant in cloud computing. The foundational concept is dynamic adjustment of capacity (and configuration) to meet changing demands. When we say “enforce policies” we mean that, for a given project, once you design the security you are able to keep it consistent. Just because clouds change all the time doesn’t mean it’s okay to let a developer drop all the firewalls by mistake.
The key policy enforcement difference between traditional networking and the cloud is that in traditional infrastructure security has exclusive control over firewalls and other security tools. In the cloud, anyone with sufficient authorization in the cloud platform (management plane) can make those changes. Even applications can potentially change their own infrastructure around them. That’s why you need to rely on automation to detect and manage change.
You lose the single point of control. Heck, your own developers can create entire networks from their desktops. Remember when someone occasionally plugged in their own wireless router or file server? It’s a bit like that, but more like building their own datacenter over lunch. Here are some techniques for managing these changes:
- Use access controls to limit who can change what on a given cloud project. It is typical to allow developers a lot of freedom in the dev environment, but lock down any network security changes in production, using your cloud provider’s IAM features.
- To the greatest extent possible, try to use cloud provider specific templates to define your infrastructure. These files contain a programmatic description of your environment, including complete network and network security configurations. You load them into the cloud platform and it builds the environment for you. This is a very common way to deploy cloud applications, and essential in organizations using DevOps to enforce consistency.
- When this isn’t possible you will need to use a tool or manually pull the network architecture and configuration (including security) and document them. This is your baseline.
- Then you need to automate change monitoring using a tool or the features of your cloud and/or network security provider:
- Cloud platforms are slowly adding monitoring and alerting on security changes, but these capabilities are still new and often manual. This is where cloud-specific training and staffing can really pay off, and there are also third-party tools to monitor these changes for you.
- When you use virtual appliances or host security, you don’t rely on your cloud provider, so you may be able to hook change management and policy enforcement into your existing approaches. These are security-specific tools, so unlike cloud provider features the security team will often have exclusive access and be responsible for making changes themselves.
- Did we mention automation? We will talk about it more in a minute, because it’s the only way to maintain cloud security.
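The monitoring techniques above boil down to a loop of “snapshot, diff, alert”. Here is a minimal sketch of the diff step, using our own data shapes rather than any provider’s API (each snapshot maps a group ID to a set of rule tuples):

```python
def detect_drift(baseline, current):
    """Diff two snapshots of security group rules.

    Each snapshot maps a group ID to a set of (protocol, port, cidr) tuples.
    Returns, per changed group, the rules added or removed since the baseline.
    """
    drift = {}
    for group in set(baseline) | set(current):
        before = baseline.get(group, set())
        after = current.get(group, set())
        if before != after:
            drift[group] = {"added": after - before, "removed": before - after}
    return drift

baseline = {"sg-web": {("tcp", 443, "0.0.0.0/0")}}
current  = {"sg-web": {("tcp", 443, "0.0.0.0/0"), ("tcp", 22, "0.0.0.0/0")}}
print(detect_drift(baseline, current))
# Someone opened ssh to the world since the baseline was taken.
```

Run something like this on a schedule, feed the output into your alerting (or into automation that reverts the change), and you have the skeleton of change enforcement.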
Normalizing On-Premise and Cloud Security
Organizations have a lot of security requirements for very good reasons, and need to ensure those controls are consistently applied. We all have developed a tremendous amount of network security experience over decades running our own networks, which is still relevant when moving into the cloud. The challenge is to carry over the requirements and experience, without assuming everything is the same in the cloud, or letting old patterns prevent us from taking full advantage of cloud computing.
- Start by translating whatever rule sets you have on-premise into a comparable version for the cloud. This takes a few steps:
- Figure out which rules should still apply, and what new rules you need. For example a policy to deny all ssh traffic from the Internet won’t work if that’s how you manage public cloud servers. Instead a policy that limits ssh access to your corporate CIDR block makes more sense. Another example is the common restriction that back-end servers shouldn’t have any Internet access at all, which may need updating if they need to connect to PaaS components of their own architecture.
- Then adjust your policies into enforceable rulesets. For example security groups and ACLs work differently, so how you enforce them changes. Instead of setting subnet-based policies with a ton of rules, tie security group policies to instances by function. We once encountered a client who tried to recreate very complex firewall rulesets as security groups, and exceeded their provider’s rule count limit. Instead we recommended a small set of policies for different categories of instances.
- Watch out for policies like “deny all traffic from this IP range”. Those can be very difficult to enforce using cloud-native tools, and if you really have those requirements you will likely need a network security virtual appliance or host security agent. In many projects we find you can resolve the same level of risk with smarter architectural decisions (e.g., using immutable servers, which we will describe in a moment).
- Don’t just drop in a virtual appliance because you are used to it and know how to build its rules. Always start with what your cloud provider offers, then layer on additional tools as needed.
- If you migrate existing applications to the cloud the process is a bit more complex. You need to evaluate existing security controls, discover and analyze application dependencies and network requirements, and then translate them for a cloud deployment, taking into account all the differences we have been discussing.
- Once you translate the rules, normalize operations. This means having a consistent process to deploy, manage, and monitor your network security over time. Fully covering this is beyond the scope of this research, as it depends on how you manage network security operations today. Just remember that you are trying to blend what you do now with the cloud project’s requirements, not simply enforce your existing processes under an entirely new operating model.
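The “policies by instance function, not by subnet” approach above can be sketched in a few lines. The functions, ports, and group names here are our own illustration, not a recommendation:

```python
# Approved ingress rules keyed by instance function -- a handful of reusable
# policies instead of hundreds of subnet-based firewall rules.
POLICIES = {
    "web": [("tcp", 443, "0.0.0.0/0")],   # public HTTPS
    "app": [("tcp", 8080, "sg-web")],     # only reachable from the web tier
    "db":  [("tcp", 5432, "sg-app")],     # only reachable from the app tier
}

def rules_for(function):
    """Look up the approved ingress rules for an instance's function."""
    try:
        return POLICIES[function]
    except KeyError:
        raise ValueError(f"no approved policy for function {function!r}")

print(rules_for("db"))  # [('tcp', 5432, 'sg-app')]
```

Note how each tier references the security group of the tier in front of it rather than IP ranges; that is the kind of cloud-native construct a straight lift-and-shift of firewall rules misses.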
We hate to say it, but we will – this is a process of transition. We find customers who start on a project-by-project basis are more successful, because they can learn as they go, and build up a repository of knowledge and experience.
Automation and Immutable Network Security
Cloud security automation isn’t merely fodder for another paper – it’s an entirely new body of knowledge we are all only just beginning to build.
Any organization that moves to the cloud in any significant way learns quickly that automation is the only way to survive. How else can you manage multiple copies of a single project in different environments – never mind dozens or hundreds of different projects, each running in their own sets of cloud accounts across multiple providers?
Then, keep all those projects compliant with regulatory requirements and your internal security policies.
Yeah, it’s like that.
Fortunately this isn’t an insoluble problem. Every day we see more examples of companies successfully using the cloud at scale, and staying secure and compliant. Today they largely build their own libraries of tools and scripts to continually monitor and enforce changes. We also see some emerging tools to help with this management, and expect to see many more in the near future.
One emerging concept tied to automation is immutable security, and we have used it ourselves.
One of the core problems in security is managing change. We design something, build in security, deploy it, validate that security, and lock everything down. This inevitably drifts as it’s patched, updated, improved, and otherwise modified. Immutable security leverages automation, DevOps techniques, and inherent cloud characteristics to break this cycle. To be honest, it’s really just DevOps applied to security, and all the principles are in wide use already.
For example an immutable server is one that is never logged into or changed in production. If you think back to auto scaling, we deploy servers based on standard images. Changing one of those servers after deployment doesn’t make sense, because those changes wouldn’t be in the image, so new versions launched by auto scaling wouldn’t include them. Instead DevOps creates a new image with all the changes, then alters the auto scale group rules to deploy new instances based on the new image, and can optionally prune off the older versions.
In other words no more patching, and no more logging into servers. You take a new known-good state, and completely replace what is in production.
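The replace-don’t-patch cycle reduces to surprisingly plain logic. This is a toy model of an auto scale group (real deployments use the provider’s auto scaling APIs, health checks, and gradual rollout; everything here is our own naming):

```python
def rotate_image(group, new_image):
    """Immutable-style update: point the auto scale group at a new image and
    replace every running instance, instead of patching servers in place."""
    group["image"] = new_image
    # Launch replacements from the new image, then retire the old instances.
    count = len(group["instances"])
    group["instances"] = [{"image": new_image, "id": f"i-new-{n}"}
                          for n in range(count)]
    return group

asg = {"image": "ami-old",
       "instances": [{"image": "ami-old", "id": "i-0"},
                     {"image": "ami-old", "id": "i-1"}]}
rotate_image(asg, "ami-new")
print(all(i["image"] == "ami-new" for i in asg["instances"]))  # True
```

Nothing in production was ever logged into or modified; the old state was simply discarded and replaced with a new known-good one.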
Think about how this applies to network security. We can build templates to automatically deploy entire environments at our cloud providers. We can write network security policies, then override any changes automatically, even across multiple cloud accounts. This pushes the security effort earlier into design and development, and enables much more consistent enforcement in operations. And we use the exact same toolchain as Development and Operations to deploy our security controls, rather than trying to build our own on the side and overlay enforcement afterwards.
This might seem like an aside, but these automation principles are the cornerstone of real-world cloud security, especially at scale. This is a capability we never have in traditional infrastructure, where we cannot simply stamp out new environments automatically, and need to hand-configure everything.
Posted at Monday 28th September 2015 6:57 pm