Resilient Cloud Network Architectures: Fundamentals
As much as we like to believe we have evolved as a species, people continue to be scared of things they don’t understand. Yes, many organizations have embraced the cloud whole hog and are rushing headlong into the cloud age. But it’s a big world, and millions of others remain paralyzed – not really understanding cloud computing, and taking the general approach that it can’t be secure because, well, it just can’t. Or it’s too new. Or for some other unfounded and incorrect reason. Kind of like when folks insisted that the Earth was the center of the universe.
This blog series builds on our recent Pragmatic Security for Cloud and Hybrid Networks paper, focusing on cloud-native network architectures that provide security and availability in ways you cannot accomplish in a traditional data center. This evolution will take place over the next decade, and organizations will need to support hybrid networks for some time. But for those ready, willing, and able to step forward into the future today, the cloud is waiting to break the traditional rules of how technology has been developed, deployed, scaled, and managed.
We have been aggressive in proselytizing our belief that the move towards the cloud is the single biggest disruption in technology for the next few decades. Yes, even bigger than the move from mainframes to client/server (we’re old – we know). So our Resilient Cloud Network Architectures series will provide the basics of cloud network security, with a few design patterns to illustrate.
We would like to thank Resilient Systems for provisionally agreeing to license the content in this paper. As always, we’ll build the content using our Totally Transparent Research methodology, meaning we will post everything to the blog first, and allow you (our readers) to poke holes in it. Once it has been sufficiently prodded, we will publish a paper for your reference.
Defining Resilient
If we bust out the old dictionary to define resilient, we get:
able to become strong, healthy, or successful again after something bad happens
able to return to an original shape after being pulled, stretched, pressed, bent, etc.
In the context of computing, you want to deploy technology that can not only become strong again, but also resist attack in the first place. Recoverability is also key: if something bad happens you want to return service quickly, if it causes an outage at all. For network architecture we always fall back on the cloud computing credo: Design for failure. A resilient network architecture both makes it harder to compromise an application and minimizes downtime in case of an issue.
Key aspects of cloud computing which provide security and availability include:
Network Isolation: Using the inherent ability of the cloud to restrict connections (via software firewalls, which are called security groups and described below), you can build a network architecture that fully isolates the different tiers of an application stack. That prevents a compromise in one application (or database) from leaking or attacking information stored in another.
Account Isolation: Another important feature of the cloud is the ability to use multiple accounts per application. Each of your different environments (Dev, Test, Production, Logging, etc.) can use different accounts, which provides valuable isolation because you cannot access cloud infrastructure across accounts without explicit authorization.
Immutability: An immutable server is one that is never logged into or changed in production. In cloud-native DevOps environments servers are deployed in auto-scale groups based on standard images. This prevents human error and configuration drift from creating exploitation paths. You take a new known-good state, and completely replace older images in production. No more patching and no more logging into servers.
Regions: You could build multiple data centers around the world to provide redundancy. But that’s not a cheap option, and rarely feasible. To do the same thing in the cloud, you basically just replicate an entire environment in a different region via an API call or a couple of clicks in a cloud console. Regions are available all over the world, with multiple availability zones within each, to further minimize single points of failure. You can load balance between zones and regions, leveraging auto-scaling to keep your infrastructure running the same images in real time. We will explain this design pattern in our next post.
The key takeaway is that cloud computing provides architectural options which are either impossible or economically infeasible in a traditional data center, to provide greater protection and availability. In this series we will describe the fundamentals of cloud networking for context, and then dig into design patterns which provide both security and availability – which we define as resilience.
Understanding Cloud Networks
The key difference between a network in your data center and one in the cloud is that cloud customers never access the ‘real’ network or hardware. Cloud computing uses virtual networks to abstract the networks you see and manage from the (invisible) underlying physical resources. When your server gets IP address 10.0.1.12, that IP address does not exist on routing hardware – it’s a virtual address on a virtual network. Everything is handled in software.
Cloud networking varies across cloud providers, but differs from traditional networks in visibility, management, and velocity of change. You cannot tap into a cloud provider’s virtual network, so you’ll need to think differently to monitor your networks. Additionally, cloud networks are typically managed via scripts or programs making Application Programming Interface (API) calls, rather than via a graphical console or command line.
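To make the network isolation idea concrete, here is a minimal sketch in plain Python of how security groups can isolate application tiers when the network is defined as data and evaluated in software. It is not any provider's API: the names Rule, SecurityGroup, and is_allowed, the three tiers, the port numbers, and the sample source address 203.0.113.7 are all assumptions made up for illustration. The only behavior it models is default deny with explicit allow rules.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical inbound allow rule: one port, from a group or a CIDR range."""
    port: int
    source_group: Optional[str] = None  # name of another security group
    source_cidr: Optional[str] = None   # e.g. "0.0.0.0/0" for the whole Internet


@dataclass
class SecurityGroup:
    """Hypothetical security group: a named list of inbound allow rules."""
    name: str
    inbound: List[Rule] = field(default_factory=list)


def is_allowed(dest: SecurityGroup, port: int,
               src_group: Optional[str] = None,
               src_ip: Optional[str] = None) -> bool:
    """Default deny: traffic passes only if some inbound rule explicitly matches."""
    for rule in dest.inbound:
        if rule.port != port:
            continue
        if rule.source_group is not None and rule.source_group == src_group:
            return True
        if rule.source_cidr is not None and src_ip is not None:
            if ip_address(src_ip) in ip_network(rule.source_cidr):
                return True
    return False


# Illustrative three-tier stack; the ports are assumptions, not from the text.
web = SecurityGroup("web", [Rule(port=443, source_cidr="0.0.0.0/0")])
app = SecurityGroup("app", [Rule(port=8080, source_group="web")])
db = SecurityGroup("db", [Rule(port=5432, source_group="app")])

print(is_allowed(web, 443, src_ip="203.0.113.7"))  # True: public HTTPS to the web tier
print(is_allowed(app, 8080, src_group="web"))      # True: web tier may call the app tier
print(is_allowed(db, 5432, src_group="web"))       # False: web tier cannot reach the database
```

In this sketch a compromised web server has no path to the database, because the database group only names the app group as a source. A real deployment would create the equivalent rules through the cloud provider's own API or console.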
That enables developers to do pretty much anything, including standing up networks and reconfiguring them – instantly via code. Finally, cloud networks change much faster than physical networks because cloud environments change faster, including spinning up and shutting down servers via automation. So traditional workflows to govern network change don’t really map to your cloud network. It can be confusing because cloud networks look like traditional networks, with their own routing tables and firewalls. But looks are deceiving – although familiar constructs have been carried over, there are fundamental differences.
Cloud Network Architectures
In order to choose the right solution to address your requirements, you need to understand the types of cloud network