Container Security 2018: Runtime Security Controls
After the focus on tools and processes in previous sections, we can now focus on containers in production systems. This includes which images are moved into production repositories, selecting and running containers, and the security of underlying host systems. Runtime Security The Control Plane: Our first order of business is ensuring the security of the control plane: tools for managing host operating systems, the scheduler, the container client, engine(s), the repository, and any additional deployment tools. As we advised for container build environment security, we recommend limiting access to specific administrative accounts: one with responsibility for operating and orchestrating containers, and another for system administration (including patching and configuration management). On-premise we recommend network and physical segregation, and for cloud and virtual systems we prefer logical segregation. The good news is that several third-party tools offer full identity and access management, LDAP/AD integration, and token-based SSO (i.e.: SAML) across systems. Resource Usage Analysis: Many readers are familiar with this for performance, but it can also offer insight into basic code security. Does the container allow port 22 (administration) access? Does the container try to update itself? What external systems and utilities does it depend upon? Any external resource usage is a potential attack point for attackers, so it’s good hygiene to limit ingress and egress points. To manage the scope of what containers can access, third-party tools can monitor runtime access to environment resources – both inside and outside the container. Usage analysis is basically automated review of resource requirements. This is useful in a number of ways – especially for firms moving from a monolithic architecture to microservices. Analysis can help developers understand which references they can remove from their code, and help operations narrow down roles and access privileges. Selecting the Right Image: We recommend establishing a trusted image repository and ensuring that your production environment can only pull containers from that trusted source. Ad hoc container management makes it entirely too easy for engineers to bypass security controls, so we recommend establishing trusted central repositories for production images. We also recommend scripting deployment to avoid manual intervention, and to ensure the latest certified container is always selected. This means checking application signatures in your scripts before putting containers into production, avoiding manual verification overhead or delay. Trusted repository and registry services can help by rejecting containers which are not properly signed. Fortunately many options are available, so pick one you like. Keep in mind that if you build many containers each day, a manual process will quickly break down. It is okay to have more than one image repository – if you are running across multiple cloud environments there are advantages to leveraging the native registry in each one. Immutable Images: Developers often leave shell access to container images so they can log into containers running in production. Their motivation is often debugging and on-the-fly code changes, both bad for consistency and security. Immutable containers – which do not allow ssh connections – prevent interactive real-time manipulation. They force developers to fix code in the development pipeline, and remove a principal attack path. Attackers routinely scan for ssh access to take over containers, and leverage them to attack underlying hosts and other containers. We strongly suggest use of immutable containers without ‘port 22’ access, and making sure that all container changes take place (with logging) in the build process, rather than in production. Input Validation: At startup containers accept parameters, configuration files, credentials, JSON, and scripts. In more aggressive scenarios ‘agile’ teams shove new code segments into containers as input variables, making existing containers behave in fun new ways. Validate that all input data is suitable and complies with policy, either manually or using a third-party security tool. You must ensure that each container receives the correct user and group IDs to map to the assigned view at the host layer. This can prevent someone from forcing a container to misbehave, or simply prevent dumb developer mistakes. Blast Radius: The cloud enables you to run different containers under different cloud user accounts, limiting the resources available to any given container. If an account or container set is compromised, the same cloud service restrictions which prevent tenants from interfering with each other will limit damage between your different accounts and projects. For more information see our reference material on limiting blast radius with user accounts. Container Group Segmentation: One of the principal benefits of container management systems is help scaling tasks across pools of shared servers. Each management platform offers a modular architecture, with scaling performed on node/minion/slave sub-groups, which in turn include a set of containers. Each node forms its own logical subnet, limiting network access between sets of containers. This segregation limits ‘blast radius’ by restricting which resources any container can access. It is up to application architects and security teams to leverage this construct to improve security. You can enforce this with network policies on the container manager service, or network security controls provided by your cloud vendor. Over and above this orchestration manager feature, third-party container security tools – whether running as an agent inside containers, or as part of underlying operation systems – can provide a type of logical network segmentation which further limits network connections between groups of containers. All together this offers fine-grained isolation of containers and container groups from each another. Platform Security Until recently, when someone talked about container security, they were really talking about how to secure the hypervisor and underlying operating system. So most articles and presentations on container security focuses on this single – admittedly important – facet. But we believe runtime security needs to encompass more than that, and we break the challenge into three areas: host OS hardening, isolation of namespaces, and segregation of workloads by trust level. Host OS/Kernel Hardening: Hardening is how we protect a host operating system from attacks and misuse. It typically starts with selection of a hardened variant of the operating system you will use. But while these versions