Monitoring the Hybrid Cloud: Solution Architectures
By Adrian Lane
The good old days: Monitoring employees on company-owned PCs, accessing the company data center across corporate networks. You knew where everything was, and who was using it. And the company owned it all, so you could pretty much dictate where and how you performed security monitoring. With cloud and mobile? Not so much.
To take advantage of cloud computing you will need to embrace new approaches to collecting event data if you hope to continue security monitoring. The sources, and the information they contain, are different. Equally important – although initially more subtle – is how to deploy monitoring services. Deployment architectures are critical to deploying and scaling any Security Operations Center (SOC): they define how you manage security monitoring infrastructure and what event data you can capture. Furthermore, how you deploy the SOC platform impacts performance and data management. There are a variety of architectures, intended to meet the use cases outlined in our last post. So now we can focus on alternative ways to deploy collectors in the cloud, and the possibility of using a cloud security gateway as a monitoring point. Then we will take a look at the basic cloud deployment models for a SOC architected to monitor the hybrid cloud, focusing on how to manage pools of event data coming from distributed environments – both inside and outside the organization.
Data collection strategies
API: Automated, elastic, and self-service are all intrinsic characteristics of cloud computing. Most cloud service providers offer a management dashboard for convenience (and unsophisticated users), but advanced cloud features are typically exposed only via scripts and programs. Application Programming Interfaces (APIs) are the primary interfaces to cloud services; they are essential for configuring a cloud environment, configuring and activating monitoring, and gathering data. These APIs can be called from any program or service, running either on-premise or within a cloud environment. So APIs are the cloud equivalent to platform agents, providing many of the same capabilities in the cloud where a ‘platform’ becomes a virtualized abstraction and a traditional agent wouldn’t really work. API calls return data in a variety of ways, including the familiar syslog format, JSON files, and even various formats specific to different cloud providers. Regardless, aggregating data returned by API calls is a key new source of information for monitoring hybrid clouds.
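To make the format problem concrete, here is a minimal Python sketch of a collector that folds a syslog-style line and a JSON record into one common event shape. Everything in it is an assumption for illustration: the common schema, the function names, and the JSON field names (eventTime, eventSource, eventName) are hypothetical and do not correspond to any particular provider's API.

```python
import json
import re

# Hypothetical pattern for a classic BSD-style syslog line:
# "<PRI>Mon DD HH:MM:SS host message"
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d+)>(?P<timestamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<message>.*)$"
)


def normalize_syslog(line):
    """Map a syslog-style line onto the (hypothetical) common schema."""
    match = SYSLOG_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized syslog line: {line!r}")
    pri = int(match.group("pri"))
    return {
        "source_format": "syslog",
        "timestamp": match.group("timestamp"),
        "origin": match.group("host"),
        "severity": pri % 8,  # syslog PRI = facility * 8 + severity
        "message": match.group("message"),
    }


def normalize_json(payload):
    """Map a JSON record onto the same schema; field names are illustrative."""
    record = json.loads(payload)
    return {
        "source_format": "json",
        "timestamp": record.get("eventTime"),
        "origin": record.get("eventSource"),
        "severity": None,  # this illustrative record carries no severity
        "message": record.get("eventName"),
    }


def normalize(raw):
    """Dispatch on the raw text: JSON objects start with '{'."""
    if raw.lstrip().startswith("{"):
        return normalize_json(raw)
    return normalize_syslog(raw)
```

In practice each provider-specific format would need its own small mapping function like these, which is where most of the connector work tends to go.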
Cloud Gateways: Hybrid cloud monitoring often hinges on a gateway – typically an appliance deployed at the ‘edge’ of the network to collect events. Leveraging the existing infrastructure for data management and SOC interfaces, this approach requires all cloud usage to first be authenticated to the cloud gateway as a choke point; after inspection, traffic is passed on to the appropriate cloud service. The resulting events are then passed to event collection services, comparable to on-premise infrastructure. This enables tight integration with existing security operations and monitoring platforms, and the initial authentication allows all resource requests to be tied to specific user credentials.
Cloud 2 Cloud: A newer option is to have one cloud service – in this case a monitoring service – act as a proxy to another cloud service, tapping into user requests and parsing out relevant data, metadata, and application calls. Similar to using a managed service for email security, traffic passes through a cloud provider that parses incoming requests before they are forwarded to internal or cloud applications. This model can incorporate mobile devices and events – which otherwise never touch on-premise networks – by passing their traffic through an inspection point before it reaches cloud service providers such as Salesforce and Microsoft Azure. This enables the SOC to provide real-time event analysis and alert on policy violations, with collected events forwarded to the SOC (either on-premise or in the cloud) for storage. In some cases, by proxying traffic these services can also add security, such as checks against on-premise identity stores to ensure employees are still employed before granting access to cloud resources.
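The identity check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory set stands in for an on-premise identity store, and the names inspect_request, Decision, and the event fields are hypothetical rather than taken from any product.

```python
from dataclasses import dataclass

# Stand-in for an on-premise identity store (hypothetical data).
ACTIVE_EMPLOYEES = {"alice", "bob"}


@dataclass
class Decision:
    allowed: bool  # whether the proxy should forward the request
    event: dict    # record to forward to the SOC either way


def inspect_request(user, service, action, directory=ACTIVE_EMPLOYEES):
    """Check the user against the directory and build a SOC event."""
    allowed = user in directory
    event = {
        "user": user,
        "service": service,
        "action": action,
        "outcome": "forwarded" if allowed else "denied",
    }
    return Decision(allowed=allowed, event=event)
```

Note that an event is produced whether or not the request is allowed, so the SOC sees denied attempts as well as forwarded traffic.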
App Telemetry: Like cloud providers, mobile carriers, mobile OS providers, and handset manufacturers don’t provide much in the way of logging capabilities. Mobile platforms are intended to be secured from outsiders and not leak information between apps. But we are beginning to see mobile apps developed specifically for corporate use, as well as company-specific mobile app containers on devices, which send basic telemetry back to the corporate customer to provide visibility into device activity. Some telemetry feeds include basic data about the device, such as jailbreak detection, while others append user ‘fingerprints’ to authorize requests for remote application access. These capabilities are compiled into individual mobile apps or embedded into app containers which protect corporate apps and data. This capability is very new, and will eventually help to detect fraud and misuse on mobile endpoints.
Agents: You are highly unlikely to deploy agents in SaaS or PaaS clouds, but there are cases where agents have an important role to play in hybrid clouds, private clouds, and Infrastructure as a Service (IaaS) clouds – generally when you control the infrastructure. Because network architecture is virtualized in most clouds, agents offer a way to collect events and configuration information when traditional visibility and taps are unavailable. Agents also call out to cloud APIs to check application deployment.
Supplementary Services: Cloud SOCs often rely on third-party intelligence feeds to correlate hostile acts or actors attacking other customers, helping you identify and block attempts to abuse your systems. These are almost always cloud-based services that provide intelligence, malware analysis, or policies based on analysis of data from a broad range of sites, in order to detect unwanted behavior patterns. This type of threat intelligence supplements hybrid SOCs and helps organizations detect potential attacks faster, but it is not itself a SOC platform. You can refer to our other threat intelligence papers to dig deeper into this topic.
Deployment strategies
The following are all common ways to deploy event collectors, monitoring systems, and operations centers to support security monitoring:
On-premise: We will forgo a detailed explanation of on-premise SOCs because most of you are already familiar with this model, and we have written extensively on this topic. In general, the infrastructure that provides the ability to monitor a hybrid cloud remains the same. The most significant change is the inclusion of remote cloud events, mobile events, and configuration data, along with monitoring policies designed to digest remote events. Be prepared for significant change – cloud and mobile event data formats vary, and typically include slightly different information from one source to the next. Remember all your work a decade ago to get connectors to properly parse security event data? You will be doing that again until a standard format emerges. You will also need a new round of tuning detection rules – activities that are acceptable for internal users and systems can be malicious when they come from remote locations or cloud services.
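As a small illustration of why rules need retuning, the Python sketch below treats the same action differently depending on where it originates. The rule, the event fields, the "bulk_export" action name, and the internal address ranges are all hypothetical; a real deployment would use its own network inventory and policy.

```python
import ipaddress

# Hypothetical internal address ranges; substitute your own.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_internal(address):
    """Return True if the address falls inside a known internal range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in INTERNAL_NETWORKS)


def evaluate(event):
    """Return 'alert' for a bulk export from outside, otherwise 'ok'."""
    if event["action"] == "bulk_export" and not is_internal(event["source_ip"]):
        return "alert"
    return "ok"
```

The point of the sketch is only that source context becomes part of the rule; a rule written for an all-internal environment would have had no reason to check it.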
Hybrid: A hybrid SOC is any deployment model where some analysis is done in-house and some is performed remotely in the cloud. The remote portion could be offloaded to a monitoring service vendor, as described under “Cloud 2 Cloud” above, or perhaps preliminary “Level One” analysis is performed by the managed services team, with advanced analysis and forensics handled by internal resources. Here you continue to run and operate the existing SIEM with all its event collectors, and send a subset of events to an external provider for the heavy lifting of event analysis and forensics. Alternatively you could use an external provider to directly aggregate and analyze remote/cloud activity, and send filtered alerts to the on-premise SOC. A hybrid SOC increases agility for addressing new challenges, while leveraging in-house investments and expertise, though of course there is a cost for maintaining both internal and external monitoring capabilities.
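The first hybrid variant, where the existing SIEM keeps everything and a subset goes to an external provider, might look roughly like this in Python. The function name, the "source" field, and the choice to forward cloud and mobile events are assumptions made for the sketch, not a prescribed split.

```python
def route_events(events, forward_sources=frozenset({"cloud", "mobile"})):
    """Split events into (local, forwarded) lists.

    The local list receives every event, mirroring an on-premise SIEM
    that keeps running as before. The forwarded list holds only the
    subset destined for the external provider.
    """
    local, forwarded = [], []
    for event in events:
        local.append(event)
        if event.get("source") in forward_sources:
            forwarded.append(event)
    return local, forwarded
```

Which events to forward is a policy decision; selecting by source is just one simple option, and selecting by severity or event type would follow the same pattern.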
Exclusively Cloud: It is still rare but definitely possible to push all data from both on-premise and cloud services up to a third party for full remote SOC services. This entails the remote SOC providing all data management, analysis, policy development, and retention. On-premise events are fed through a gateway to the cloud service; the gateway provides some filtering, compression, and security to protect event data.
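The filtering and compression steps of such a gateway can be sketched with the Python standard library. This is a simplified illustration: the severity threshold, function names, and newline-delimited JSON layout are assumptions, and the security step (for example, sending the batch over an encrypted channel) is deliberately left out.

```python
import gzip
import json


def package_events(events, min_severity=3):
    """Drop low-severity events, then gzip the rest as JSON lines."""
    kept = [e for e in events if e.get("severity", 0) >= min_severity]
    payload = "\n".join(json.dumps(e, sort_keys=True) for e in kept)
    return gzip.compress(payload.encode("utf-8"))


def unpack_events(blob):
    """Reverse of package_events, as the receiving service might do."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]
```

A round trip such as unpack_events(package_events(batch)) returns only the events at or above the threshold, which is a convenient way to check what the remote SOC would actually receive.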
Third Party Management: Many large enterprises run security operations in-house, with a team of employees monitoring systems for attack and forensically analyzing suspicious alerts. But not every firm has a sophisticated and capable security team in-house to do the difficult and expensive work of writing policies and performing security analysis. So it is attractive (and increasingly common) to offload the difficult analysis problems to others, keeping only a portion of this role in-house. You have some flexibility in how to engage with a service provider. One approach is to have them take control of your on-premise monitoring systems. Alternatively, the third party can supplement what you have by handling just external cloud monitoring. Finally, in some cases the entire SOC is pushed to the third party for operations and management.
Our next post will sketch out what you really need to know to decide how to proceed: the Gotchas. We will run through the problem areas and tradeoffs you need to consider before selecting from the data collection and deployment options summarized above. We will dig into problems of scalability, cost, data security, privacy, and even some data governance issues that can make deciding between solutions more difficult.