Monday, June 06, 2016

Building a Vendor (IT) Risk Management Program [New Paper]

By Mike Rothman

In Building a Vendor (IT) Risk Management Program, we explain why you can no longer ignore the risk presented by third-party vendors and other business partners, including managing an expanded attack surface and new regulations demanding effective management of vendor risk. We then offer ideas for how to build a structured and systematic program to assess vendor (IT) risk, and take action when necessary.


We would like to thank BitSight Technologies for licensing the content in this paper. Our unique Totally Transparent Research model allows us to perform objective and useful research without requiring paywalls or other such nonsense, which make it hard for the people who need our research to get it. A day doesn’t go by where we aren’t thankful to all the companies who license our research.

You can get the paper from the landing page in our research library.

—Mike Rothman

Friday, June 03, 2016

Evolving Encryption Key Management Best Practices: Part 2

By Rich

This is the second in a four-part series on evolving encryption key management best practices. The first post is available here. This research is also posted at GitHub for public review and feedback. My thanks to Hewlett Packard Enterprise for licensing this research, in accordance with our strict Totally Transparent Research policy, which enables us to release our independent and objective research for free.

Best Practices

If there is one thread tying together all the current trends influencing data centers and how we build applications, it’s distribution. We have greater demand for encryption in more locations in our application stacks – which now span physical environments, virtual environments, and increasing barriers even within our traditional environments.

Some of the best practices we will highlight have long been familiar to anyone responsible for enterprise encryption. Separation of duties, key rotation, and meeting compliance requirements have been on the checklist for a long time. Others are familiar, but have new importance thanks to changes occurring in data centers. Providing key management as a service, and dispersing and integrating into required architectures, aren’t technically new, but they are in much greater demand than before. Then there are practices which might not have made the list before, such as supporting APIs and distributed architectures (potentially spanning physical and virtual appliances).

As you will see, the name of the game is consolidation for consistency and control, paired with distribution to support diverse encryption needs, architectures, and project requirements.

But before we jump into recommendations, keep our focus in mind. This research is for enterprise data centers, including virtualization and cloud computing. There are plenty of other encryption use cases out there which don’t necessarily require everything we discuss, although you can likely still pick up a few good ideas.

Build a key management service

Supporting multiple projects with different needs can easily result in a bunch of key management silos using different tools and technologies, which become difficult to support. One for application data, another for databases, another for backup tapes, another for SANs, and possibly even multiple deployments for the same functions, as individual teams pick and choose their own preferred technologies. This is especially true in the project-based agile world of the cloud, microservices, and containers. There’s nothing inherently wrong with these silos, assuming they are all properly managed, but that is unfortunately rare. And overlapping technologies often increase costs.

Overall we tend to recommend building centralized security services to support the organization, and this definitely applies to encryption. Let a smaller team of security and product pros manage what they are best at and support everyone else, rather than merely issuing policy requirements that slow down projects or drive them underground.

For this to work the central service needs to be agile and responsive, ideally with internal Service Level Agreements to keep everyone accountable. Projects request encryption support; the team managing the central service determines the best way to integrate, and to meet security and compliance requirements; then they provide access and technical support to make it happen.

This enables you to consolidate and better manage key management tools, while maintaining security and compliance requirements such as audit and separation of duties. Whatever tool(s) you select clearly need to support your various distributed requirements. The last thing you want to do is centralize but establish processes, tools, and requirements that interfere with projects meeting their own goals.

And don’t focus so exclusively on new projects and technologies that you forget about what’s already in place. Our advice isn’t merely for projects based on microservices, containers, and the cloud – it applies equally to backup tapes and SAN encryption.

Centralize but disperse, and support distributed needs

Once you establish a centralized service you need to support distributed access. There are two primary approaches, but we only recommend one for most organizations:

  • Allow access from anywhere. In this model you position the key manager in a location accessible from wherever it might be needed. Typically organizations select this option when they want to only maintain a single key manager (or cluster). It was common in traditional data centers, but isn’t well-suited for the kinds of situations we increasingly see today.
  • Distributed architecture. In this model you maintain a core “root of trust” key manager (which can, again, be a cluster), but then you position distributed key managers which tie back to the central service. These can be a mix of physical and virtual appliances or servers. Typically they only hold the keys for the local application, device, etc. that needs them (especially when using virtual appliances or software on a shared service). Rather than connecting back to complete every key operation, the local key manager handles those while synchronizing keys and configuration back to the central root of trust.

Why distribute key managers which still need a connection back home? Because they enable you to support greater local administrative control and meet local performance requirements. This architecture also keeps applications and services up and running in case of a network outage or other problem accessing the central service. This model provides an excellent balance between security and performance.
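
To make the distributed model more concrete, below is a minimal Python sketch of a local key manager that serves keys from its own cache and periodically synchronizes with a central root of trust. The endpoint paths, field names, and sync behavior are illustrative assumptions, not any particular product’s API.

```python
# Hypothetical sketch: a local key manager that serves keys from its own cache
# and periodically synchronizes with a central "root of trust" service.
# Endpoints and field names are assumptions for illustration only.
import time
import requests


class LocalKeyManager:
    def __init__(self, root_url, api_token, sync_interval=300):
        self.root_url = root_url          # central root-of-trust service
        self.api_token = api_token
        self.sync_interval = sync_interval
        self._cache = {}                  # key_id -> key material held locally
        self._last_sync = 0.0

    def get_key(self, key_id):
        """Serve key material locally, falling back to the root service."""
        self._maybe_sync()
        if key_id not in self._cache:
            resp = requests.get(
                f"{self.root_url}/v1/keys/{key_id}",
                headers={"Authorization": f"Bearer {self.api_token}"},
                timeout=5,
            )
            resp.raise_for_status()
            self._cache[key_id] = resp.json()["key_material"]
        return self._cache[key_id]

    def _maybe_sync(self):
        """Report locally held keys and pick up rotations from the core."""
        if time.time() - self._last_sync < self.sync_interval:
            return
        requests.post(
            f"{self.root_url}/v1/sync",
            json={"key_ids": list(self._cache)},
            headers={"Authorization": f"Bearer {self.api_token}"},
            timeout=5,
        )
        self._last_sync = time.time()
```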

For example, you could support a virtual appliance in a cloud project, physical appliances in backup data centers, and back up the keys used within your cloud provider’s built-in encryption service.

This way you can also support different technologies for distributed projects. The local key manager doesn’t necessarily need to be the exact same product as the central one, so long as they can communicate and both meet your security and compliance requirements. We have seen architectures where the central service is a cluster of Hardware Security Modules (appliances with key management features) supporting a distributed set of HSMs, virtual appliances, and even custom software.

The biggest potential obstacle is providing safe, secure access back to the core. Architecturally you can usually manage this with some bastion systems to support key exchange, without opening the core to the Internet. There may still be use cases where you cannot tie everything together, but that should be your last option.

Be flexible: use the right tool for the right job

Building on our previous recommendation, you don’t need to force every project to use a single tool. One of the great things about key management is that modern systems support a number of standards for intercommunication. And when you get down to it, an encryption key is merely a chunk of text – not even a very large one.

With encryption systems, keys and the encryption engine don’t need to be the same product. Even your remote key manager doesn’t need to be the same as the central service if you need something different for that particular project.

We have seen large encryption projects fail because they tried to shoehorn everything into a single monolithic stack. You can increase your chances for success by allowing some flexibility in remote tools, so long as they meet your security requirements. This is especially true for the encryption engines that perform actual crypto operations.

Provide APIs, SDKs, and toolkits

Even off-the-shelf encryption engines sometimes ship with less than ideal defaults, and can easily be used incorrectly. Building a key management service isn’t merely creating a central key manager – you also need to provide hooks to support projects, along with processes and guidance to ensure they are able to get up and running quickly and securely.

  • Application Programming Interfaces: Most key management tools already support APIs, and this should be a selection requirement. Make sure you support RESTful APIs, which are particularly ubiquitous in the cloud and containers. SOAP APIs are considered burdensome these days.
  • Software Development Kits: SDKs are pre-built code modules that allow rapid integration into custom applications. Provide SDKs for common programming languages compatible with your key management service/products. If possible you can even pre-configure them to meet your encryption requirements and integrate with your service.
  • Toolkits: A toolkit includes all the technical pieces a team needs to get started. It can include SDKs, preconfigured software agents, configuration files, and anything else a project might need to integrate encryption into anything from a new application to an old tape backup system (a sketch of what such an SDK-style helper might look like follows this list).
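
As a rough illustration of the SDK idea above, here is what an internal helper might look like: it wraps a hypothetical RESTful key management API, records a purpose code for audit, and hands back an encryption engine with approved defaults so project teams don’t pick their own. The service URL, request fields, and response format are assumptions for illustration only.

```python
# Hypothetical SDK helper: wraps a made-up RESTful key management API and
# returns a ready-to-use encryption engine with approved defaults.
import logging
import requests
from cryptography.fernet import Fernet

KEY_SERVICE = "https://keyservice.example.internal/v1"   # assumed endpoint
log = logging.getLogger("crypto-sdk")


def get_encryption_engine(app_id, purpose, api_token):
    """Request a data key for app_id, log the purpose code for audit,
    and return an engine configured with approved defaults."""
    resp = requests.post(
        f"{KEY_SERVICE}/data-keys",
        json={"application": app_id, "purpose": purpose},
        headers={"Authorization": f"Bearer {api_token}"},
        timeout=5,
    )
    resp.raise_for_status()
    log.info("data key issued app=%s purpose=%s", app_id, purpose)
    # The service is assumed to return a Fernet-compatible key.
    return Fernet(resp.json()["data_key"])


# Usage inside an application team's code (illustrative):
# engine = get_encryption_engine("orders-api", "pci-cardholder-data", token)
# ciphertext = engine.encrypt(b"4111111111111111")
```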

Provide templates and recommendations, not just standards and requirements

All too often security sends out requirements, but fails to provide specific instructions for meeting those requirements. One of the advantages of standardization around a smaller set of tools is that you can provide detailed recommendations, instructions, and templates to satisfy requirements.

The more detail you can provide the better. We recommend literally creating instructional documents for how to use all approved tools, likely with screenshots, to meet encryption needs and integrate with your key management service. Make them easily available, perhaps through code repositories to better support application developers. On the operations side, include them not only for programming and APIs, but for software agents and integration into supported storage repositories and backup systems.

If a project comes up which doesn’t fit any existing toolkit or recommendations, build them with that project team and add the new guidance to your central repository. This dramatically speeds up encryption initiatives for existing and new platforms.

Meet core security requirements

So far we have focused on newer requirements to meet evolving data center architectures, the impact of the cloud, and new application design patterns; but all the old key management practices still apply:

  • Enforce separation of duties: Implement multiple levels of administrators. Ideally require dual authorities for operations directly impacting key security and other major administrative functions.
  • Support key rotation: Ideally key rotation shouldn’t create downtime. This typically requires both support in the key manager and configuration within encryption engines and agents; a brief rotation sketch follows this list.
  • Enable usage logs for audit, including purpose codes: Logs may be required for compliance, but are also key for security. Purpose codes tell you why a key was requested, not just by who or when.
  • Support standards: Whatever you use for key management must support both major encryption standards and key exchange/management standards. Don’t rely on fully proprietary systems that will overly limit your choices.
  • Understand the role of FIPS and its different flavors, and ensure you meet your requirements: FIPS 140-2 is the most commonly accepted standard for cryptographic modules and systems. Many products advertise FIPS compliance (which is often a requirement for other compliance, such as PCI). But FIPS is a graded standard with different levels ranging from a software module, to plugin cards, to a fully tamper-resistant dedicated appliance. Understand your FIPS requirements, and if you evaluate a “FIPS certified” ‘appliance’, don’t assume the entire appliance is certified – it might be only the software, not the whole system. You may not always need the highest level of assurance, but start by understanding your requirements, and then ensure your tool actually meets them.
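
To illustrate the key rotation practice above, here is a minimal sketch using the open source cryptography library’s MultiFernet. The new key is listed first so it encrypts everything going forward, older keys continue to decrypt existing data (so readers see no downtime), and stored ciphertext can be re-wrapped opportunistically with rotate().

```python
# Minimal sketch of zero-downtime key rotation with the `cryptography` library.
from cryptography.fernet import Fernet, MultiFernet

old_key = Fernet(Fernet.generate_key())   # key currently protecting data
new_key = Fernet(Fernet.generate_key())   # key introduced by the rotation

# Encrypts with the first (new) key, decrypts with any key in the list.
engine = MultiFernet([new_key, old_key])

token = old_key.encrypt(b"customer record")          # data written pre-rotation
assert engine.decrypt(token) == b"customer record"   # readers keep working

rotated = engine.rotate(token)                        # re-wrap under the new key
assert new_key.decrypt(rotated) == b"customer record"
```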

There are many more technical best practices beyond the scope of this research, but the core advice that might differ from what you have seen in the past is:

  • Provide key management as a service to meet diverse encryption needs.
  • Be able to support distributed architectures and a range of use cases.
  • Be flexible on tool choice, then provide technical components and clear guidance on how to properly use tools and integrate them into your key management program.
  • Don’t neglect core security requirements.

In our next section we will start looking at specific use cases, some of which we have already hinted at.

—Rich

Summary: June 3, 2016

By Adrian Lane

Adrian here.

Unlike my business partners, who have been logging thousands of air miles, speaking at conferences and with clients around the country, I have been at home. And the mildest spring in Phoenix’s recorded history has been a blessing, as we’re 45 days past the point 100F days typically start. Bike rides. Hiking. Running. That is, when I get a chance to sneak outdoors and enjoy it. With our pivot there is even more writing and research going on than normal, which I wasn’t sure was possible. You will begin to see the results of this work within the next couple weeks, and we look forward to putting a fresh face on our business. That launch will coincide with us posting lots more hands-on advice for cloud security and migrations.

And as a heads-up, I will be talking big data security over at SC Magazine on the 20th. I’ll tweet out a link at @AdrianLane next week if you’re interested.

You can subscribe to only the Friday Summary.

Top Posts for the Week

Tool of the Week

“Server-less computing? What do you mean?” Rich and I were discussing cloud deployment options with one of the smartest engineering managers I know, and he was totally unaware of serverless cloud computing architectures. If he was unaware of this capability, lots of other people probably are as well. So this week’s Tool of the Week section will discuss not a single tool, but instead a functional paradigm offered by multiple cloud vendors. What are they? Google’s GCP page best captures the idea: essentially a “lightweight, event-based, asynchronous solution that allows you to create small, single-purpose functions that respond to Cloud events without the need to manage a server or a runtime environment.” What Google does not mention there is that these functions tend to be very fast, and you can run multiple copies in parallel to scale capacity.

It really embodies microservices. You can construct an entire application from these functions. For example, take a stream of data and run it through a series of functions to process it. It could be audio or image files, or real-time event data inspection, transformation, enrichment, comparison… or any combination you can think of. The best part? There is no server. There is no OS to set up. No CPU or disk capacity to specify. No configuration files. No network ports to manage. It’s simply a logical function running out there in the ‘ether’ of your public cloud.

Google calls its version on GCP Cloud Functions. Amazon’s version on AWS is called [Lambda functions](http://docs.aws.amazon.com/lambda/latest/dg/welcome.html). Microsoft calls the version on Azure simply Functions. Check out their API documentation – they all work slightly differently, and some have specific storage requirements to act as endpoints, but the concept is the same. And the pricing for these services is pretty low – with Lambda, for example, the first million requests are free, and Amazon charges 20 cents per million requests after that.
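
To show how little scaffolding these functions need, here is a minimal AWS Lambda handler in Python. It assumes an S3 “object created” event as the trigger; note there is no server, OS, or capacity configuration anywhere in the code.

```python
# Minimal AWS Lambda handler. The event shape below assumes an S3
# "object created" trigger; adapt for whatever event source you wire up.
import json


def lambda_handler(event, context):
    # Each record describes one uploaded object; inspect/transform it here.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(f"processing s3://{bucket}/{key}")
    return {"statusCode": 200, "body": json.dumps("ok")}
```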

This feature is one of the many reasons we tell companies to reconsider their application architectures when moving to cloud services. We’ll post some tidbits on security for these services in the future. For now, check them out!

Securosis Blog Posts this Week

Training and Events

—Adrian Lane

Thursday, June 02, 2016

Incident Response in the Cloud Age: In Action

By Mike Rothman

When we do a process-centric research project, it works best to wrap up with a scenario to illuminate the concepts we discuss through the series, and make things a bit more tangible.

In this situation imagine you work for a mid-sized retailer which uses a mixture of in-house technology and SaaS, and has recently moved a key warehousing system to an IaaS provider as part of rebuilding the application for cloud computing. You have a modest security team of 10, which is not enough, but a bit more than many of your peers. Senior management understands why security is important (to a point) and gives you decent leeway, especially regarding the new IaaS application. In fact you were consulted during the IaaS architecture phase and provided some guidance (with some help from your friends at Securosis) on building a resilient cloud network architecture, and how to secure the cloud control plane. You also had an opportunity to integrate some orchestration and automation technology into the new cloud technology stack.

The Trigger

You have your team on fairly high alert, because a number of your competitors have recently been targeted by an organized crime ring, which gained a foothold in their environments and proceeded to steal a ton of information about customers, pricing, and merchandising strategies. This isn’t your first rodeo, so you know that when there is smoke there is usually fire, and you decide to task one of your more talented security admins with a little proactive hunting in your environment. Just to make sure nothing bad is going on.

The admin starts poking around, searching internal security data with some of the more recent malware samples found in the attacks on the other retailers. The samples were provided by your industry’s ISAC (Information Sharing and Analysis Center). The admin gets a hit on one of the samples, confirming your concern. You have an active adversary on your network. So now you need to engage your incident response process.
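
A simplified sketch of that hunting step: compare the ISAC-provided indicator hashes against file hashes your endpoint telemetry has already collected. The file names and CSV layout are assumptions for illustration.

```python
# Compare ISAC indicator hashes against hashes exported from endpoint telemetry.
# File names and formats are illustrative assumptions.
import csv

with open("isac_malware_hashes.txt") as f:
    known_bad = {line.strip().lower() for line in f if line.strip()}

hits = []
with open("endpoint_file_hashes.csv") as f:        # e.g. columns: host,path,sha256
    for row in csv.DictReader(f):
        if row["sha256"].lower() in known_bad:
            hits.append((row["host"], row["path"], row["sha256"]))

for host, path, sha in hits:
    print(f"IOC match on {host}: {path} ({sha})")
```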

Job 1: Initial Triage

Once you know there is a situation you assemble the response team. There aren’t that many of you, and half the team needs to pay attention to ongoing operational tasks, because taking down systems wouldn’t make you popular with senior management or investors. You also don’t want to jump the gun until you know what you’re dealing with, so you inform the senior team of the situation, but don’t take any systems down. Yet.

The adversary is active on your internal network, so they most likely entered via phishing or another social engineering attack. Searches found indications of the malware on 5 devices, so you take those devices off the network immediately. Not shut down, but put on a separate network with Internet access to avoid tipping off the adversary to their discovery.

Then you check your network forensics tool, looking for indications that data has been leaking. There are a few suspicious file transfers, but luckily you integrated your firewall egress filtering capability with your forensics tool. So once the firewall showed anomalous traffic being sent to known bad sites (via a threat intelligence integration on the firewall), you automatically started capturing network traffic from the devices which triggered the alert. Automation sure is easier than doing everything manually.
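
As a rough sketch of that triggered capture, the snippet below starts a time-boxed tcpdump for the internal hosts named in an egress alert. The alert format, capture interface, and storage path are assumptions about your environment.

```python
# Triggered capture sketch: an egress alert kicks off a time-boxed tcpdump
# for the offending internal hosts. Alert format and paths are assumptions.
import shlex
import subprocess
from datetime import datetime


def start_capture(alert, interface="eth0", duration=900):
    """alert example: {"id": "evt-123", "internal_hosts": ["10.1.4.22"]}"""
    host_filter = " or ".join(f"host {h}" for h in alert["internal_hosts"])
    pcap = f"/var/captures/{alert['id']}_{datetime.utcnow():%Y%m%d%H%M%S}.pcap"
    cmd = f"timeout {duration} tcpdump -i {interface} -w {pcap} {host_filter}"
    return subprocess.Popen(shlex.split(cmd)), pcap
```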

As part of your initial triage you got endpoint telemetry alerting you to issues, and network forensics data for a clue to what’s leaking. This is enough to know you not only have an active adversary, but that more than likely you lost data. So you fire up your case management system to structure your investigation and store all the artifacts of your investigation.

Your team is tasked with specific responsibilities, and sent on their way to get things done. You make the trek to the executive floor to keep senior management updated on the incident.

Check the Cloud

The attack seems to have started on your internal network, but you don’t want to take chances, and you need to make sure the new cloud-based application isn’t at risk. A quick check of the cloud console shows strange activity on one of your instances. A device within the presentation layer of the cloud stack was flagged by your IaaS provider’s monitoring system because there was an unauthorized change on that specific instance. It looks like the time you spent setting up that configuration monitoring service was well spent.

Security was involved in architecting the cloud stack, so you are in good shape. The application was built to be isolated. Even though it appears the presentation layer has been compromised, adversaries shouldn’t be able to reach anything of value. And the clean-up has already happened. Once the IaaS monitoring system threw an alert, that instance was taken offline and put into a special security group accessible only by investigators. A forensic server was spun up, and some additional analysis was performed. Orchestration and automation facilitating incident response again.

The presentation layer has large variances in how much traffic it needs to handle, so it was built using auto-scaling technology and immutable servers. Once the (potentially) compromised instance was removed from the group, another instance with a clean configuration was spun up to share the workload. But it’s not clear whether this attack is related to the other incident, so you take the information about the cloud attack, pull it down, and feed it into your case management system. But the reality is that this attack, even if related, doesn’t present a danger at this point, so it’s put to the side while you focus on the internal attack and probable exfiltration.

Building the Timeline

Now that you have completed initial triage, it’s time to dig into the attack and start building a timeline of what happened. You start by looking at the compromised endpoints and network metadata to see what the adversaries did. From examining endpoint telemetry you deduced that Patient Zero was a contractor on the Human Resources (HR) team. This individual was tasked with looking at resumes submitted to the main HR email account, and initial qualification screening for an open position. The resume was a compromised Word file using a pretty old Windows 7 attack. It turns out the contractor was using their own machine, which hadn’t been patched and was vulnerable. You can’t be that irritated with the contractor – it was their job to open those files. The malware rooted the device, connected up to a botnet, and then installed a Remote Access Trojan (RAT) to allow the adversary to take control of the device and start a systematic attack against the rest of your infrastructure.

You ponder how your organization’s BYOD policy enables contractors to use their own machines. The operational process failure was in not inspecting the machine on connection to the network; you didn’t make sure it was patched, or running an authorized configuration. That’s something to scrutinize as part of the post-mortem.

Once the adversary had presence on your network, they proceeded to compromise another 4 devices, ultimately ending up on both the CFO’s and the VP of Merchandising’s devices. Network forensic metadata shows how they moved laterally within the network, taking advantage of weak segmentation between internal networks. There are only so many hours in the day, and the focus had been on making sure the perimeter was strong and monitoring ingress traffic.

Once you know the CFO’s and VP of Merchandising’s devices were compromised, you can clearly see exfiltration in network metadata. A quick comparison of file sizes in data captured once the egress filter triggered shows that they probably got the latest quarterly board report, as well as a package of merchandising comps and plans for an exclusive launch with a very hot new fashion company. It was a bit of a surprise that the adversary didn’t bother encrypting the stolen data, but evidently they bet that a mid-sized retailer wouldn’t have sophisticated DLP or egress content filtering. Maybe they just didn’t care whether anyone found out what was exfiltrated after the fact, or perhaps they were in a hurry and wanted the data more than they wanted to remain undiscovered.

You pat yourself on the back, once, that your mature security program included an egress filter that triggered a full packet capture of outbound traffic from all the compromised devices. So you know exactly what was taken, when, and where it went. That will be useful later, when talking to law enforcement and possibly prosecuting at some point, but right now that’s little consolation.

Cleaning up the Mess

Now that you have an incident timeline, it’s time to clean up and return your environment to a good state. The first step is to clean up the affected machines. Executives are cranky because you decided to reimage their machines, but your adversary worked to maintain persistence on compromised devices in other attacks, so prudence demands you wipe them.

The information on this incident will need to be aggregated, then packaged up for law enforcement and the general counsel, in preparation for the unavoidable public disclosure. You take another note that the team should consider using a case management system to track incident activity, provide a place to store case artifacts, and ensure proper chain of custody. Given your smaller team, that should help smooth your next incident response.

Finally, this incident was discovered by a savvy admin hunting across your networks. So to complete the active part of this investigation, you task the same admin with hunting back through the environment to make sure this attack has been fully eradicated, and no similar attacks are in process. Given the size of your team, it’s a significant choice to devote resources to hunting, but given the results, this is an activity you will need to perform on a monthly cadence.

Closing the Loop

To finalize this incident, you hold a post-mortem with the extended team, including representatives from the general counsel’s office. The threat intelligence being used needs to be revisited and scrutinized, because the adversary connected to a botnet but wasn’t detected. And the rules on your egress filters have been tightened because if the exfiltrated data had been encrypted, your response would have been much more complicated. The post-mortem also provided a great opportunity to reinforce the importance of having security involved in application architecture, given how well the new IaaS application stood up under attack.

Another reminder that sometimes a skilled admin who can follow their instincts is the best defense. Tools in place helped accelerate response and root cause identification, and made remediation more effective. But Incident Response in the Cloud Age involves both people and technology, along with internal and external data, to ensure effective and efficient investigation and successful remediation.

—Mike Rothman

Tuesday, May 31, 2016

Understanding and Selecting RASP: Integration

By Adrian Lane

This post will offer examples for how to integrate RASP into a development pipeline. We’ll cover both how RASP fits into the technology stack, and the development processes used to deliver applications. We will close this post with a detailed discussion of how RASP differs from other security technologies, along with the advantages and tradeoffs of each approach.

As we mentioned in our introduction, our research into DevOps produced many questions on how RASP worked, and whether it is an effective security technology. The questions came from non-traditional buyers of security products: application developers and product managers. Their teams, by and large, were running Agile development processes. The majority were leveraging automation to provide Continuous Integration – essentially rebuilding and retesting the application repeatedly and automatically as new code was checked in. Some had gone as far as Continuous Deployment (CD) and DevOps. To address this development-centric perspective, we offer the diagram below to illustrate a modern Continuous Deployment / DevOps application build environment. Consider each arrow a script automating some portion of source code control, building, packaging, testing, or deployment of an application.

[Diagram: CI Pipeline (Continuous Deployment / DevOps build environment)]

Security tools that fit this model are actively being sought by development teams. They need granular API access to functions, quick production of test results, and delivery of status back to supporting services.

Application Integration

  • Installation: As we mentioned back in the technology overview, RASP products differ in how they embed within applications. They all offer APIs to script configuration and runtime policies, but how and where they fit in differ slightly between products. Servlet filters, plugins, and library replacement are performed as the application stack is assembled; these approaches augment an application or application ‘stack’ to perform detection and blocking. Virtualization and JVM replacement approaches augment runtime environments, modifying the subsystems that run your application to handle monitoring and detection. In all of these cases, whether on-premise or as a cloud service, the process of installing RASP is pretty much identical to the build or deployment sequence you currently use (a simplified illustration of the filter model follows this list).
  • Rules & Policies: We found the majority of RASP offerings include canned rules to detect or block most known attacks. Typically this blacklist of attack profiles maps closely to the OWASP Top Ten application vulnerability classes. Protection against common variants of standard attacks, such as SQL injection and session mis-management, is included. Once these rules are installed they are immediately enforced. You can enable or disable individual rules as you see fit. Some vendors offer specific packages for critical attacks, mapped to specific CVEs such as Heartbleed. Bundles for specific threats, rather than by generic attack classes, help security and risk teams demonstrate policy compliance, and make it easier to understand which threats have been addressed. But when shopping for RASP technologies you need to evaluate the provided rules carefully. There are many ways to attack a site with SQL injection, and many to detect and block such attacks, so you need to verify the included rules cover most of the known attack variants you are concerned with. You will also want to verify that you can augment or add rules as you see fit – rule management is a challenge for most security products, and RASP is no different.
  • Learning the application: Not all RASP technologies can learn how an application behaves, or offer whitelisting of application behaviors. Those that do vary greatly in how they function. Some behave like their WAF cousins, and need time to learn each application – whether by watching normal traffic over time, or by generating their own traffic to ‘crawl’ each application in a non-production environment. Some function similarly to white-box scanners, using application source to learn.
  • Coverage capabilities: During our research we found uneven RASP coverage of common platforms. Some started with Java or .Net, and are iterating to cover Python, Ruby, Node.js, and others. Your search for RASP technologies may be strongly influenced by available platform support. We find that more and more, applications are built as collections of microservices across distributed architectures. Application developers mix and match languages, choosing what works best in different scenarios. If your application is built on Java you’ll have no trouble finding RASP technology to meet your needs. But for mixed environments you will need to carefully evaluate each product’s platform coverage.
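
To illustrate the “filter wrapped around the application” integration model from the Installation bullet above, here is a toy WSGI middleware in Python. Real RASP products hook much deeper into the runtime and use far richer detection logic; this sketch only shows where such a layer sits relative to the application code.

```python
# Toy illustration of the "filter wrapped around the application" model.
# Real RASP hooks far deeper into the runtime; this only shows placement.
import re
from urllib.parse import unquote_plus

SQLI_PATTERNS = [
    re.compile(r"(?i)\bunion\b.+\bselect\b"),
    re.compile(r"(?i)\bor\b\s+1\s*=\s*1"),
    re.compile(r"(?i);\s*drop\s+table"),
]


class ToyRaspMiddleware:
    def __init__(self, app):
        self.app = app                    # the wrapped WSGI application

    def __call__(self, environ, start_response):
        query = unquote_plus(environ.get("QUERY_STRING", ""))
        if any(p.search(query) for p in SQLI_PATTERNS):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"request blocked"]
        return self.app(environ, start_response)

# app = ToyRaspMiddleware(app)   # wrap the application at deployment time
```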

Development Process Integration

Software development teams leverage many different tools to promote security within their overarching application development and delivery processes. The graphic below illustrates the major phases teams go through. The callouts map the common types of security tests to specific phases within Agile, CI, and DevOps frameworks. Keep in mind that it is still early days for automated deployment and DevOps. Many security tools were built before rapid and automated deployment existed or was well known. Older products are typically too slow, some cannot focus their tests on new code, and others do not offer API support. So orchestration of security tools – basically what works where – is far from settled territory. The time each type of test takes to run, and the type of result it returns, drive where it fits best into the phases below.

[Diagram: Security Tool Chain mapped to development phases]

RASP is designed to be bundled into applications, so it is part of the application delivery process. RASP offers two distinct approaches to help tackle application security. The first is in the pre-release or pre-deployment phase, while the second is in production. Either way, deployment looks very similar. But usage can vary considerably depending on which is chosen.

  • Pre-release testing: This is exactly what it sounds like: RASP is used when the application is fully constructed and going through final tests prior to being launched. Here RASP can be deployed in several ways. It can be deployed to monitor only, using application tests and instrumenting runtime behavior to learn how to protect the application. Alternatively RASP can monitor while security tests are invoked in an attempt to break the application, with RASP performing security analysis and transmitting its results. Development and Testing teams can learn whether RASP detected the tested attacks. Finally, RASP can be deployed in full blocking mode to see whether security tests were detected and blocked, and how they impacted the user experience. This provides an opportunity to change application code or augment the RASP rules before the application goes into production.
  • Production testing: Once an application is placed in a production environment, either before actual customers are using it (using Blue-Green deployment) or afterwards, RASP can be configured to block malicious application requests. Regardless of how the RASP tool works (whether via embedded runtime libraries, servlet filters, in-memory execution monitoring, or virtualized code paths), it protects applications by detecting attacks in live runtime behavior. This model essentially provides execution path scanning, monitoring all user requests and parameters. Unlike technologies which block requests at the network or web proxy layer, RASP inspects requests at the application layer, which means it has full access to the application’s inner workings. Working at the API layer provides better visibility to determine whether a request is malicious, and more focused blocking capabilities than external security products.
  • Runtime protection: Ultimately RASP is not just for testing, but for full runtime protection and blocking of attacks.

Regardless of where you deploy RASP, you need to test to ensure it is delivering on its promise. We advocate an ongoing testing process to ensure your policies are sound, and that you ultimately block what you need to block. Of course you can use other scanners to probe an application to ensure RASP is working prior to deployment, and other tools (such as Havij and SQLmap) to automate testing, but that’s only half the story. For full confidence that your apps are protected, we still recommend actual humans banging away at your applications. Penetration testing, at least periodically, helps verify your defenses are effective.
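
A hedged example of that ongoing verification testing: send a known attack payload to a staging deployment and confirm the protection layer rejects it rather than passing it through. The URL, parameter, and expected status codes are assumptions about your setup.

```python
# Verification sketch: confirm an in-app protection layer blocks a known
# SQL injection probe in staging. URL and status codes are assumptions.
import requests

STAGING_URL = "https://staging.example.internal/search"


def test_sqli_payload_is_blocked():
    resp = requests.get(
        STAGING_URL,
        params={"q": "' OR 1=1 --"},
        timeout=10,
    )
    # Expect the protection layer to reject the request rather than serve it.
    assert resp.status_code in (400, 403), (
        f"SQL injection probe was not blocked (got {resp.status_code})"
    )
```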

To WAF or not to WAF

Why did the market develop this brand-new security technology? Especially when existing technologies – most notably Web Application Firewalls (WAF) – already provided similar functions. Both block attacks on web-facing applications. They are both focused on known attack vectors, and include blacklists of attack patterns. Some optionally offer whitelists of known (approved) application functions. And both can ‘learn’ appropriate application behaviors. In fact most enterprises, especially those which must comply with PCI-DSS, have already bought and deployed WAF. So why spend time and money on a new tool?

WAF management teams speak of the difficulty of maintaining ‘positive’ security rules, and penetration testers grouse about how most WAFs are misconfigured, but neither was the primary driver of the search for an alternative which produced RASP. Development teams were looking for something different. Most stated their basic requirement was for something to work within their development pipeline. WAF’s lack of APIs for automatic setup, the time needed to learn application behavior, and most importantly its inability to pinpoint vulnerable code modules, were all cited as reasons WAF failed to satisfy developers. Granted, these requests came from more ‘Agile’ teams, more often building new applications than maintaining existing platforms. Still, we heard consistently that RASP meets a market demand unsatisfied by other application security technologies.

It is important to recognize that these technologies can be complementary, not necessarily competitive. There is absolutely no reason you can’t run RASP alongside your existing WAF. Some organizations continue to use cloud-based WAF as front-line protection, while embedding RASP into applications. Some use WAF to provide “threat intelligence”, DoS protection, and network security, while using RASP to fine-tune application security. Still others double down with overlapping security functions, much the way many organizations use layered anti-spam filters, accepting redundancy for broader coverage or unique benefits from each product. WAF platforms have a good ten-year head start, with broader coverage and very mature platforms, so some firms are loath to throw away WAF until RASP is fully proven.

Tomorrow we will close out this series with a brief buyers guide. We look forward to your comments!

—Adrian Lane

Firestarter: Where to start?

By Rich

It’s long past the day we need to convince you that cloud and DevOps is a thing. We all know it’s happening, but one of the biggest questions we get is “Where do I start?” In this episode we scratch the surface of how to start approaching the problem when you don’t get to join a hot unicorn startup and build everything from scratch with an infinite budget behind you.

Watch or listen:


—Rich

Friday, May 27, 2016

Incident Response in the Cloud Age: Addressing the Skills Gap

By Mike Rothman

As we described in our last post, incident response in the Cloud Age requires an evolved response process, in light of data sources you didn’t have before (including external threat intelligence) and the ability to analyze data in ways that weren’t possible just a few years ago. You also need to factor in that access to specific telemetry, especially around the network, is limited because you don’t have control over the networks anymore.

But even with these advances, the security industry needs to face the intractable problem that comes up in pretty much every discussion we have with senior security types. It’s people, folks. There simply are not enough skilled investigators (forensicators) to meet demand. And those who exist tend to hop from job to job, maximizing their earning potential. As they should – given free markets and all.

But this creates huge problems if you are running a security team and need to build and maintain a staff of analysts, hunters, and responders. So where can you find folks in a seller’s market? You have a few choices:

  1. Develop them: You certainly can take high-potential security professionals and teach them the art of incident response. Or given the skills gap, lower-potential security professionals. Sigh. This involves a significant investment in training, and a lot of the skills needed will be acquired in the crucible of an active incident.
  2. Buy them: If you have neither the time nor the inclination to develop your own team of forensicators, you can get your checkbook out. You’ll need to compete for these folks in an environment where consulting firms can keep them highly utilized, so they are willing to pay up for talent to keep their billable hours clicking along. And large enterprises can break their typical salary bands to get the talent they need as well. This approach is not cheap.
  3. Rent them: Speaking of consulting firms, you can also find forensicators by entering into an agreement with a firm that provides incident response services. Which seems to be every security company nowadays. It’s that free market thing again. This will obviously be the most expensive, because you are paying for the overhead of partners to do a bait and switch and send a newly minted SANS-certified resource to deal with your incident. OK, maybe that’s a little facetious. But only a bit.

The reality is that you’ll need all of the above to fully staff your team. Developing a team is your best long-term option, but understand that some of those folks will inevitably head to greener pastures right after you train them up. If you need to stand up an initial team you’ll need to buy your way in and then grow. And it’s a good idea to have a retainer in place with an external response firm to supplement your resources during significant incidents.

Changing the Game

It doesn’t make a lot of sense to play a game you know you aren’t going to win. Finding enough specialized resources to sufficiently staff your team probably fits into that category. So you need to change the game. Thinking about incident response differently covers a lot, including:

  • Narrow focus: As discussed earlier, you can leverage threat intelligence and security analytics to more effectively prioritize efforts when responding to incidents. Retrospectively searching for indicators of malicious activity and analyzing captured data to track anomalous activity enables you to focus efforts on those devices or networks where you can be pretty sure there are active adversaries.
  • On the job training: In all likelihood your folks are not yet ready to perform very sophisticated malware analysis and response, so they will need to learn on the job. Be patient with your I/R n00bs and know they’ll improve, likely pretty quickly. Mostly because they will have plenty of practice – incidents happen daily nowadays.
  • Streamline the process: To do things differently you need to optimize your response processes as well. That means not fully doing some things that, given more time and resources, you might. You need to make sure your team doesn’t get bogged down doing things that aren’t absolutely necessary, so it can triage and respond to as many incidents as possible.
  • Automate: Finally you can (and will need to) automate the I/R process where possible. With advancing orchestration and integration options as applications move to the cloud, it is becoming more feasible to apply large doses of automation to remove a lot of the manual (and resource-intensive) activities from the hands of your valuable team members, letting machines do more of the heavy lifting.

Streamline and Automate

You can’t do everything. You don’t have enough time or people. Looking at the process map in our last post, the top half is about gathering and aggregating information, which is largely not a human-dependent function. You can procure threat intelligence data and integrate that directly into your security monitoring platform, which is already collecting and aggregating internal security data.

In terms of initial triage and sizing up incidents, this can be automated to a degree as well. We mentioned triggered capture, so when an alert triggers you can automatically start collecting data from potentially impacted devices and networks. This information can be packaged up and then compared to known indicators of malicious or misuse activities (both internal and external), and against your internal baselines.

At that point you can route the package of information to a responder, who can start to take action. The next step is to quarantine devices and take forensic images, which can be largely automated as well. As more and more infrastructure moves into the cloud, software-defined networks and infrastructure can automatically take devices in question out of the application flow and quarantine them. Forensic images can be taken automatically with an API call, and added to your investigation artifacts. If you don’t have fully virtualized infrastructure, a number of automation and orchestration tools are appearing to provide an integration layer for these kinds of functions.
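
As a sketch of what that automation might look like on AWS using boto3, the function below moves a suspect instance into an investigation-only security group and snapshots its attached volumes as forensic artifacts. The security group ID and case identifier are assumptions; adapt the flow to your own runbook.

```python
# Cloud quarantine sketch with boto3: isolate a suspect instance and snapshot
# its volumes for forensics. Group/instance/case IDs are illustrative.
import boto3

ec2 = boto3.client("ec2")


def quarantine_and_image(instance_id, quarantine_sg_id, case_id):
    # Replace all security groups so only investigators can reach the instance.
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[quarantine_sg_id])

    # Snapshot every attached volume as a forensic artifact for the case.
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]
    snapshot_ids = []
    for mapping in instance.get("BlockDeviceMappings", []):
        volume_id = mapping["Ebs"]["VolumeId"]
        snap = ec2.create_snapshot(
            VolumeId=volume_id,
            Description=f"IR case {case_id} forensic image of {instance_id}",
        )
        snapshot_ids.append(snap["SnapshotId"])
    return snapshot_ids
```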

When it comes time to do damage assessment, this can largely be streamlined due to new technologies as well. As mentioned above, retrospective searching allows you to search your environment for known bad malware samples and behaviors consistent with the incident being investigated. That will provide clues to the timeline and extent of compromise. Compare this to the olden days (like a year ago, ha!) when you had to wait for the device to start doing something suspicious, and hope the right folks were looking at the console when bad behavior began.

In a cloud-native environment (where the application was built specifically to run in the cloud), there really isn’t any mitigation or cleanup required, at least on the application stack. The instances taken out of the application for investigation are replaced with known-good instances that have not been compromised. The application remains up and unaffected by the attack. Attacks on endpoints still require either cleanup or reimaging, although endpoint isolation technologies make it quicker and easier to get devices back up and running.

In terms of watching for the same attack moving forward, you can feed the indicators you found during the investigation back into your security analytics engine and watch for them as things happen, rather than after the attack. Your detection capabilities should improve with each investigation, thanks to this positive feedback loop.

Magnify Impact

It also makes sense to invest in an incident response management system/platform that will structure activities in a way that standardizes your response process. These response workflows make sure the right stuff happens during every response, because the system requires it. Remember, you are dealing with folks who aren’t as experienced, so having a set of tasks for them to undertake, especially when dealing with an active adversary, can ensure a full and thorough investigation happens. This kind of structure and process automation can magnify the impact of limited staff with limited skills.

It may seem harsh, but successful I/R in the Cloud Age requires you to think differently. You need to take inexperienced responders, and make them more effective and efficient. Using a scale of 1-10, you should look for people ranked 4-6. Then with training, a structured I/R process, and generous automation, you may be able to have them function at a level of 7-8, which is a huge difference in effectiveness.

—Mike Rothman

Wednesday, May 25, 2016

Incite 5/25/2016: Transitions

By Mike Rothman

I have always been pretty transparent about my life in the Incite. I figured maybe readers could learn something that helps them in life through my trials and tribulations, and if not perhaps they’d be entertained a bit. I also write Incites as a journal of sorts for myself. A couple times a year I search through some old Incites and remember where I was at that point in my life. There really wasn’t much I wouldn’t share, but I wondered if at some point I’d find a line I wouldn’t cross in writing about my life publicly.

It turns out I did find that line. I have alluded to significant changes in my life a few times over the past two years, but I never really got into specifics. I just couldn’t. It was too painful. Too raw. But time heals, and over the past weekend I realized it was time to tell more of the story. Mostly because I could see that my kids had gone through the transition along with me, and we are all doing great.


So in a nutshell, my marriage ended. There aren’t a lot of decisions that are harder to make, especially for someone like me. I lived through a pretty contentious divorce as a child and I didn’t want that for me, my former wife, or our kids. So I focused for the past three years on treating her with dignity and kindness, being present for my kids, and keeping the long-term future of those I care about most at the forefront of every action I took.

I’m happy to say my children are thriving. The first few months after we told them of the imminent split were tough. There were lots of tears and many questions I couldn’t or wouldn’t answer. But they came to outward acceptance quickly. They helped me pick out my new home, and embraced the time they had with me. They didn’t act out with me, their Mom, or their friends, didn’t get into trouble, and did very well in school. They have ridden through a difficult situation well and they still love me. Which was all I could have hoped for.

Holidays are hard. They were with their Mom for Memorial Day and Thanksgiving last year, which was weird for me. Thankfully I have some very special people in my life who welcomed me and let me celebrate those holidays with them, so I wasn’t alone. We’ve adapted and are starting to form new rituals in our new life. We took a great trip to Florida for winter break last December, and last summer we started a new tradition, an annual summer beach trip to the Jersey Shore to spend Father’s Day with my Dad.

To be clear, this isn’t what they wanted. But it’s what happened, and they have made the best of it. They accepted my decision and accept me as I am right now. I’ve found a new love, who has helped me be the best version of myself, and brought happiness and fulfillment to my life that I didn’t know was possible. My kids have welcomed her and her children into our lives. They say kids adapt to their situation, and I’m happy to say mine have. I believe you see what people are made of during difficult times. A lot of those times happen to be inevitable transitions in life. Based on how they have handled this transition, my kids are incredible, and I couldn’t be more proud of them.

And I’m proud of myself for navigating the last couple years the best I could. With kindness and grace.

–Mike

Photo credit: “Transitions” from Arjan Almekinders


Security is changing. So is Securosis. Check out Rich’s post on how we are evolving our business.

We’ve published this year’s Securosis Guide to the RSA Conference. It’s our take on the key themes of this year’s conference (which is really a proxy for the industry), as well as deep dives on cloud security, threat protection, and data security. And there is a ton of meme goodness… Check out the blog post or download the guide directly (PDF).

The fine folks at the RSA Conference posted the talk Jennifer Minella and I did on mindfulness at the 2014 conference. You can check it out on YouTube. Take an hour. Your emails, alerts, and Twitter timeline will be there when you get back.


Securosis Firestarter

Have you checked out our video podcast? Rich, Adrian, and Mike get into a Google Hangout and… hang out. We talk a bit about security as well. We try to keep these to 15 minutes or less, and usually fail.


Heavy Research

We are back at work on a variety of blog series, so here is a list of the research currently underway. Remember you can get our Heavy Feed via RSS, with our content in all its unabridged glory. And you can get all our research papers too.

Evolving Encryption Key Management Best Practices

Incident Response in the Cloud Age

Understanding and Selecting RASP

Maximizing WAF Value

Resilient Cloud Network Architectures

Shadow Devices

Building a Vendor IT Risk Management Program

Recently Published Papers


Incite 4 U

  1. Embrace and Extend: AWS is this generation’s version of Windows. Sure, there are other cloud providers like Microsoft Azure and Google, but right now AWS is king of the hill. And there are some similarities to how Microsoft behaved in the early 90s. Do you remember when Microsoft would roll new functions into Windows, and a handful of third-party utility vendors would go away? Yeah, that’s AWS today, but faster. Amazon rolls out new features and services monthly, and inevitably those new capabilities step on third parties. How did folks compete with Microsoft back in the day? Rich reminded me a few months ago that these vendors needed their own version of embrace and extend. They have to understand that the gorilla is going to do what they do, so to survive smaller vendors must continually push functionality forward and extend their offerings. Ben Kepes at NetworkWorld asked whether a third-party vendor was really necessary, and then that vendor approached him to tell him their plans to stay relevant. Maybe the small fry makes it. Maybe they don’t. But that dynamic is driving the public cloud. Innovation happens within third parties, and at some point, if it’s a universal requirement, cloud providers will either buy the technology or build it themselves. That’s the way it has always been, and it won’t be different this time. – MR

  2. Signatures, exposed: Dan Guido offers a scathing review of the 2016 Verizon Data Breach Report (DBIR here). It’s a bit long but worth the read, as he walks through flaws in the report. In a nutshell, it’s a classic case of overweighting the data you have: signatures. And ignoring data you don’t have: actual exploit vectors! Worse, some of the vulnerability data is based on false positives, which further skews the results. As in years past, we think the DBIR does provide some valuable insights, and we still encourage you to look through the data and come to your own conclusions. In the meantime, the security PR hype machine will be taking sound bites and trumpeting them as the reason you must hurry up and buy their product, because the DBIR says so! – AL

  3. Jacking up your vendors… You realize that buying security products, and any products for that matter, is a game, right? Those who play the game can get better pricing or additional services or both. Vendors don’t like you to know about the game, but experienced procurement people do. Those who have been on the other side of a slick salesperson learned the game the hard way. Back in my Security Incite days I wrote a companion piece to the Pragmatic CSO about 10 years ago, focused on how to buy security products. Jeremiah Grossman, now that he doesn’t work for a vendor any more, has given you his perspective on how to play the game. His tips are on the money, although I look at multi-year deals as the absolute last tactic to use for price concessions. With the rate of change in security, the last thing I want to do is lock into a multi-year deal on technology that is certain to change. The other issue is being a customer reference. You can dangle that, and maybe the vendor will believe you. But ultimately your general counsel makes that decision. – MR

  4. Of dinosaurs and elephants: Peter Bailis over at Stanford had a wonderful post on How To Make Fossils Productive Again. With cheap compute resources and virtually free big data systems available to anyone with an Internet connection, we are seeing a huge uptake in data analytics. Left behind are the folks who cling tightly to relational databases, doing their best mainframe hugger impersonations. With such a dearth of big data managers (also known as data scientists) available, it’s silly that many people from the relational camp have been unwilling to embrace the new technologies. They seem to forget that these new technologies create new benchmarks for architectural ideals and propel us into the future. Peter’s advice to those relational folks? Don’t be afraid to rethink your definition of what a database is, and embrace the fact that these new platforms are designed to solve whole classes of problems outside the design scope of the relational model. You are likely to have fun doing so. – AL

  5. You can fool some of them, but not Rob: The good thing about the Internet and security in general is that there are very smart people out there who both test your contentions and call you out when you are full of crap. Some are trolls, but many are conscientious individuals focused on getting to the truth. Rob Graham is one of the good ones. He tests things people say, and calls them out when they are not true. If you don’t read his blog, Errata Security, you are missing out. One of his latest missives is a pretty brutal takedown of the guy claiming to have started Bitcoin. Rob actually proves, with code and all, that the guy isn’t who he says he is. Or maybe he is, but he hasn’t adequately proven it. Anyhow, without getting into arcane technology, read that post to see a master at work. – MR

  6. When I say it’s you, I really mean me: The folks who work on MongoDB, under fire in the press for some hacked databases, implied that MongoDB is secure, but some users are idiots. Maybe I missed the section in my business management class on the logic and long-term value of calling your customers idiots – they might be right, but that does not mean this will end well. In the big data and NoSQL market, I give the MongoDB team a lot of credit for going from zero security to a halfway decent mix of identity and platform security measures. That said, they have a ways to go. MongoDB is well behind the commercial Hadoop variants like Cloudera, Hortonworks, and MapR, and they lack the steady stream of security contributions the open source community is building for Hadoop. If the Mongo team would like to protect their idiot users in the future, they could write a vulnerability scanner to show users where they have misconfigured the database! It would be easy, and show people (including any idiots) their simple configuration errors. – AL

—Mike Rothman

Understanding and Selecting RASP: Use Cases

By Adrian Lane

As you might expect, the primary function of RASP is to protect web applications against known and emerging threats; it is typically deployed to block attacks at the application layer, before vulnerabilities can be exploited. There is no question that the industry needs application security platforms – major new vulnerabilities are disclosed just about every week. And there are good reasons companies look to outside security vendors to help protect their applications. Most often we hear that firms simply have too many critical vulnerabilities to fix in a timely manner, with many reporting their backlog would take years to fix. In many cases the issue is legacy applications – ones which probably should never have been put on the Internet. These applications are often unsupported, with the engineers who developed them no longer available, or the platforms so fragile that they become unstable if security fixes are applied. And in many cases it is simply economics: the cost of securing the application itself is financially unfeasible, so companies are willing to accept the risk, instead choosing to address threats externally as best they can.

But if these were the only reasons, organizations could simply use one of the many older application security technologies, rather than needing RASP. Astute readers will notice that these are, by and large, the classic use cases for Intrusion Detection Systems (IDS) and Web Application Firewalls (WAFs). So why do people select RASP in lieu of more mature – and in many cases already purchased and deployed – technologies like IDS or WAF?

The simple answer is that the use cases are different enough to justify a different solution. RASP moves security one large step from “security bolted on” toward “security from within”. But to understand the differences between use cases, you first need to understand how user requirements differ, and where they are not adequately addressed by those older technologies. The core requirements above are givens, but the differences in how RASP is employed are best illustrated by a handful of use cases.

Use Cases

  • APIs & Automation: Most of our readers know what Application Programming Interfaces (APIs) are, and how they are used. Less clear is the greatly expanding need for programmatic interfaces in security products, thanks to application delivery disruptions caused by cloud computing. Cloud service models – whether deployment is private, public, or hybrid – enable much greater efficiencies as networks, servers, and applications can all be constructed and tested as software. APIs are how we orchestrate building, testing, and deployment of applications. Security products like RASP – unlike IDS and most WAFs – offer their full platform functionality via APIs, enabling software engineers to work with RASP in their native metaphor (a sketch of what this kind of API-driven automation might look like appears after this list).
  • Development Processes: As more application development teams tackle application vulnerabilities within the development cycle, they bring different product requirements than IT or security teams applying security controls post-deployment. It’s not enough for security products to identify and address vulnerabilities – they need to fit the development model. Software development processes are evolving (notably via continuous integration, continuous deployment, and DevOps) to leverage advantages of virtualization and cloud services. Speed is imperative, so RASP embedded within the application stack, providing real-time monitoring and blocking, supports more agile approaches.
  • Application Awareness: As attackers continue to move up the stack, from networks to servers and then to applications, it is becoming harder to distinguish attacks from normal usage. RASP is differentiated by its ability to include application context in security policies. Many WAFs offer ‘positive’ security capabilities (particularly whitelisting valid application requests), but being embedded within applications provides additional application knowledge and instrumentation capabilities to RASP deployments. Further, some RASP platforms help developers by specifically referencing modules or lines of suspect code. For many development teams, potentially better detection capabilities are less valuable than having RASP pinpoint vulnerable code.
  • Pre-Deployment Validation: For cars, pacemakers, and software, it has been proven over decades that the earlier in the production cycle errors are discovered, the easier – and cheaper – they are to fix. This means testing in general, and security testing specifically, works better earlier in the development process. Rather than relying on vulnerability scanners and penetration testers after an application has been launched, we see more and more application security testing performed prior to deployment. Again, this is not impossible with other application-centric tools, but RASP is easier to build into automated testing.
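To make the automation point concrete, here is a minimal sketch of what API-driven RASP integration might look like from a deployment pipeline. Everything in it is hypothetical – the endpoint, payload fields, and token are illustrative placeholders rather than any particular vendor’s API – but it shows the pattern: the pipeline enables protection for a new application version as just another automated step.

# Hypothetical sketch: enable RASP protection for a newly deployed application
# version from a CI/CD job. The endpoint and fields are placeholders, not a
# real vendor API.
import os
import requests

RASP_API = "https://rasp.example.com/api/v1"   # hypothetical management endpoint
TOKEN = os.environ["RASP_API_TOKEN"]           # injected by the pipeline

def enable_protection(app_name, version, mode="block"):
    """Register an application version and set its protection mode."""
    resp = requests.post(
        f"{RASP_API}/applications",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"name": app_name, "version": version, "mode": mode},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    # Called from the deployment job after the new build passes its tests.
    print(enable_protection("payments-service", "2016.05.1"))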

Our next post will talk about deployment, and working RASP into development pipelines.

—Adrian Lane

Tuesday, May 24, 2016

Incident Response in the Cloud Age: More Data, No Data, or Both?

By Mike Rothman

As we discussed in the first post of this series, incident response needs to change, given disruptions such as cloud computing and the availability of new data sources, including external threat intelligence. We wrote a paper called Leveraging Threat Intelligence in Incident Response (TI+IR) back in 2014 to update our existing I/R process map. Here is what we came up with:

TIIR Process map

So what has changed in the two years since we published that paper? Back then the cloud was nascent and we didn’t know if DevOps was going to work. Today both the cloud and DevOps are widely acknowledged as the future of computing and how applications will be developed and deployed. Of course it will take a while to get there, but they are clearly real already, and upending pretty much all the ways security currently works, including incident response.

The good news is that our process map still shows how I/R can leverage additional data sources and the other functions involved in performing a complete and thorough investigation, although it is hard to get sufficient staff to fill all the functions described on the map. But we’ll deal with that in our next post. For now let’s focus on integrating additional data sources including external threat intelligence, and handling emerging cloud architectures.

More Data (Threat Intel)

We explained why threat intelligence matters to incident response in our TI+IR paper:

To really respond faster you need to streamline investigations and make the most of your resources, a message we’ve been delivering for years. This starts with an understanding of what information would interest attackers. From there you can identify potential adversaries and gather threat intelligence to anticipate their targets and tactics. With that information you can protect yourself, monitor for indicators of compromise, and streamline your response when an attack is (inevitably) successful.

You need to figure out the right threat intelligence sources, and how to aggregate the data and run the analytics. We don’t want to rehash a lot of what’s in the TI+IR paper, but the most useful information sources include:

  • Compromised Devices: This data source provides external notification that a device is acting suspiciously by communicating with known bad sites or participating in botnet-like activities. Services are emerging to mine large volumes of Internet traffic to identify such devices.
  • Malware Indicators: Malware analysis continues to mature rapidly, getting better and better at understanding exactly what malicious code does to devices. This enables you to define both technical and behavioral indicators, across all platforms and devices to search for within your environment, as described in gory detail in Malware Analysis Quant.
  • IP Reputation: The most common reputation data is based on IP addresses and provides a dynamic list of known bad and/or suspicious addresses, based on data such as spam sources, torrent usage, DDoS traffic indicators, and web attack origins. IP reputation has evolved since its introduction, and now features scores comparing the relative maliciousness of different addresses, factoring in additional context such as Tor nodes/anonymous proxies, geolocation, and device ID to further refine reputation. (A minimal sketch of matching internal log data against such a feed appears after this list.)
  • Malicious Infrastructure: One specialized type of reputation often packaged as a separate feed is intelligence on Command and Control (C&C) networks and other servers/sources of malicious activity. These feeds track global C&C traffic and pinpoint malware originators, botnet controllers, compromised proxies, and other IP addresses and sites to watch for as you monitor your environment.
  • Phishing Messages: Most advanced attacks seem to start with a simple email. Given the ubiquity of email and the ease of adding links to messages, attackers typically find email the path of least resistance to a foothold in your environment. Isolating and analyzing phishing email can yield valuable information about attackers and tactics.
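As a simple illustration of using threat intelligence to focus an investigation, the sketch below matches source addresses from an internal log export against an IP reputation feed. The file names and formats are assumptions made for illustration – at real volumes you would do this inside an aggregation platform rather than a script – but the matching logic is the essence of logical integration.

# Minimal sketch: flag internal log entries whose source IP appears in a
# reputation feed. Formats are assumed: the feed is one IP per line, and the
# log export is CSV with an 'src_ip' column.
import csv

def load_feed(path):
    with open(path) as f:
        return {line.strip() for line in f if line.strip() and not line.startswith("#")}

def flag_matches(log_path, bad_ips):
    with open(log_path) as f:
        return [row for row in csv.DictReader(f) if row.get("src_ip") in bad_ips]

if __name__ == "__main__":
    bad_ips = load_feed("ip_reputation_feed.txt")   # assumed feed export
    for hit in flag_matches("firewall_log.csv", bad_ips):
        print("Possible compromised device:", hit["src_ip"], hit.get("timestamp"))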

As depicted in the process map above, you integrate both external and internal security data sources, then perform analytics to isolate the root cause of the attacks and figure out the damage and extent of the compromise. Critical success factors in dealing with all this data are the ability to aggregate it somewhere, and then to perform the necessary analysis.

This aggregation happens at multiple layers of the I/R process, so you’ll need to store and integrate all the I/R-relevant data. Physical integration is putting all your data into a single store, and then using it as a central repository for response. Logical integration uses valuable pieces of threat intelligence to search for issues within your environment, using separate systems for internal and external data. We are not religious about how you handle it – there are advantages to centralizing all data in one place, but as long as you can do your job (collecting TI and using it to focus investigation) either way works. Vendors providing big data security all want to be your physical aggregation point, but results are what matters, not where you store data.

Of course we are talking about a huge amount of data, so your choices for both data sources and I/R aggregation platform are critical parts of building an effective response process.

No Data (Cloud)

So what happens to response now that you don’t control a lot of the data used by your corporate systems? The data may reside with a Software as a Service (SaaS) provider, or your application may be deployed in a cloud computing service. In data centers with traditional networks it’s pretty straightforward to run traffic through inspection points, capture data as needed, and then perform forensic investigation. In the cloud, not so much.

To be clear, moving your computing to the cloud doesn’t totally eliminate your ability to monitor and investigate your systems, but your visibility into what’s happening on those systems using traditional technologies is dramatically limited.

So the first step for I/R in the cloud has nothing to do with technology. It’s all about governance. Ugh. I know most security professionals just felt a wave of nausea hit. The G word is not what anyone wants to hear. But it’s pretty much the only way to establish the rules of engagement with cloud service providers. What kinds of things need to be defined?

  1. SLAs: One of the first things we teach in our cloud security classes is the need to have strong Service Level Agreements (SLAs) with cloud providers. And these SLAs need to be established before you sign a deal. You don’t have much leverage during negotiations, but you have none after you sign. The kinds of SLAs include response time, access to specific data types, proactive alerts (them telling you when they had an issue), etc. We suggest you refer to the Cloud Security Alliance Guidance for specifics about proper governance structures for cloud computing.
  2. Hand-offs and Escalations: At some point there will be an issue, and you’ll need access to data the cloud provider has. How will that happen? The time to work through these issues is not while your cloud technology stack is crawling with attackers. Like all aspects of I/R, practice makes pretty good – there is no such thing as perfect. That means you need to practice your data gathering and hand-off processes with your cloud providers. The escalation process within the service provider also needs to be very well defined to make sure you can get adequate response under duress.

Once the proper governance structure is in place, you need to figure out what data is available to you in the various cloud computing models. In a SaaS offering you are pretty much restricted to logs (mostly activity, access, and identity logs) and information about access to the SaaS provider’s APIs. This data is quite limited, but can help figure out whether an employee’s account has been compromised, and what actions the account performed. Depending on the nature of the attack and the agreement with your SaaS provider, you may also be able to get some internal telemetry, but don’t count on that.

If you run your applications in an Infrastructure as a Service (IaaS) environment you will have access to logs (activity, access, and identity) of your cloud infrastructure activity at a granular level. Obviously a huge difference from SaaS is that you control the servers and networks running in your IaaS environment, so you can instrument your application stacks to provide granular activity logging, and route network traffic through an inspection/capture point to gather network forensics. Additionally many of the IaaS providers have fairly sophisticated offerings to provide configuration change data and provide light security assessments to pinpoint potential security issues, both of which are useful during incident response.
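For example, in AWS (assuming the boto3 SDK, configured credentials, and CloudTrail enabled – other IaaS providers offer comparable APIs), pulling recent API activity for a suspect account during an investigation might look roughly like this:

# Rough sketch: pull recent AWS API activity for a suspect IAM user from
# CloudTrail. Assumes boto3 is installed, credentials are configured, and
# CloudTrail is enabled in the account.
from datetime import datetime, timedelta
import boto3

cloudtrail = boto3.client("cloudtrail")

def recent_activity(username, hours=24):
    response = cloudtrail.lookup_events(
        LookupAttributes=[{"AttributeKey": "Username", "AttributeValue": username}],
        StartTime=datetime.utcnow() - timedelta(hours=hours),
        EndTime=datetime.utcnow(),
    )
    return response.get("Events", [])

for event in recent_activity("suspect-user"):
    print(event["EventTime"], event["EventSource"], event["EventName"])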

Those running private or hybrid clouds connecting to cloud environments at an IaaS provider, as well as your own data center, will also have access to logs generated by virtualization infrastructure. As we alluded to earlier, regardless of where the application runs, you can (and should) be instrumenting the application itself to provide granular logging and activity monitoring to detect misuse. With the limited visibility in the cloud, you really don’t have a choice but to both build security into your cloud technology stacks, and make sure you are able to generate application logs to provide sufficient data to support an investigation.

Capture the Flag

In the cloud, whether it’s SaaS, IaaS, or hybrid cloud, you are unlikely to get access to the full network packet stream. You will have access to the specific instances running in the cloud (whether IaaS or hybrid cloud), but obviously the type of telemetry you can gather will vary. So how much forensics information is enough?

  • Full Network Packet Capture: Packets are useful for knowing exactly what happened and being able to reconstruct and play back sessions. To capture packets you need either virtual taps to redirect network traffic to capture points, or to run network traffic through sensors in the cloud. But faster networks and less visibility are making full packet capture less feasible.
  • Capture and Release: This approach involves capturing the packet stream and deriving metadata about network traffic dynamics, and content in the stream as well. It’s more efficient because you aren’t necessarily keeping the full data stream, but get a lot more information than can be gleaned from network flows. This still requires inline sensors or virtual taps to capture traffic before releasing it.
  • Triggered Capture: When a suspicious alert happens you may want to capture the traffic and logs before and after the alert on the devices/networks in question. That requires at least a capture and release approach (to get the data), and provides flexibility to only capture when you think something is important, so it’s more efficient than full network packet capture.
  • Network Flows: It will be increasingly common to get network flow data, which provides source and destination information for network traffic through your cloud environment, and enables you to see if there was some kind of anomalous activity prior to the compromise (see the sketch after this list).
  • Instance Logs: The closest analogy is the increasingly common endpoint detection and forensics offerings. If you deploy them within your cloud instances, you can figure out what happened, but may lack context on who and why unless you are also fully capturing device activity. Also understand that these tools will need to be updated to handle the nuances of working in the cloud, including autoscaling and virtual networking.
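As a simple illustration of putting flow data to work (the record layout below assumes the default VPC Flow Log format, and the watchlist addresses are placeholders), a minimal check for instances talking to known bad destinations might look like this:

# Minimal sketch: scan VPC-flow-log-style records for traffic to watchlisted
# addresses. Field positions assume the default flow log format:
# version account-id interface-id srcaddr dstaddr srcport dstport ...
WATCHLIST = {"203.0.113.50", "198.51.100.7"}   # placeholder indicator IPs

def suspicious_flows(flow_log_path):
    with open(flow_log_path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 5:
                continue                      # skip headers or malformed lines
            src, dst = fields[3], fields[4]
            if dst in WATCHLIST:
                yield src, dst

if __name__ == "__main__":
    for src, dst in suspicious_flows("vpc_flow_logs.txt"):
        print(f"Instance {src} communicated with watchlisted address {dst}")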

We’ve always been fans of more rather than less data. But as we move into the Cloud Age practitioners need to be much more strategic and efficient about how and where to get data to drive incident response. It will come from external sources, as well as some logical sensors and capture points within the clouds (both public and private) in use. The increasing speed of networks and telemetry available from instances/servers, especially in data centers, will continue to challenge the scale of data collection infrastructure, so scale is a key consideration for I/R in the Cloud Age.

All this I/R data now requires technology that can actually analyze it within a reasonable timeframe. We hear a lot about “big data” for security monitoring these days. Regardless of what it’s called by the industry hype machine, you need technologies to index, search through, and find patterns within data – even when you don’t know exactly what you’re looking for at the start. Fortunately other industries – including retail – have been analyzing data to detect unseen and unknown patterns for years (they call it “business intelligence”), and many of their analytic techniques are available to security.

This scale issue is compounded by cloud usage requiring highly distributed collection infrastructure, which makes I/R collection more art than science, so you need to be constantly learning how much data is enough. The process feedback loop is absolutely critical to make sure that when the right data is not captured, the process evolves to collect the necessary infrastructure telemetry, and instrument applications to ensure sufficient visibility for thorough investigation.

But in the end, incident response always depends on people to some degree. That’s the problem nowadays, so our next post will tackle talent for incident response, and the potential shifts as cloud computing continues to take root.

—Mike Rothman

Monday, May 23, 2016

Evolving Encryption Key Management Best Practices: Introduction

By Rich

This is the first in a four-part series on evolving encryption key management best practices. This research is also posted at GitHub for public review and feedback. My thanks to Hewlett Packard Enterprise for licensing this research, in accordance with our strict Totally Transparent Research policy, which enables us to release our independent and objective research for free.

Data centers and applications are changing; so is key management.

Cloud. DevOps. Microservices. Containers. Big Data. NoSQL.

We are in the midst of an IT transformation wave which is likely the most disruptive since we built the first data centers. One that’s even more disruptive than the first days of the Internet, due to the convergence of multiple vectors of change. From the architectural disruptions of the cloud, to the underlying process changes of DevOps, to evolving Big Data storage practices, through NoSQL databases and the new applications they enable.

These have all changed how we use a foundational data security control: encryption. While encryption algorithms continue their steady evolution, encryption system architectures are being forced to change much faster due to rapid changes in the underlying infrastructure and the applications themselves. Security teams face the challenge of supporting all these new technologies and architectures, while maintaining and protecting existing systems.

Within the practice of data-at-rest encryption, key management is often the focus of this change. Keys must be managed and distributed in ever-more-complex scenarios, at the same time as demand for encryption increases throughout our data centers (including cloud) and our application stacks.

This research highlights emerging best practices for managing encryption keys for protecting data at rest in the face of these new challenges. It also presents updated use cases and architectures for the areas where we get the most implementation questions. It is focused on data at rest, including application data; transport encryption is an entirely different issue, as is protecting data on employee computers and devices.

How technology evolution affects key management

Technology is always changing, but there is a reasonable consensus that the changes we are experiencing now are coming faster than even the early days of the Internet. This is mostly because we see a mix of both architectural and process changes within data centers and applications. The cloud, increased segregation, containers, and microservices all change architectures; while DevOps and other emerging development and operations practices are shifting development and management practices. Better yet (or worse, depending on your perspective), all these changes mix and reinforce each other.

Enough generalities. Here are the top trends we see impacting data-at-rest encryption:

  • Cloud Computing: The cloud is the single most disruptive force affecting encryption today. It is driving very large increases in encryption usage, as organizations shift to leverage shared infrastructure. We also see increased internal use of encryption due to increased awareness, hybrid cloud deployments, and in preparation for moving data into the cloud.

    The cloud doesn’t only affect encryption adoption – it also fundamentally influences architecture. You cannot simply move applications into the cloud without re-architecting (at least not without seriously breaking things – and trust us, we see this every day). This is especially true for encryption systems and key management, where integration, performance, and compliance all intersect to affect practice.

  • Increased Segmentation: We are far past the days when flat data center architectures were acceptable. The cloud is massively segregated by default, and existing data centers are increasingly adding internal barriers. This affects key management architectures, which now need to support different distribution models without adding management complexity.
  • Microservice architectures: Application architectures themselves are also becoming more compartmentalized and distributed as we move away from monolithic designs into increasingly distributed, and sometimes ephemeral, services. This again increases demand to distribute and manage keys at wider scale without compromising security.
  • Big Data and NoSQL: Big data isn’t just a catchphrase – it encompasses a variety of very real new data storage and processing technologies. NoSQL isn’t necessarily big data, but has influenced other data storage and processing as well. For example, we are now moving massive amounts of data out of relational databases into distributed file-system-based repositories. This further complicates key management, because we need to support distributed data storage and processing on larger data repositories than ever before.
  • Containers: Containers continue the trend of distributing processing and storage (noticing a theme?), on an even more ephemeral basis, where containers might appear in microseconds and disappear in minutes, in response to application and infrastructure demands.
  • DevOps: To leverage these new changes and increase effectiveness and resiliency, DevOps continues to emerge as a dominant development and operational framework – not that there is any single definition of DevOps. It is a philosophy and collection of practices that support extremely rapid change and extensive automation. This makes it essential for key management practices to integrate, or teams will simply move forward without support.

These technologies and practices aren’t mutually exclusive. It is extremely common today to build a microservices-based application inside containers running at a cloud provider, leveraging NoSQL and Big Data, all managed using DevOps. Encryption may need to support individual application services, containers, virtual machines, and underlying storage, which might connect back to an existing enterprise data center via a hybrid cloud connection.

It isn’t always this complex, but sometimes it is, so key management practices are changing to keep pace – providing the right key, at the right time, to the right location, without compromising security, while still supporting traditional technologies.
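To ground the idea of delivering the right key to the right location, one pattern that keeps showing up in these distributed environments is envelope encryption: each object or service gets its own data key, and the key manager only handles wrapping and unwrapping those data keys with a master key it controls. The sketch below uses the Python cryptography library’s Fernet primitive purely as an illustration of the pattern; a production system would keep the master key in an HSM or key management service rather than in process memory.

# Illustrative envelope encryption: a per-record data key encrypts the data,
# and the key manager's master key wraps the data key. Uses the 'cryptography'
# package; a real deployment would hold the master key in a KMS or HSM.
from cryptography.fernet import Fernet

master_key = Fernet.generate_key()          # stand-in for a KMS/HSM-held key
master = Fernet(master_key)

def encrypt_record(plaintext: bytes):
    data_key = Fernet.generate_key()            # unique key for this record
    ciphertext = Fernet(data_key).encrypt(plaintext)
    wrapped_key = master.encrypt(data_key)      # only the key manager can unwrap
    return wrapped_key, ciphertext

def decrypt_record(wrapped_key: bytes, ciphertext: bytes):
    data_key = master.decrypt(wrapped_key)      # key manager unwraps the data key
    return Fernet(data_key).decrypt(ciphertext)

wrapped, blob = encrypt_record(b"sensitive record")
assert decrypt_record(wrapped, blob) == b"sensitive record"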

—Rich

Friday, May 20, 2016

Incite 5/20/2016: Dance of Joy

By Mike Rothman

Perception of time is a funny thing. As we wind down the school year in Atlanta, it’s hard to believe how quickly this year has flown by. It seems like yesterday XX1 was starting high school and the twins were starting middle school. I was talking to XX1 last week as she was driving herself to school (yes, that’s a surreal statement) and she mentioned that she couldn’t believe the school year was over. I tried to explain that as you get older, time seems to move more quickly.

The following day I was getting a haircut with the Boy and our stylist was making conversation. She asked him if the school year seemed to fly by. He said, “Nope! It was sooooo slow.” They are only 3 years apart, but clearly the perception of time changes as tweens become teens.

The end of the school year always means dance recitals. For over 10 years now I’ve been going to recitals to watch my girls perform. From when they were little munchies in their tiny tutus watching the teacher on the side of the stage pantomiming the routine, to now when they both are advanced dancers doing 7-8 routines each year, of all disciplines. Ballet (including pointe), Jazz, Modern, Tap, Lyrical. You name it and my girls do it.

modern dance

A lot of folks complain about having to go to recitals. I went to all 3 this year. There is no place I’d rather be. Watching my girls dance is one of the great joys of my life. Seeing them grow from barely being able to do a pirouette to full-fledged dancers has been incredible. I get choked up seeing how they get immersed in performance, and how happy it makes them to be on stage.

Although this year represents a bit of a turning point. XX2 decided to stop dancing and focus on competitive cheerleading. There were lots of reasons, but it mostly came down to passion. She was serious about improving her cheerleading skills, constantly stretching and working on core strength to improve her performance. She was ecstatic when she made the 7th grade competitive cheer team at her school. But when it came time for dance she said, “meh.” So the choice was clear, although I got a little nostalgic watching her last dance recital. It’s been a good run and I look forward to seeing her compete in cheer.

I’m the first to embrace change and chase passions. When something isn’t working, you make changes, knowing full well that it requires courage – lots of people resist change. Her dance company gave her a bit of a hard time and the teachers weren’t very kind during her last few months at the studio. But it’s OK – people show themselves at some point, and we learned a lot about those people. Some are keepers, and XX2 will likely maintain those relationships as others fade away.

It’s just like life. You realize who your real friends are when you make changes. Savor those friendships and let all the others go. We have precious few moments – don’t waste them on people who don’t matter.

–Mike

Photo credit: “Korean Modern Dance” from Republic of Korea


Security is changing. So is Securosis. Check out Rich’s post on how we are evolving our business.

We’ve published this year’s Securosis Guide to the RSA Conference. It’s our take on the key themes of this year’s conference (which is really a proxy for the industry), as well as deep dives on cloud security, threat protection, and data security. And there is a ton of meme goodness… Check out the blog post or download the guide directly (PDF).

The fine folks at the RSA Conference posted the talk Jennifer Minella and I did on mindfulness at the 2014 conference. You can check it out on YouTube. Take an hour. Your emails, alerts, and Twitter timeline will be there when you get back.


Securosis Firestarter

Have you checked out our video podcast? Rich, Adrian, and Mike get into a Google Hangout and… hang out. We talk a bit about security as well. We try to keep these to 15 minutes or less, and usually fail.


Heavy Research

We are back at work on a variety of blog series, so here is a list of the research currently underway. Remember you can get our Heavy Feed via RSS, with our content in all its unabridged glory. And you can get all our research papers too.

Incident Response in the Cloud Age

Understanding and Selecting RASP

Maximizing WAF Value

Resilient Cloud Network Architectures

Shadow Devices

Building a Vendor IT Risk Management Program

Recently Published Papers


Incite 4 U

  1. The Weakest Link: Huge financial institutions spend a ton of money on security. They buy and try one of everything, and have thousands of security professionals to protect their critical information. And they still get hacked, but it’s a major effort for their adversaries. The attackers just don’t want to work that hard, so mostly they don’t. They find the weakest link, and it turns out that to steal huge sums from banks, you look for banks without sophisticated security controls, but with access to the SWIFT fund transfer network. So if you were curious whether Bangladesh Bank has strong security, now you know. They don’t. That bank was the entry point for an $81 million fraud involving SWIFT and the Federal Reserve Bank of NY. Everything looked legit, so the big shops thought they were making a proper fund transfer. And then the money was gone. Poof. With such interconnected systems running the global financial networks, this kind of thing is bound to happen. Probably a lot. – MR

  2. Racist in the machine: It shouldn’t be funny, but it is: Microsoft turned the Tay chatbot – a machine learning version of Microsoft Bob – loose on the Internet. Within hours it became a plausible, creative racist ass***. Creative in that it learned, mostly pulling from cached Google articles, to evolve its own racism. Yes, all those Internet comment trolls taught the bot to spew irrational hatred, so well that it could pass for a Philadelphia sports fan (kidding). Some friends have pointed out other examples of chatbots on message boards claiming to be $DEITY as their learning engines did exactly what they were programmed to do. Some call it a reflection on society, as Tay learned people’s real behaviors, but it’s more likely its learning mode was skewed toward the wrong sources, with no ethics or logical validation. This is a good example of how easily things can go wrong in automated security event detection, heuristics, lexical analysis, metadata analysis, and machine learning. People can steer learning engines the wrong way, so don’t allow unfiltered user input, just like with any other application platform. – AL

  3. Double edged sword: The thing about technology is that almost every innovation can be used for good. Or bad. Take, for instance, PowerShell, the powerful Microsoft scripting language. As security becomes more software-defined by the day, scripting tools like PowerShell (and Python) are going to be the way much of security gets implemented in continuous deployment pipelines. But as the CarbonBlack folks discuss in this NetworkWorld article, they are also powerful tools for automating a bunch of malware functions. So you need to get back to the basics of security: defining normal behavior and then looking for anomalies, because the tools can be used for good and not-so-good. – MR

  4. Mastering the irrelevant: Visa stated that some merchants are seeing a dip in fraud due to chipped cards, with 5 of the top 25 victims of forged cards seeing an 18.3% reduction in counterfeit transactions, while non-compliant merchants saw an 11% increase. And they say over 70% of credit cards in US circulation now have chips, up from 20% at the October 2015 deadline. That’s great, but Visa is tap dancing around the real issue: why there is a measly 20% adoption rate among the top candidates for EMV fraud reduction. We understand that the majority of the 25 merchants referenced above have EMV terminals in place, but continue to point fingers at Visa and Mastercard’s failure to certify as the reason EMV is not fully deployed. Think about it this way: EMV does not stop a card cloner from using a non-chipped clone, because US terminals accept both card types. This is clearly not about security or fraud detection, but instead a self-promotional pat on the back to quiet their critics. If you’re impacted by EMV, you do want to migrate to enable mobile payments, which legitimately offer better customer affinity and security, and possibly lower fees. The rest is just noise. – AL

  5. More Weakest Link: Speaking of weak links, it turns out call centers are rife with fraud today. At least according to a research report from Pindrop, who really really wants phone fraud to increase. The tactics in this weak link are different than the Bangladesh attack above: call centers are being gamed using social engineering. But in both cases big money is at stake. One of their conclusions is that it’s getting hard for fraudsters to clone credit cards (with Chip and PIN), so they are looking for a weaker link. They found the folks in these call centers. And the beat goes on. – MR

—Mike Rothman

Summary: May 19, 2016

By Rich

Rich here.

Not a lot of news from us this week, because we’ve mostly been traveling, and for Mike and me the kids’ school year is coming to a close.

Last week I was at the Rocky Mountain Information Security Conference in Denver. The Denver ISSA puts on a great show, but due to some family scheduling I didn’t get to see as many sessions as I hoped. I presented my usual pragmatic cloud pitch, a modification of my RSA session from this year. It seems one of the big issues organizations are still facing is a mixture of figuring out where to get started on cloud/DevOps, and switching over to understanding and implementing the fundamentals.

For example, one person in my session mentioned his team thought they were doing DevOps, but actually mashed some tools together without understanding the philosophy or building a continuous integration pipeline. Needless to say, it didn’t go well.

In other news, our advanced Black Hat class sold out, but there are still openings in our main class. I highlighted the course differences in a post.

You can subscribe to only the Friday Summary.

Top Posts for the Week

  • Another great post from the Signal Sciences team. This one highlights a session from DevOps Days Austin by Dan Glass of American Airlines. AA has some issues unique to their industry, but Dan’s concepts map well to any existing enterprise struggling to transition to DevOps while maintaining existing operations. Not everyone has the luxury of building everything from scratch. Avoiding the Dystopian Road in Software.
  • One of the most popular informal talks I give clients and teach is how AWS networking works. It is completely based on this session, which I first saw a couple years ago at the re:Invent conference – I just cram it into 10-15 minutes and skip a lot of the details. While AWS-specific, this is mandatory for anyone using any kind of cloud. The particulars of your situation or provider will differ, but not the issues. Here is the latest, with additional details on service endpoints: AWS Summit Series 2016 | Chicago – Another Day, Another Billion Packets.
  • In a fascinating move, Jenkins is linking up with Azure, and Microsoft is tossing in a lot of support. I am actually a fan of running CI servers in the cloud for security, so you can tie them into cloud controls that are hard to implement locally, such as IAM. Announcing collaboration with the Jenkins project.
  • Speaking of CI in the cloud, this is a practical example from Flux7 of adding security to Git and Jenkins using Amazon’s CodeDeploy. TL;DR: you can leverage IAM and Roles for more secure access than you could achieve normally: Improved Security with AWS CodeCommit.
  • Netflix releases a serverless Open Source SSH Certificate Authority. It runs on AWS Lambda, and is definitely one to keep an eye on: Netflix/bless.
  • AirBnB talks about how they integrated syslog into AWS Kinesis using osquery (a Facebook tool I think I will highlight as tool of the week): Introducing Syslog to AWS Kinesis via Osquery – Airbnb Engineering & Data Science.

Tool of the Week

osquery by Facebook is a nifty Open Source tool to expose low-level operating system information as a real-time relational database. What does that mean? Here’s an example that finds every process running on a system where the binary is no longer on disk (a direct example from the documentation, and common malware behavior):

SELECT name, path, pid FROM processes WHERE on_disk = 0;

This is useful for operations but it’s positioned as a security tool. You can use it for File Integrity Monitoring, real-time alerting, and a whole lot more. The site even includes ‘packs’ for common needs including OS X attacks, compliance, and vulnerability management.
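If you want to fold osquery results into other tooling, you can wrap the osqueryi shell in a few lines of script. This sketch assumes osqueryi is installed locally and that your version supports JSON output via the --json flag (check the documentation for your release):

# Rough sketch: run an osquery query from Python and parse the results.
import json
import subprocess

def osquery(sql):
    result = subprocess.run(
        ["osqueryi", "--json", sql],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

# Same example as above: processes whose binary is no longer on disk.
for proc in osquery("SELECT name, path, pid FROM processes WHERE on_disk = 0;"):
    print(proc["pid"], proc["name"], proc["path"])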

Securosis Blog Posts this Week

Other Securosis News and Quotes

Another quiet week…

Training and Events

—Rich

Thursday, May 19, 2016

Incident Response in the Cloud Age: Shifting Foundations

By Mike Rothman

Since we published our React Faster and Better research and Incident Response Fundamentals, quite a bit has changed relative to responding to incidents. First and foremost, incident response is a thing now. Not that it wasn’t a discipline mature security organizations focused on before 2012, but since then a lot more resources and funding have shifted away from ineffective prevention towards detection and response. Which we think is awesome.

Of course, now that I/R is a thing and some organizations may actually have decent response processes, the foundation under us is shifting. But that shouldn’t be a surprise – if you wanted a static existence, technology probably isn’t the best industry for you, and security is arguably the most dynamic part of technology. We see the cloud revolution taking root, promising to upend and disrupt almost every aspect of building, deploying and operating applications. We continue to see network speeds increase, putting scaling pressure on every aspect of your security program, including response.

The advent of threat intelligence, as a means to get smarter and leverage the experiences of other organizations, is also having a dramatic impact on the security business, particularly incident response. Finally, the security industry faces an immense skills gap, which is far more acute in specialized areas such as incident response. So whatever response process you roll out needs to leverage technological assistance – otherwise you have little chance of scaling it to keep pace with accelerating attacks.

This new series, which we are calling “Incident Response in the Cloud Age”, will discuss these changes and how your I/R process needs to evolve to keep up. As always, we will conduct this research using our Totally Transparent Research methodology, which means we’ll post everything to the blog first, and solicit feedback to ensure our positions are on point.

We’d also like to thank SS8 for being a potential licensee of the content. One of the unique aspects of how we do research is that we call them a potential licensee because they have no commitment to license, nor do they have any more influence over our research than you. This approach enables us to write the kind of impactful research you need to make better and faster decisions in your day to day security activities.

Entering the Cloud Age

Evidently there is this thing called the ‘cloud’, which you may have heard of. As we have described for our own business, we are seeing cloud computing change everything. That means existing I/R processes need to now factor in the cloud, which is changing both architecture and visibility.

There are two key impacts on your I/R process from the cloud. The first is governance, as your data now resides in a variety of locations and with different service providers. Various parties are required to participate as you try to investigate an attack. The process integration of a multi-organization response is… um… challenging.

The other big difference in cloud investigation is visibility, or the lack of it. You don’t have access to the network packets in an Infrastructure as a Service (IaaS) environment, nor can you look inside a Platform as a Service (PaaS) offering to see what happened. That means you need to be a lot more creative about gathering telemetry on an ongoing basis, and figuring out how to access what you need during an investigation.

Speed Kills

We have also seen a substantial increase in the speed of networks over the past 5 years, especially in data centers. So if network forensics is part of your I/R toolkit (as it should be) how you architect your collection environment, and whether you actually capture and store full packets, are key decisions. Meanwhile data center virtualization is making it harder to know which servers are where, which makes investigation a bit more challenging.

Getting Smarter via Threat Intelligence

Sharing attack data between organizations still feels a bit strange for long-time security professionals like us. The security industry resisted admitting that successful attacks happen (yes, that ego thing got in the way), and held the entirely reasonable concern that sharing company-specific data could provide adversaries with information to facilitate future attacks.

The good news is that security folks got over their ego challenges, and also finally understand they cannot stand alone and expect to understand the extent of the attacks that come at them every day. So sharing external threat data is now common, and both open source and commercial offerings are available to provide insight, which is improving incident response. We documented how the I/R process needs to change to leverage threat intelligence, and you can refer to that paper for detail on how that works.

Facing down the Skills Gap

If incident response wasn’t already complicated enough because of the changes described above, there just aren’t enough skilled computer forensics specialists (who we call forensicators) to meet industry demand. You cannot just throw people at the problem, because they don’t exist. So your team needs to work smarter and more efficiently. That means using technology more for gathering and analyzing data, structuring investigations, and automating what you can. We will dig into emerging technologies in detail later in this series.

Evolving Incident Response

Like everything else in security, incident response is changing. The rest of this series will discuss exactly how. First we’ll dig into the impacts of the cloud, faster and virtualized networks, and threat intelligence on your incident response process. Then we’ll dig into how to streamline a response process to address the lack of people available to do the heavy lifting of incident response. Finally we’ll bring everything together with a scenario that illuminates the concepts in a far more tangible fashion. So buckle up – it’s time to evolve incident response for the next era in technology: the Cloud Age.

—Mike Rothman

Wednesday, May 18, 2016

Understanding and Selecting RASP: Technology Overview

By Adrian Lane

This post will discuss technical facets of RASP products, including how the technology works, how it integrates into an application environment, and the advantages or disadvantages of each. We will also spend some time on which application platforms are supported today, as this is one area where each provider is limited and working to expand, so it will impact your selection process. We will also consider a couple aspects of RASP technology which we expect to evolve over the next couple of years.

Integration

RASP works at the application layer, so each product needs to integrate with applications somehow. To monitor application requests and make sense of them, a RASP solution must have access to incoming calls. There are several methods for monitoring either application usage (calls) or execution (runtime), each deployed slightly differently, gathering a slightly different picture of how the application functions. Solutions are installed into the code production path, or monitor execution at runtime. To block and protect applications from malicious requests, a RASP solution must be inline.

  • Servlet Filters & Plugins: Some RASP platforms are implemented as web server plug-ins or Java Servlets, typically installed into either Apache Tomcat or the Microsoft .NET stack to process inbound HTTP requests. Plugins filter requests before they reach application code, applying detection rules to each inbound request received. Requests that match known attack signatures are blocked. This is a relatively simple approach for retrofitting protection into the application environment, and can be effective at blocking malicious requests, but it doesn’t offer the in-depth application mapping possible with other types of integration. (A minimal sketch of this request-filtering pattern appears after this list.)
  • Library/JVM Replacement: Some RASP products are installed by replacing the standard application libraries, JAR files, or even the Java Virtual Machine. This method basically hijacks calls to the underlying platform, whether library calls or the operating system. The RASP platform passively ‘sees’ application calls to supporting functions, applying rules as requests are intercepted. Under this model the RASP tool has a comprehensive view of application code paths and system calls, and can even learn state machine or sequence behaviors. The deeper analysis provides context, allowing for more granular detection rules.
  • Virtualization or Replication: This integration effectively creates a replica of an application, usually as either a virtualized container or a cloud instance, and instruments application behavior at runtime. By monitoring – and essentially learning – application code pathways, all dynamic or non-static code is mimicked in the cloud. Learning and detection take place in this copy. As with replacement, application paths, request structure, parameters, and I/O behaviors can be ‘learned’. Once learning is complete rules are applied to application requests, and malicious or malformed requests are blocked.
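To make the filter/plugin model concrete, here is a deliberately simplified sketch written as Python WSGI middleware (Python being one of the languages some RASP products support). It is not any vendor’s implementation – real products apply much richer, context-aware detection than a couple of toy signatures – but it shows where this integration style sits: wrapped around the application, inspecting each request before application code runs, and blocking anything that matches a rule.

# Simplified illustration of the filter/plugin integration model, expressed as
# WSGI middleware. The two regexes are toy signatures, not real detection logic.
import re
from urllib.parse import unquote_plus

TOY_SIGNATURES = [
    re.compile(r"(?i)\bunion\b.+\bselect\b"),   # crude SQL injection pattern
    re.compile(r"(?i)<script\b"),               # crude cross-site scripting pattern
]

class RequestFilter:
    def __init__(self, app):
        self.app = app                          # the protected application

    def __call__(self, environ, start_response):
        query = unquote_plus(environ.get("QUERY_STRING", ""))
        if any(sig.search(query) for sig in TOY_SIGNATURES):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Request blocked"]
        return self.app(environ, start_response)    # clean requests pass through

# Usage: application = RequestFilter(application)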

Language Support

The biggest divide between RASP providers today is their platform support. For each vendor we spoke with during our research, language support was a large part of their product roadmap. Most provide full support for Java; beyond that support is hit and miss. .NET support is increasingly common. Some vendors support Python, PHP, Node.js, and Ruby as well. If your application doesn’t run on Java you will need to discuss platform support with vendors. Within the next year or two we expect this issue to largely go away, but for now it is a key decision factor.

Deployment Models

Most RASP products are deployed as software, within an application software stack. These products work equally well on-premise and in cloud environments. Some solutions operate fully in a cloud replica of the application, as in the virtualization and replicated models mentioned above. Still others leverage a cloud component, essentially sending data from an application instance to a cloud service for request filtering. What generally doesn’t happen is dropping an appliance into a rack, or spinning up a virtual machine and re-routing network traffic.

Detection Rules

During our interviews with vendors it became clear that most are still focused on negative security: they detect known malicious behavior patterns. These vendors research and develop attack signatures for customers. Each signature explicitly describes one attack, such as SQL injection or a buffer overflow. For example most products include policies focused on the OWASP Top Ten critical web application vulnerabilities, commonly with multiple policies to detect variations of the top ten threat vectors. This makes their rules harder for attackers to evade. And many platforms include specific rules for various Common Vulnerabilities and Exposures, providing the RASP platform with signatures to block known exploits.

Active vs. Passive Learning

Most RASP platforms learn about the application they are protecting. In some cases this helps to refine detection rules, adapting generic rules to match specific application requests. In other cases this adds fraud detection capabilities, as the RASP learns to ‘understand’ application state or recognize an appropriate set of steps within the application. Understanding state is a prerequisite for detecting business logic attacks and multi-part transactions. Other RASP vendors are just starting to leverage a positive (whitelisting) security model. These RASP solutions learn how API calls are exercised or what certain lines of code should look like, and block unknown patterns.

To do more than filter known attacks, a RASP tool needs to build a baseline of application behaviors, reflecting the way an application is supposed to work. There are two approaches: passive and active learning. A passive approach builds a behavioral profile as users use the application. By monitoring application requests over time and cataloging each request, linking the progression of requests to understand valid sequences of events, and logging request parameters, a RASP system can recognize normal usage. The other baselining approach is similar to what Dynamic Application Security Testing (DAST) platforms use: by crawling through all available code paths, the scope of application features can be mapped. By generating traffic to exercise new code as it is deployed, application code paths can be synthetically enumerated to produce a complete mapping predictably and more quickly.
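A toy version of the passive approach – catalog what the application normally sees during a learning period, then flag anything outside that baseline – might look like the sketch below. The request representation is an assumption for illustration; real RASP learning covers request sequences, parameter types, and code paths, not just path-and-parameter combinations.

# Toy passive baselining: record (method, path, parameter-name) combinations
# while learning, then flag requests that fall outside the learned baseline.
class PassiveBaseline:
    def __init__(self):
        self.known = set()
        self.learning = True

    def _profile(self, method, path, params):
        return (method, path, frozenset(params))

    def observe(self, method, path, params):
        """Call for every request; returns True if the request looks anomalous."""
        profile = self._profile(method, path, params)
        if self.learning:
            self.known.add(profile)
            return False
        return profile not in self.known

baseline = PassiveBaseline()
baseline.observe("GET", "/account", {"id"})                     # learned as normal
baseline.learning = False                                       # switch to enforcement
print(baseline.observe("GET", "/account", {"id", "debug"}))     # True: anomalous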

Note that RASP’s positive security capabilities are nascent. We see threat intelligence and machine learning capabilities as a natural fit for RASP, but these capabilities have not yet fully arrived. Compared to competing platforms, they lack maturity and functionality. But RASP is still relatively new, and we expect the gaps to close over time. On the bright side, RASP addresses application security use cases which competitive technologies cannot.

We have done our best to provide a detailed look at RASP technology, both to help you understand how it works and to differentiate it from other security products which sound similar. If you have questions, or some aspect of this technology is confusing, please comment below, and we will work to address your questions. A wide variety of platforms – including cloud WAF, signal intelligence, attribute-based fraud detection, malware detection, and network oriented intelligence services – all market value propositions which overlap with RASP. But unless the product can work in the application layer, it’s not RASP.

Next we will discuss emerging use cases, and why firms are looking for alternatives to what they have today.

—Adrian Lane