SOC 2025: The Coming SOC Evolution

It’s brutal running a security operations center (SOC) today. The attack surface continues to expand, in a lot of cases exponentially, as data moves to SaaS, applications move to containers, and the infrastructure moves to the cloud. The tools used by the SOC analysts are improving, but not fast enough. It seems adversaries remain one (or more) steps ahead. There aren’t enough people to get the job done. Those that you can hire typically need a lot of training, and retaining them continues to be problematic. As soon as they are decent, they head off to their next gig for a huge bump in pay.

At the same time, security is under the spotlight like never before. Remember the old days when no one knew about security? Those days are long gone, and they aren’t coming back. Thus, many organizations embrace managed services for detection and response, mostly because they have to. Something has to change. Actually, a lot has to change.

That’s what this series, entitled SOC 2025, is about. How can we evolve the SOC over the next few years to address the challenges of dealing with today’s security issues, across the expanded attack surface, with far fewer skilled people, while positioning for tomorrow?

We want to thank Splunk (you may have heard of them) for agreeing to be the preliminary licensee for the research. That means when we finish up the research and assemble it as a paper, they will have an opportunity to license it. Or not. There are no commitments until the paper is done, in accordance with our Totally Transparent Research methodology.

SOC, what’s it for?

There tend to be two main use cases for the SOC: detecting, investigating, and remediating attacks, and substantiating the controls for audit/compliance purposes. We are not going to cover the compliance use case in this series. That’s not because it isn’t important; audits are still a thing, and audit preparation should still be done in as efficient and effective a manner as possible.
But in this series, we’re tackling the evolution of the Security OPERATIONS Center, so we’re going to focus on the detection, investigation, and remediation aspects of the SOC’s job.

You can’t say (for most organizations anyway) there hasn’t been significant investment in security tooling over the past five years. Or ten years. Whatever your timeframe, security budgets have increased dramatically. Of course, there was no choice given the expansion of the attack surface and the complexity of the technology environment. But if the finance people objectively look at the spending on security, they can (and should) ask some tough questions about the value the organization receives from those significant investments.

And there is the rub. We, as security professionals, know that there is no 100% security. No matter how much you spend, you can (and will) be breached. We can throw out platitudes about reducing the dwell time or make the case that the attack would have been much worse without the investment. And you’re probably right. But as my driver’s education teacher told me over 35 years ago, “you may be right, but you’ll still be dead.”

What we haven’t done very well is manage to Security Outcomes and communicate the achievements. What do we need the outcome to be for our security efforts? Our mindset needs to shift from activity to outcomes. So what is the outcome we need from the SOC? We need to find and fix security issues before data loss. That means we have to sharpen our detection capabilities and dramatically improve and streamline our operational motions. There is no prize for finding all the vulnerabilities, just as there are no penalties for missing them. The SOC needs to master detecting, investigating, and turning that information into effective remediation before data is lost.

Improved Tooling

Once we’ve gotten our arms around the mindset shift in focusing on security outcomes, we can focus on the how.
How is the SOC going to get better at detecting, investigating, and remediating attacks? That’s where better tooling comes into play. The good news is that SOC tools are much better than even five years ago. Innovations like improved analytics and security automation give SOCs far better capabilities. But only if the SOC uses them.

What SOC leader in their right mind wouldn’t take advantage of these new capabilities? In concept, they all would and should. In reality, far too many haven’t and can’t. The problem is one of culture and evolution. The security team can handle detection and even investigation. But remediation is a cross-functional effort. And what do security outcomes depend on? You guessed it – remediation. So at its root, security is a team sport, and the SOC is one part of the team.

This means addressing security issues needs to fit into the operational motions of the rest of the organization. The SOC can and should automate where possible, especially the things within its control. But most automation requires buy-in from the other operational teams. Ultimately, if the information doesn’t consistently and effectively turn into action, the SOC fails in its mission.

Focused Evolution

In this series, we will deal with both internal and external evolution. We’ll start by turning inward and spending time understanding the evolution of how the SOC collects security telemetry from both internal and external sources. Given the sheer number of new data sources that must be considered (IaaS, PaaS, SaaS, containers, DevOps, etc.), making sure the right data is aggregated is the first step in the battle.

Next, we’ll tackle detection and analytics since that is the lifeblood of the SOC. Again, you get no points for detecting things, but you’ve got no chance of achieving desired security outcomes if you miss attacks.
The analytics area is where the most innovation has happened over the past few years, so we’ll dig into some use cases and help you understand how frameworks like ATT&CK and buzzy marketing terms like eXtended Detection and Response (XDR) should influence


New Age Network Detection: Use Cases

As we wrap up the New Age Network Detection (NAND) series, we’ve made the point that network analysis remains critical to finding malicious activity, even as you move to the cloud. But clearly, collection and analysis need to change as the underlying technology platforms evolve.

But that does put the cart a bit ahead of the horse. We haven’t spent much time homing in on the specific use cases where NAND makes a difference. So that’s how we’ll bring the series to a close. To be clear, this is not an exhaustive list of use cases, but it hits the high points and helps you understand the value of NAND relative to other means of detection.

Ransomware

Another day, another high-profile ransomware attack shutting down another major business. Every organization is a target and is vulnerable. So how do you get ahead of ransomware from a detection standpoint? First, let’s discuss what ransomware is and what it’s not.

Ransomware involves the adversary compromising devices and then encrypting both the machine and shared file repositories to stop an organization from accessing its data unless it pays the ransom. But ransomware isn’t new, certainly not from an attack standpoint, since it uses relatively common and commodity malware families for the initial compromise. To be clear, the attackers are more organized and have gotten more proficient once they’ve gained a foothold within a victim’s network, doing extensive recon to find and then destroy backups, putting further pressure on the victim to pay the ransom.

So what’s different now, making ransomware so urgent to address? It’s gotten mainstream press because of the high-profile attacks on pipeline companies and health care systems. When citizens can’t get gas and drive to places, and they can’t get critical care services because the medical systems at a hospital are down, that will get people’s attention, and it has.

NAND helps in the initial stages of the ransomware attack.
The adversary uses common malware families to compromise devices. As discussed in the last post, network telemetry can detect command and control traffic patterns and the recon activity within the environment. Additionally, as mentioned above, attackers now take the time to search for and destroy backups, which also involves network recon patterns that NAND can detect.

Having your business unable to operate because you missed a ransomware attack is a career-limiting challenge for every CISO. Just out of self-preservation, stopping ransomware has become the single top priority for every CISO. The first step in addressing the ransomware scourge is having a broad detection capability to maximize the likelihood of detecting the attack. The network is the first place you’ll see the emerging attack, as well as the ongoing recon and proliferation of the attack to compromise additional devices. Thus NAND is critical to ransomware defense.

Threat Hunting

Threat hunting is proactively looking for attackers in your environment before you get an alert from one of your other detection methods. Unfortunately, most organizations have active attackers in their environments, but they don’t know where or what they are doing until the attackers screw up and trigger an alert. Hunting can identify these attackers and smoke them out before a traditional alert fires, but only if you have sufficient telemetry and know where to look.

Hunting does involve more art than science since the hunter needs to start with an idea of what types of attacks to look for. Then they must effectively and efficiently mine through the security data to find and follow the attacker’s trail. But we shouldn’t minimize the importance of the science part of it: having the data you need and a set of tools to navigate security data. That’s where NAND comes into play.
It provides a broad and deep collection capability (including full packets, where necessary) and the ability to effectively pivot through the data, both via search and by clicking through live links in the interface to follow the path the attacker may have taken. To be clear, NAND will not make a noob who has no idea what they are doing into a world-class hunter. Still, it can accelerate and improve any hunt in the hands of a reasonably capable security professional.

Further helping the hunter are common hunting queries, typically pre-loaded into the detection tool to kick-start any hunting effort. Again, these rules don’t make the hunt, but they can codify common searches that uncover malicious activities, including drive-by attacks, spearphishing, privilege escalation, credential stuffing, and lateral movement.

If there is a use case that provides significant value to security executives, it’s hunting. Although not all organizations have the resources to devote staff to hunting, those that do can find attackers before significant damage happens. And this makes the security team look good.

Insider Threat

Insider threat attacks have gotten a lot of visibility within the executive suite as well. The old “inside job” typically involves an employee acting maliciously to steal data or sabotage systems. But we use a broader definition of insiders to include any entity with a presence inside the network. Thus, during most attacks, an insider has access to the internal networks and resources.

So how do you detect insider threats? We laid out how NAND facilitates the collection and analysis of your network telemetry, so we’ll leverage those capabilities. Insiders can be anywhere, though, so you’ll need a broad collection effort that includes telemetry from any remote employees and cloud resources. From an analysis standpoint, looking for anomalies from the network traffic baseline will be the strongest indication of malicious activity.
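To make the baseline idea a little more concrete, here is a minimal Python sketch. It assumes you can export per-host daily outbound byte counts from your flow metadata; the host names, the numbers, and the three-standard-deviation threshold are all invented for illustration and are not how any particular NAND product works.

```python
from statistics import mean, stdev


def find_anomalies(baseline, today, threshold=3.0):
    """Flag hosts whose outbound traffic today deviates from their own history.

    baseline:  dict of host -> list of historical daily outbound byte counts
    today:     dict of host -> today's outbound byte count
    threshold: number of standard deviations treated as anomalous (an assumption)
    Returns a list of (host, observed, z_score); z_score is None when the host
    has too little history to build a baseline.
    """
    flagged = []
    for host, observed in today.items():
        history = baseline.get(host, [])
        if len(history) < 2:
            # No usable baseline yet, so surface the host for manual review.
            flagged.append((host, observed, None))
            continue
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all is a deviation.
            if observed != mu:
                flagged.append((host, observed, float("inf")))
            continue
        z = (observed - mu) / sigma
        if abs(z) >= threshold:
            flagged.append((host, observed, round(z, 1)))
    return flagged


# Hypothetical sample data (megabytes sent per day).
baseline = {
    "hr-laptop-17": [120, 130, 110, 125, 115],
    "build-server-02": [900, 950, 920, 880, 910],
}
today = {"hr-laptop-17": 4800, "build-server-02": 930, "new-device": 50}

for host, observed, z in find_anomalies(baseline, today):
    print(host, observed, z)
```

A per-host baseline matters here because a build server moving 900 MB a day is normal while a laptop doing the same is not. A real deployment would also account for time of day, destinations, and protocols, which this sketch ignores.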
Focusing on the impact of the insider threat on the business (and the longevity of the CISO), an insider attack is particularly damaging. An employee insider may have access to all sorts of systems and proprietary data and have the wherewithal (especially for IT insiders) to take down the systems, leave back doors, delete data, and otherwise damage the organization. This use case forces you to question trust because insiders are trusted to do the right thing and have


Papers Posted

It turns out that we are still writing papers and posting them in our research library, even though far less frequently than back in the day. Working with enterprises on their cloud security strategies consumes most of our cycles nowadays. When we’re not assessing clouds or training on clouds or getting into trouble, we’ve published three papers over the past year. I’ve finally posted them to the research library for you to check out.

Data Security in the SaaS Age: In this paper, licensed by AppOmni, we dust off the Data Security Triangle and then proceed to provide a structure to rethink what data security looks like when you don’t control the data in SaaS land.

Security Hygiene: The First Line of Security: Yup, we’re back to beating the drum for sucking less on the fundamentals like security hygiene. But the fact remains that we don’t help ourselves by taking too long to update systems and not doing a good enough job on configuration management. We also go through the impact and benefits of cloud and PaaS to help with these operational challenges. This one has been licensed by Oracle.

Security APIs: The New Application Attack Surface: This paper covers how application architecture and attack surfaces are changing, how application security needs to evolve to deal with these disruptions, and how to empower security in environments where DevOps rules the roost. It’s licensed by Salt Security.

Always happy to get feedback if there is something you like (or don’t like). Add a comment or send us an email.


New Age Network Detection: Collection and Analysis

As we return to our series on New Age Network Detection, let’s revisit our first post. We argued that we’re living through technology disruption on a scale, and at a velocity, we haven’t seen before. Unfortunately, security has failed to keep pace with attackers. The industry’s response has been to move the goalposts, focusing on new shiny tech widgets every couple of years.

We summed it up in that first post:

We have to raise the bar. What we’ve been doing isn’t good enough and hasn’t been for years. We don’t need to throw out our security data. We need to make better use of it. We’ve got to provide visibility into all of the networks (even cloud-based and encrypted ones), minimize false positives, and work through the attackers’ attempts to obfuscate their activity. We need to proactively find the attackers, not wait for them to mess up and trigger an alert.

So that’s the goal – make better use of security data and proactively look for attackers. We even tipped our hat to the ATT&CK framework, which has given us a detailed map of common attacks. But now you have to do something, right? So let’s dig into what that work looks like, and we start first with the raw materials that drive security analytics – data.

Collection

In the olden days – you know, 2012 – life was simpler. If we wanted to capture network telemetry we’d aggregate NetFlow data from routers and switches, supplementing with full packet capture where necessary. All activity was on networks we controlled, so it wasn’t a problem to access that data. But alas, over the past decade several significant changes have shifted how that data can be collected:

Faster Networks: As much as it seems enterprise data centers and networks are relics of yesteryear, many organizations still run big fast networks on-prem. So collection capabilities need to keep up. It’s not enough to capture traffic at 1 Gbit/sec when your data center network is running at 100 Gbit/sec.
So you’ll need to make sure those hardware sensors have enough capacity and throughput to capture data, and in many modern architectures they’ll need to analyze it in real time as well.

Sensor Placement: You don’t only need to worry about north/south traffic – adversaries aren’t necessarily out there. At some point they’ll compromise a local device, at which point you’ll have an insider to deal with, which means you also need to pay attention to east/west (lateral) movement. You’ll need sensors, not just at key choke points for external application traffic, but also on network segments which serve internal constituencies.

Public Cloud: Clearly traffic to and from internal applications is no longer entirely on networks you control. These applications now run in the public cloud, so collection needs to encompass cloud networks. You’ll need to rely on IaaS sensors, which may look like virtual devices running in your cloud networks, or you may be able to take advantage of leading cloud providers’ traffic mirroring facilities.

Web/SaaS Traffic & Remote Users: Adoption of SaaS applications has exploded, along with the population of remote employees, and people are now busily arguing over what an office will look like coming out of the pandemic. That means you might never see the traffic from a remote user to your SaaS application unless you backhaul all that traffic to a collection point you control, which is not the most efficient way to network. Collection in this context involves capturing telemetry from web security and SASE (Secure Access Service Edge) providers, who bring network security (including network detection) out to remote users. You’ll also want to rely on partnerships between your network detection vendor and application-specific telemetry sources, such as CASB and PaaS services.

We should make some finer points on whether you need full packet capture or only metadata for sufficient granularity and context for detection.
We don’t think it’s an either/or proposition. Metadata provides enough depth and detail in most cases, but not all. For instance, if you are looking to understand the payload of an egress session, you need the full packet stream. So make sure you have the option to capture full packets, knowing you will do that sparingly. Embracing more intelligence and automation in network detection enables working off captured metadata routinely, triggering full packet collection on detection of potentially malicious activity or exfiltration.

Be sure to factor in storage costs when determining the most effective collection approach. Metadata is pretty reasonable to store for long periods, but full packets are not. So you’ll want to keep a couple of days or weeks of full captures around for investigating an attack, but you might keep years of metadata.

Another area that warrants a bit more discussion is cloud network architecture. Using a transit network to centralize inter-account and external (both ingress and egress) traffic facilitates network telemetry collection. All traffic moving between environments in your cloud (and back to the data center) runs through the transit network. But for sensitive applications you’ll want to perform targeted collection within the cloud network to pinpoint any potential compromise or application misuse. Again, though, a secure architecture which leverages isolation makes it harder for attackers to access sensitive data in the public cloud.

Dealing with Encryption

Another complication for broad and effective network telemetry collection is that a significant fraction of network traffic is encrypted. So you can’t access the payloads unless you crack the packets, which was much easier with early versions of SSL and TLS. You used to become a Man-in-the-Middle to users: terminating their encrypted sessions, inspecting their payloads, and then re-encrypting and sending the traffic on its way.
Decryption and inspection were resource intensive but effective, especially using service chaining to leverage additional security controls (IPS, email security, DLP, etc.) depending on the result of packet inspection. But that goose has been cooked since the latest version of TLS (1.3) enlisted perfect forward secrecy to break retrospective inspection. This approach issues new keys for each encrypted


New Age Network Detection: Introduction

Like the rest of the technology stack, the enterprise network is undergoing a huge transition. With data stores increasingly in the cloud and connectivity to SaaS providers and applications running in Infrastructure as a Service (IaaS) platforms, a likely permanently remote workforce has new networking requirements. Latency and performance continue to be important, but so does being able to protect employee devices in all locations and provide access to only authorized resources.

Bringing the secure network to the employee is a better way to meet these requirements than forcing the employee onto the secure network. The network offers a secure connection; thus, you no longer backhaul traffic on-prem to run through the corporate web proxy or go through a centralized VPN server. And the operational challenges of running a global network will likely force organizations to embrace managed networking services, allowing them to focus on what rides on top of the network and less on building and operating the pipes.

Using capabilities like a software-defined perimeter (or Zero Trust Network Access, if you like that term better) and intelligent routing gets employees to the resources they need, quickly and efficiently. Pretty compelling, eh?

But alas, it’ll be a long time before we fully move to this new model because of the installed base. Many companies still have a lot of enterprise networking gear, and the CFO said they couldn’t just toss it. Most sensitive corporate data remains on-prem, meaning we’ll still need to maintain interoperability with the data center networks for the foreseeable future. But to be clear, networks will look much different in 5 – 7 years.

As exciting as these new networks may be, you can’t depend on the service provider to find adversaries in your environment. You can’t expect them to track a multi-faceted attack from the employee to the database they targeted as they pivot through various connections, compromised devices, and data stores.
Even if you don’t manage the network, you need to detect and eradicate attackers, and if anything, doing that across these different networks and cloud services makes it even harder.

What’s the urgency?

We’ve been in the security business for close to 30 years, and disruption happens slower than you expect. This Bill Gates quote sums it up nicely:

“We always overestimate the change that will occur in the next two years and underestimate the change that will occur in the next ten. Don’t let yourself be lulled into inaction.”

There is a lot to unpack there. What kind of actions should you be taking?

Shorter term: We’re particularly guilty of overestimating progress because most of the work we do is cloud security assessment and architecture, forcing us to live in the future. Yet, the cloud still makes up a tiny percentage of total workloads. Sure it’s growing fast, probably faster than anything we’ve seen from a technology disruption standpoint. But all the same, it will be years before corporate data centers are not a thing, and we don’t need those enterprise networks anymore. So we’ve got to continue protecting the existing networks, which continue to get faster and more encrypted, complicating detection.

Longer term: If we have underestimated progress over the next decade, that’s nuts since we’ve been pretty clear that how we build, deploy and manage technology will be fundamentally different over that period. If we step into the time machine and go back ten years, the progress is pretty incredible. For security, the APT was just getting started. Ransomware wasn’t a thing. AWS was in its early days, and Azure and GCP weren’t things. That means we need to ensure flexibility and agility in detecting attackers on a mostly cloud-based network, as progress doesn’t just apply to the defenders. Attackers will likewise discover and weaponize new techniques.
We call the evolution to this new networking concept New Age Networking, and in this blog series, we’re going to focus on how network-based detection needs to change to keep pace. We’ll highlight what works, what doesn’t, and how you can continue protecting the old while ensuring that you are ready to secure these new cloud-based networks and services accordingly.

We’d also like to tip our hat to Corelight, the potential licensee of the research when we finish. As a reminder, we write our research incrementally via blog posts and are happy to get feedback when you think we’re full of it. Once the series is complete and feedback considered, we assemble and package the blog posts into a white paper that we post to our research library. This Totally Transparent Research approach ensures we can do impactful research and make sure it gets some real-world unbiased validation.

Moving the Goalposts

Security suffers from the “grass is greener” phenomenon. Given that you are still dealing with attacks all day, the grass over here is brown and dingy. The tools you bought to detect this or protect that have improved your security program, but it seems like you’re in the same place. And the security industry isn’t helping. They spend a ton of marketing dollars to convince you that if you only had [latest shiny object], you’d be able to find the attackers and get home at a reasonable hour.

As an industry, we constantly move the goalposts. Every two years or so, a new way of detecting attackers shows up and promises to change everything. And security professionals want to keep their environments safe, so they get caught up in the excitement. It’s way too easy to forget that the last must-have innovation didn’t have the promised impact. By then, the security industry had reset the target, resulting in constantly deploying new tools to seemingly not progress toward the mission – protecting critical corporate data.

That’s not exactly fair.
The goalposts do need to move to a degree because the attackers continue to innovate. If anything, standing still will cause you to fall farther behind. Our point is that chasing the new, shiny object will disappoint you over time. There is no panacea, silver bullet,


Securing APIs: Empowering Security

As discussed in Application Architecture Disrupted, macro changes including the migration to cloud disrupting the tech stack, application design patterns bringing microservices to the forefront, and DevOps changing dev/release practices dramatically impact building and deploying applications. In this environment, the focus turns to APIs as the fabric that weaves together modern applications. Alas, the increasing importance of APIs also makes them a target.

Historically, enterprises take baby steps to adopt new technologies, experimenting and finding practical boundaries to meet security, reliability, and resilience requirements before fully committing. Requiring a trade-off between security and speed, it may take years to achieve widespread usage of new technologies. But that isn’t fast enough with the expectation that today’s businesses will move fast and break stuff.

As a result, DevOps organizations don’t play by the same rules governing IT adoption of new technologies. In fact, DevOps happened because corporate IT couldn’t move fast enough. These DevOps teams adopt these technologies first and ask for permission later.

There needs to be a middle ground where the organization can implement security as part of the tech stack, ensuring adherence to security policies, including protecting critical data, while moving fast enough to deliver in each application sprint.

The Promise of DevSecOps

Getting organizations aligned to deliver secure applications has always been problematic. Incentives and metrics for development teams focus on delivering code on time and within budget. Security can impact those goals by forcing changes and delaying the shipment of new features. Even when security finds an issue and avoids a crippling data breach, it’s tough to be the bearer of bad news. So even when security is right, they are perceived to be wrong.

Doesn’t DevSecOps change all that?
The idea is to build security into the development and deployment processes from the start and integrate and automate security testing directly in the pipeline, so security becomes everyone’s business. In this manner, security shifts left (yes, another buzzword) and happens earlier in the development cycle. In effect, DevSecOps makes the entire system more secure, right? That’s the promise anyway.

Now, let’s add another factor to increase the potential impact of DevSecOps, and that’s infrastructure as code (IaC). Everything is code in this world, not just the applications but also APIs, networks, servers, load balancers, etc. These DevSecOps concepts can apply to the entirety of the tech stack. Very exciting indeed!

Yet, the reality is a little different than the promise. DevSecOps requires a genuine cultural shift forcing the traditional walls separating dev, ops, and security to fall. Many a DevSecOps initiative gets scuttled due to politics and organizational resistance to change. Of course, fighting against evolution is not a defensible position in the long term, but short-term it certainly complicates things.

Finally, DevSecOps doesn’t mean security becomes an equal partner. The reality remains that security issues are still issues and tend to get lumped together with other features and defects when each application sprint is defined. Security then has to fight to get the changes included in the sprint, which may or may not happen.

How does this relate to API Security, since that’s what we’re talking about, right? It turns out that pretty much every modern development initiative (yes, DevOps) heavily uses APIs. Thus, securely coding and testing the APIs is an integral part of the DevSecOps process. We have to ensure developers have both the proper training and a means to ensure there aren’t issues with the API definitions as the code moves its way through the pipeline.

There’s No Time Like Runtime

Let’s assume that your DevSecOps initiative goes swimmingly.
The DevOps teams get it and have instrumented the CI/CD pipeline to ensure API security policies are tested and enforced before any code deployment. But that’s only half the battle. The deployed code is still at risk for manipulation, misuse, and business logic errors that automated tests won’t necessarily catch in the pipeline. What then? The other half is runtime security, dealing with misuse, drift, human error, or any other issue that violates application (or API) security policies after the code deployment. This requires runtime monitoring to detect potential issues. This API and application security monitoring looks an awful lot like other monitoring techniques. You start by collecting and aggregating data about application/API usage and then watching for signs of misuse. You can (and should) look for clear attack patterns (Indicators of Compromise and Attack), as well as using advanced analytics (machine learning) to see if the application usage varies from a typical usage baseline, potentially indicating malicious intent. So what happens upon discovering a security issue? Who is responsible for fixing it? Is it Ops? Does the developer have to update the code in the template immediately? Security’s role (or lack thereof) in fixing security issues can cause a lot of frustration amongst security folks, especially when the Ops team doesn’t perceive the same level of urgency to address the issue. As we’ve described, DevOps happened because IT wasn’t responsive enough to the business, so the DevOps team certainly doesn’t want to go back to the old ways of waiting for someone in security to get around to fixing their stuff. Additionally, security will bring a contextual perspective that Dev and Ops will miss because they aren’t immersed in security all day, every day. So it works much better when security and DevOps can work together to address these runtime issues. Where is the middle ground? 
It’s a concept that we call guardrails, which are the security policies that the organization cannot violate. We’ve taken to calling them a very technical term – no-no’s – since these are the things that should never happen in a production environment. In the event a guardrail trips, the security team is empowered and expected to fix the issue. Everything else would go into the queue of issues/defects to address in due course by Dev or Ops during a regular sprint. Defining the no-no’s requires careful consideration since it represents a take action now, ask questions later
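The guardrail split described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the policy IDs, the `Finding` fields, and the routing labels are hypothetical, and a real list of no-no’s would come from your own security policies.

```python
from dataclasses import dataclass

# Hypothetical guardrail policy IDs (the "no-no's"); yours will differ.
GUARDRAILS = {
    "unauthenticated-admin-endpoint",
    "pii-in-api-response",
}


@dataclass
class Finding:
    policy_id: str    # which security policy was violated
    api: str          # affected API endpoint
    description: str  # short human-readable summary


def route_finding(finding: Finding) -> str:
    """Decide who acts on a runtime finding, and when."""
    if finding.policy_id in GUARDRAILS:
        # A guardrail tripped: security is empowered to fix it right away.
        return "security-fix-now"
    # Everything else is queued for Dev or Ops in a regular sprint.
    return "sprint-backlog"


findings = [
    Finding("pii-in-api-response", "/v1/customers", "SSN returned in payload"),
    Finding("verbose-error-message", "/v1/orders", "Stack trace in 500 response"),
]
routes = {f.api: route_finding(f) for f in findings}
for api, route in routes.items():
    print(api, "->", route)
```

Keeping the guardrail list short and explicit is the point of the design: anything on it triggers immediate action by security, so every entry should be something the organization has agreed must never happen in production.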


Securing APIs: Modern API Security

As we started the API Security series, we went through how application architecture evolves and how that’s changing the application attack surface. API Security requires more than traditional application security. Traditional application security tactics like SAST/DAST, WAF, API Gateway, and others are necessary but not sufficient. We need to build on top of the existing structures of application security to protect modern applications. So what does API Security look like? We wouldn’t be analysts if we didn’t think in terms of process and lifecycle. Having practiced security for decades, one of the only truisms which held up over time has been visibility, then control. There are a hundred ways to describe it, like “you can’t manage what you can’t see,” and they are right. Let’s use that prism to look at API security, and that means starting with visibility.

API Visibility

The key to any security visibility effort is to figure out what data is needed and then where you can get it. First, start with the APIs that you know about and that are documented. That leads you to the OpenAPI specifications, which provide details on the operations the API supports, the API parameters and functions, authentication and authorization requirements, and assorted other information relevant to API usage. With the documented specifications, you can figure out what the API does and identify potential security issues. Although the reality is developers probably haven’t documented all of the APIs in use. They’re busy shipping code, don’t you know? Kidding aside, you can make the case about how building the documentation as the API is defined is the right way to do things, but that may not happen under the stress of a deadline. So we’ll want to look for other places to identify API usage.
API Gateways: We can and will continue to debate the security usefulness of API gateways, but they provide a central point to manage the performance and authentication of existing APIs, routing requests to the appropriate destinations. That means these gateways see API traffic and provide more data about the use of APIs in the environment.

Application Scanning: Though not the most efficient means of discovering APIs, you can scan each application to enumerate available APIs and determine which interfaces are open and potentially exposed.

Passive Monitoring: Finally, you can also look at the traffic on the network to identify and enumerate API usage based on the data you see flying by. A similar technique monitors networks to identify endpoints and even do vulnerability scanning without requiring agents.

Once you’ve found the APIs, you should ensure that data exposed via the APIs does not violate any security policies. Providing a similar function to data leak prevention (DLP), this capability identifies common private data types (SSN, Account IDs, other PII) and looks for proprietary data exposed via the APIs. Detection is the first step. You’ll need to figure out the proper operational motion once you know sensitive data may be accessible via the API. Who gets a notification, and under what circumstances would you block an API response? Although that’s getting a bit ahead of ourselves. At this point, figuring out potential data exposure remains the priority.

Once you have a handle on the APIs in use and any sensitive data accessible via the APIs, we recommend building and maintaining a comprehensive API inventory. Any new or changed API can be compared to the inventory to quickly determine what’s changed and whether it adheres to security policies. Moreover, this is useful to keep track of the API attack surface to ensure adequate protection. Now speaking of protecting APIs…

Securing the APIs

Protection starts with an understanding of the threats that you face.
We went through some of those attacks in the last post, but selecting the right protection depends on understanding the threat model. For API security, you protect against two main threats: attacks and misuse. Attacks are what you’d see in the OWASP API Top 10. Misuse is attempting to access sensitive data, impact the API’s availability, or steal credentials and take over accounts. You can search on “OWASP top API attacks” to access some sites with detailed descriptions of the OWASP Top 10 attacks along with mitigation techniques, so we’ll focus on the capabilities you need to protect against all of the attacks.

API Scanning: The first step in protecting the API is to make sure it doesn’t have issues in the definition. Basically, you need a static API scanning capability that looks for weak authentication and loose parameter, response, or payload definitions, etc. This scanning capability should reflect the organization’s security policies and trigger automatically within the DevOps pipeline during deployment. Analogous to application security scanning, these API scans can be either static (looking at the API code) or dynamic (sending incorrect data to the API to catch incorrect behavior).

Detection and Blocking: If it looks like an attack, it’s probably an attack, and an API security solution needs to be able to detect and block attacks like those enumerated in the OWASP API security list. You’ll also want the API security solution to explicitly enforce the parameters set in the API contract to ensure authorized use.

Anomaly Detection: Shocking as it is, the new analytics driving better detection of attacks use the same approach as network anomaly detection devices that appeared 20 years ago. Improved math/analytics mean better baselines that allow an API security solution to define normal API activity down to the user or process level, enabling the detection of obfuscated, slow and subtle attacks and discerning innocent activity from malicious intent.
As APIs change, it’s essential to keep the baseline current to maintain this context. As you consider an API security solution, you’ll get pulled into an age-old question of inline (requiring an agent implemented within each micro-service or container) or out of band (monitoring the infrastructure for API activity). Inline solutions enforce the policies and block attacks directly as they deploy within the application’s data path. On the other hand, you need to install the code within each
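The inventory comparison recommended earlier in this post (checking any new or changed API against a maintained inventory) might look something like the following Python sketch. The inventory format, endpoint names, and attributes here are hypothetical, made up purely for illustration; a real inventory would likely be built from OpenAPI specifications and discovery data.

```python
# Hypothetical inventory format: "METHOD path" -> attributes we track.
def diff_inventory(inventory: dict, observed: dict) -> dict:
    """Compare observed APIs against the maintained inventory."""
    # Endpoints seen in the environment but never recorded.
    new = sorted(k for k in observed if k not in inventory)
    # Endpoints recorded but no longer seen.
    missing = sorted(k for k in inventory if k not in observed)
    # Endpoints present in both whose tracked attributes differ.
    changed = sorted(
        k for k in observed if k in inventory and observed[k] != inventory[k]
    )
    return {"new": new, "missing": missing, "changed": changed}


inventory = {
    "GET /v1/accounts/{id}": {"auth": "oauth2", "sensitive": True},
    "POST /v1/payments": {"auth": "oauth2", "sensitive": True},
}
observed = {
    "GET /v1/accounts/{id}": {"auth": "none", "sensitive": True},
    "GET /v1/debug/dump": {"auth": "none", "sensitive": False},
}

report = diff_inventory(inventory, observed)
print(report)
```

In this made-up data, the debug endpoint shows up as new, the payments endpoint as missing, and the accounts endpoint as changed because its authentication requirement no longer matches the inventory. Each of those is a prompt for a policy check, not a verdict on its own.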


Securing APIs: Application Architecture Disrupted

When you think of disruption, the typical image is a tornado coming through and ripping things up, leaving towns leveled and nothing the same moving forward. But disruption can be slow and steady, incremental in the way everything you thought you knew has changed. Securing cloud environments was like that, initially trying to use existing security concepts and controls, which worked well enough. Until they didn’t and forced a re-evaluation of everything that we thought we knew about security. The changes were (and still are for many) challenging, but overall very positive. We see the same type of disruption in how applications are built, deployed, and maintained within most organizations. Macro changes include the migration to cloud disrupting the tech stack, application design patterns bringing microservices to the forefront, and DevOps changing dev/release practices. As we’ve been slowly navigating this sea change, the common thread between these changes is an increasing reliance on application programming interfaces (APIs). From a security standpoint, this new dependence on APIs changes the source of risk – it’s not just the front end under siege from traditional attacks and recon activities that map out backend processes. APIs have quickly emerged as the most attractive and least protected target within these new applications since they have access to critical data and services. Thus, we’ve decided to document this disruption and the impact on how you have to view application security moving forward. We’re happy to introduce our latest blog series called Securing APIs: The New Application Attack Surface. In the series, we’ll go through how application architecture and the attack surface is changing, how application security needs to evolve to deal with these disruptions, and how to empower security in an environment where DevOps rules the roost. Because that is the way. 
Let’s give thanks to Salt Security as the potential licensee of this blog series before we get started. As a refresher for those new around here, we don’t write sponsored papers. We publish research for practitioners that we may license to a vendor at the end of the process. That gives us the flexibility to go where our research takes us without undue influence. It’s a bit of a counter-intuitive model, but we’ve been doing it for 13 years at this point, and it works pretty well.

Application Architecture Today

As we get started, let’s go through how we see application architecture evolving. There isn’t one size fits all regarding architecture, and not all of these aspects may apply to your situation. But we’re pretty sure they will; it’s just a matter of time.

Smaller: First, let’s highlight microservices. This approach breaks down traditional monolithic applications into a set of services woven together through defined APIs. This approach adds modularity (yes, we used to call this reusability of components), flexibility and consistency since your developers don’t need to reinvent the wheel. It’s also heavily dependent on open source components that provide the base for many services.

Faster: With the embrace of DevOps in many application teams, the objective is to eliminate the typical walls between Development and Ops (and security to a point), which creates shared accountability and focuses everyone on not just building but deploying and operating applications at higher velocity with better resilience. A key to making DevOps work is adding automation to manage the deployment process. The automation spans from code check-in, to testing (including security tests), and ultimately the deployment into production. How do you address the CI/CD (continuous integration/continuous deployment) pipeline and all the ancillary services orchestrated by the pipeline? Yup. Through APIs.
Cloud-Native: The computing platform where the applications run also has evolved significantly. Given the requirements (as described above) of modularity, flexibility, and velocity, applications need to run in a more agile infrastructure. It may be public cloud, containerized, serverless, or some combination thereof. When we say cloud-native, it can encompass any of the three, not just containers. But regardless, you interact with the computing platform via (drum roll, please) APIs. And increasingly, your infrastructure is described as code, which increases the application surface for security testing.

Another hallmark of modern application architecture is assembling applications, as opposed to building them. Using pre-built microservices to get started and building the components you need allows you to weave the application together without developing everything. This democratizes technology and allows business professionals to play a more prominent role in building the applications they need, potentially without IT interference. That’s a bit harsh but not too far from the truth.

And to reiterate, what do microservices, DevOps, and the cloud have in common? A reliance on APIs to integrate the components of the application stack. And that makes APIs a pretty sweet target for attackers.

API Attack Surface

As with most things security, protection starts with visibility. You’ve got options to enumerate the API environment, leveraging an inventory (such as a Swagger file repository) or via the discovery of APIs via scanning and network monitoring. But visibility isn’t only for you. The problem is attackers can also use the same techniques to enumerate your API surface. This is especially true given that API requests and responses may travel over accessible networks, and Swagger files are accessible, providing an opportunity for an attacker to discover API parameters and potentially gain access to application data.
We’ll go into a more detailed discussion of API visibility/discovery in the next post.

API Attacks

OWASP has done an excellent job of documenting standard API attacks in their list of OWASP API Top 10. These range from the simple, like randomly changing resource IDs to access other customers’ data (insecure direct object reference) or brute force attacks to identify weak links in API authentication. There are input attacks meant to cause API failures or traditional flaws like buffer overflows. More complicated attacks involve gaming the application’s permission structure by invoking admin-level APIs without proper authorization or authentication. You can also see application defects like excessive data exposure when the application unnecessarily returns full data objects. Finally, you can have
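To make the insecure direct object reference attack mentioned above a little more concrete, here is a minimal Python sketch contrasting a handler that trusts the resource ID in the request with one that checks ownership first. The in-memory `INVOICES` store, the `Forbidden` exception, and both function names are hypothetical stand-ins, not any particular framework’s API.

```python
# Hypothetical in-memory store: resource ID -> record with an owner.
INVOICES = {
    101: {"owner": "alice", "total": 250},
    102: {"owner": "bob", "total": 990},
}


class Forbidden(Exception):
    """Raised when the caller may not access the requested object."""


def get_invoice_insecure(invoice_id: int) -> dict:
    # Vulnerable: trusts whatever ID arrives in the request, so changing
    # 101 to 102 returns another customer's data.
    return INVOICES[invoice_id]


def get_invoice(caller: str, invoice_id: int) -> dict:
    """Return the invoice only if it belongs to the authenticated caller."""
    record = INVOICES.get(invoice_id)
    # Use the same error for "missing" and "not yours" so the response
    # doesn't reveal which IDs exist.
    if record is None or record["owner"] != caller:
        raise Forbidden(f"{caller} may not read invoice {invoice_id}")
    return record


print(get_invoice("alice", 101))
```

The fix is not clever: the check simply has to run on every object access, using the authenticated identity and not anything the client supplies in the request.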


Infrastructure Hygiene: Success and Consistency

We went through the risks and challenges of infrastructure hygiene, and then various approaches for fixing the vulnerabilities. Let’s wrap up the series by seeing how this kind of approach works in practice and how we’ll organize to ensure the consistent and successful execution of an infrastructure patch. Before we dive in, we should reiterate that none of the approaches we’ve offered are mutually exclusive. A patch does eliminate the vulnerability on the component, but the most expedient path to reduce the risk might be a virtual patch. The best long-term solution may involve moving the data layer to a PaaS service. You figure out the best approach on a case-by-case basis, balancing risk, availability, and the willingness to consider refactoring the application.

Quick Win

High-priority vulnerabilities happen all the time, and how you deal with them typically determines the perceived capability/competence of the security team. In this scenario, we’ve got a small financial organization, maybe a regional bank. They have a legacy client/server application handling customer loan data that uses stored procedures heavily for back-end processing. The application team added a front-end web interface in 2008, but it’s been in maintenance mode since then. We know, 1998 called and wants their application back. Still, all the same, when a vendor alert informs the team of a high-profile vulnerability impacting the back-end database, the security team must address the issue.

The first step in our process is risk analysis. Based on a quick analysis of threat intelligence, there is an exploit in the wild, which means doing nothing is not an option. And with the exploit available, time is critical. Next, you need a sense of the application’s importance, described above as having customer loan data, so clearly, it’s essential to the business. Since application usage typically occurs during business hours, a patch can happen after hours.
The strategic direction is to migrate the application to the cloud, but that will take a while, so it’s not anything to figure into this analysis. Next, look at short-term mitigation, needed because the exploit is used in the wild, and the database is somewhat accessible via the web front end. The security team deploys a virtual patch on the perimeter IPS device, which provides a means of mitigating the attack. As another precaution, the team decides to increase monitoring around the database to detect any insider activity that would evade the virtual patch.

The operations team then needs to apply the patch during the next maintenance window. Given the severity of the exploit and the data’s value, you’d typically need to do a high-priority patch. But the virtual patch bought the team some time to test the patch to make sure it doesn’t impact the application. The patch test showed no adverse impact, so operations successfully applied it during the next maintenance window.

The last step involves a strategic review of the process to see if anything should be done differently and better next time. The application is slated to be refactored and moved into the bank’s cloud tenant, but not for 24 months. Does it make sense to increase the priority? Probably not; even if the next vulnerability doesn’t lend itself to a virtual patch, an off-hours emergency update could be done without a significant impact on application availability. As refactoring the application begins, it will make sense to look at moving some of the stored procedures to an app server tier and migrating the data layer to PaaS to reduce both the application’s attack and operational surface.

Organization Alignment

The scenario showed how all of the options for infrastructure hygiene could play together to mitigate the risk of a high-priority database vulnerability effectively.
Several teams were involved in the process, starting with security, which identified the issue, worked through the remediation alternatives, and deployed the virtual patch and additional monitoring capabilities. The IT Ops team played an essential role in managing the testing and application of the database patch. The architecture team weighed in at the end about migrating and refactoring the application in light of the vulnerability. For a process to work consistently, all of these teams need to be aligned and collaborating to ensure the desired outcome – application availability. However, we should mention another group that plays a crucial role in facilitating the process – the Finance team. Finance pays for things like a perimeter device that deploys the virtual patch, as well as a support/maintenance agreement to ensure access to patches, especially for easily forgotten legacy applications. As critical as technical skills remain to keep the infrastructure in top shape, ensuring the technical folks have the resources to do their jobs is just as important. With that, let’s put a bow on the Infrastructure Hygiene series. We’ll be continuing to gather feedback on the research over the next week or so, and then we’ll package it up as a paper. Thanks again to Oracle for potentially licensing the content, and keep an eye out for an upcoming webcast on the topic.


Infrastructure Hygiene: Fixing Vulnerabilities

As discussed in the first post in the Infrastructure Hygiene series, the most basic advice we can give on security is to do the fundamentals well. That doesn’t insulate you from determined and well-funded adversaries or space alien cyber attacks, but it will eliminate the path of least resistance that most attackers take. The blurring of infrastructure as more tech stack components become a mix of on-prem, cloud-based, and managed services further complicates matters. How do you block and tackle well when you have to worry about three different fields and multiple teams playing on each field? Maybe that’s enough of the football analogies. As if that wasn’t enough, now you have no margin for error because attackers have automated the recon for many attacks. So if you leave something exposed, they will find it. They being the bots and scripts always searching the Intertubes for weak links. Although you aren’t reading this to keep hearing about the challenges of doing security, are you? So let’s focus on how to fix these issues.

Fix It Fast and Completely

It may be surprising, but the infrastructure vendors typically issue updates when discovering vulnerabilities in their products. Customers of those products then patch the devices to keep them up to date. We’ve been patching as an industry for a long time. And we at Securosis have been researching patching for almost as long. Feel free to jump in the time machine and check out our seminal work on patching in the original Project Quant. The picture above shows the detailed patching process we defined back in the day. You need to have a reliable, consistent process to patch the infrastructure effectively. We’ll point specifically to the importance of the test and approve step due to the severity of the downside of deploying a patch that takes down an infrastructure component. Yet going through a robust patching process can take anywhere from a couple of days to a month.
Many larger enterprises look to have their patches deployed within a month of release. But in reality, a few weeks may be far too long for a high-profile patch/issue. As such, you’ll need a high-priority patching process, which applies to patches addressing very high-risk vulnerabilities. Part of this process is to establish criteria for triggering the high-priority patching process and which parts of the long process you won’t do. Alternatively, you could look at a virtual patch, which uses (typically) a network security device to block traffic to the vulnerable component based on the attack’s signature. This requires that the attack has an identifiable pattern to build the signature. On the positive, a virtual patch is rapid to deploy and reasonably reliable for attacks with a definite traffic pattern. One of the downsides of this approach is that all traffic destined for the vulnerable component would need to run through the inspection point. If traffic can get directly to the component, the virtual patch is useless. For instance, if a virtual patch was deployed on a perimeter security device to protect a database, an insider with direct access to the database could use the exploit successfully since the patch hasn’t been applied. In this context, insider could also mean an adversary with control of a device within the perimeter. For high-priority vulnerabilities, where you cannot patch either because the patch isn’t available or due to downtime or other maintenance challenges, a virtual patch provides a good short-term alternative. But we’ll make the point again that you aren’t fixing the component, rather hiding it. And with 30 years of experience under our belts, we can definitively tell you that security by obscurity is not a path to success. We don’t believe that these solutions are mutually exclusive. The most secure way to handle infrastructure hygiene is to use both techniques.
Virtual patching can happen almost instantaneously, and when dealing with a new attack with a weaponized exploit already in circulation, time is critical. But given the ease with which the adversary can change a network signature and the reality that it’s increasingly hard to ensure that all traffic goes through an inspection point, deploying a vendor patch is the preferred long-term solution. And speaking of long-term solutions…

Abuse the Shared Responsibilities Model

One of the things about the cloud revolution that is so compelling is the idea of replacing some infrastructure components with platform services (PaaS). We alluded to this in the first post, so let’s dig a bit deeper into how the shared responsibility model can favorably impact your infrastructure hygiene. Firstly, the shared responsibility model is a foundational part of cloud computing and defines that the cloud provider has specific responsibilities. The cloud consumer (you) would also have security responsibilities. Ergo, it’s a shared responsibility situation. The division of responsibilities depends on the service and the delivery model (SaaS or PaaS), but suffice it to say that embracing a PaaS service for an infrastructure component gets you out of the operations business. You don’t need to worry about scaling or maintenance, and that includes security patches. I’m sure you’ll miss the long nights and weekends away from your family running hotfixes on load balancers and databases. Ultimately moving some of the responsibility to a service provider reduces both your attack and your operational surfaces, and that’s a good thing. Long term, strategically using PaaS services will be one of the better ways to reduce your technology stack risk. Though let’s be very clear: using PaaS doesn’t shift accountability. Your PaaS provider may feel bad if they mess something up and will likely refund some of your fees if they violate their service level agreement.
But they won’t be presenting to your board explaining how the situation got screwed up – that would be you.

The Supply Chain

If there is anything we’ve learned from the recent SolarWinds attack and the Target attack from years ago (both mentioned in the first post of the series), it’s that your
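To pull together the trade-offs between vendor patches and virtual patches discussed in this post, here is a rough Python sketch of the decision logic. The field names, step wording, and thresholds are our own simplifying assumptions for illustration; your actual criteria for triggering a high-priority patch will be more involved.

```python
from dataclasses import dataclass


@dataclass
class Vulnerability:
    # All fields are hypothetical inputs from your own risk analysis.
    exploit_in_wild: bool         # is a weaponized exploit circulating?
    vendor_patch_available: bool  # has the vendor shipped a fix?
    attack_has_signature: bool    # identifiable traffic pattern exists?
    all_traffic_inspected: bool   # does every path cross the inspection point?


def plan_remediation(v: Vulnerability) -> list:
    """Return an ordered list of remediation steps (illustrative only)."""
    steps = []
    # A virtual patch only helps when there is a pattern to match on.
    if v.exploit_in_wild and v.attack_has_signature:
        steps.append("deploy virtual patch")
        if not v.all_traffic_inspected:
            # Direct or insider access would bypass the virtual patch.
            steps.append("increase monitoring on paths that bypass inspection")
    # The vendor patch remains the long-term fix either way.
    if v.vendor_patch_available:
        if v.exploit_in_wild:
            steps.append("apply vendor patch via high-priority process")
        else:
            steps.append("apply vendor patch via standard process")
    else:
        steps.append("apply vendor patch when available")
    return steps


plan = plan_remediation(Vulnerability(True, True, True, False))
for step in plan:
    print(step)
```

The sketch deliberately never returns a virtual patch on its own as the final answer: the last step is always the vendor patch, reflecting the point above that a virtual patch hides the component but does not fix it.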


Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
  • Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factors into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
  • Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
  • For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
  • Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.