Defending Against DDoS: Mitigations
By Mike Rothman
Our past two posts discussed network-based Distributed Denial of Service (DDoS) attacks and the tactics used to magnify those attacks to unprecedented scale and volume. Now it’s time to wrap up this series with a discussion of defenses. To understand what you’re up against, let’s take a small excerpt from our Defending Against Denial of Service Attacks paper.
First the obvious: you cannot just throw bandwidth at the problem. Your adversaries likely have an unbounded number of bots at their disposal and are getting smarter at using shared virtual servers and cloud instances to magnify the amount at their disposal. So you can’t just hunker down and ride it out. They likely have a bigger cannon than you can handle. You need to figure out how to deal with a massive amount of traffic, and separate good traffic from bad while maintaining availability.
Your first option is to leverage existing network/security products to address the issue. As we discussed in our introduction, that is not a good strategy because those devices aren’t built to withstand the volumes or tactics involved in a DDoS. Next, you could deploy a purpose-built device on your network to block DDoS traffic before it melts your networks. This is certainly an option, but if your inbound network pipes are saturated, an on-premise device cannot help much – applications will still be unavailable. Finally, you can front-end your networks with a service to scrub traffic before it reaches your network. But this approach is no panacea either – it takes time to move traffic to a scrubbing provider, and during that window you are effectively down.
So the answer is likely a combination of these tactics, deployed in a complementary fashion to give you the best chance to maintain availability.
Before we dig into the different alternatives, we need to acknowledge one other choice: doing nothing. The fact is that many organizations have to go through an exercise after being hit by a DDoS attack, to determine what protections are needed. Given the investment required for any of the alternatives listed above, you have to weigh the cost of downtime against the cost of potentially stopping the attack.
This is another security tradeoff. If you are a frequent or high-profile target then doing nothing isn’t an option. If you got hit with a random attack – which happens when attackers are testing new tactics and code – and you have no reason to believe you will be targeted again, you may be able to get away with doing nothing. Of course you could be wrong, in which case you will suffer more downtime. You need to both make sure all the relevant parties are aware of this choice, and manage expectations so they understand the risk you are accepting in case you do get attacked again.
We will just say we don’t advocate this do-nothing approach, but we do understand that tough decisions need to be made with scarce resources. Assuming you want to put some defenses in place to mitigate the impact of a DDoS, let’s work through the alternatives.
DDoS Defense Devices
These appliances are purpose-built to deal with DoS attacks, and include both optimized IPS-like rules to prevent floods and other network anomalies, and simple web application firewall capabilities to protect against application layer attacks. Additionally, they feature anti-DoS features such as session scalability and embedded IP reputation capabilities, in order to discard traffic from known bots without full inspection.
To understand the role of IP reputation, let’s recall how email connection management devices enabled anti-spam gateways to scale up to handle spam floods. It is computationally expensive to fully inspect every inbound email, so immediately dumping messages from known bad senders focuses inspection on email that might be legitimate to keep mail flowing. The same concept applies here. Keep the latency inherent in checking a cloud-based reputation database in mind – you will want the device to aggressively cache bad IPs to avoid a lengthy cloud lookup for every incoming session.
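To make the caching point concrete, here is a minimal sketch of a reputation cache. It is not how any particular appliance works. The `lookup` callable stands in for a hypothetical slow, cloud-based reputation query, and the TTL values are assumptions chosen only to show that "bad" verdicts are held much longer than "good" ones.

```python
import time
from typing import Callable, Dict, Tuple


class ReputationCache:
    """Caches reputation verdicts so known-bad IPs are dropped without a remote lookup."""

    def __init__(self, lookup: Callable[[str], bool],
                 bad_ttl: float = 3600.0, good_ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._lookup = lookup      # hypothetical remote query: returns True if the IP is bad
        self._bad_ttl = bad_ttl    # assumption: hold "bad" verdicts for a long time
        self._good_ttl = good_ttl  # assumption: re-check "good" verdicts sooner
        self._clock = clock
        self._cache: Dict[str, Tuple[bool, float]] = {}  # ip -> (is_bad, expires_at)

    def is_bad(self, ip: str) -> bool:
        now = self._clock()
        entry = self._cache.get(ip)
        if entry is not None and entry[1] > now:
            return entry[0]         # cache hit: no remote round trip
        verdict = self._lookup(ip)  # miss or expired: pay the lookup latency once
        ttl = self._bad_ttl if verdict else self._good_ttl
        self._cache[ip] = (verdict, now + ttl)
        return verdict
```

A real device would also need to bound the size of this table, since an attacker can spray traffic from a huge number of source addresses.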
For kosher connections which pass the reputation test, these devices additionally enforce limits on inbound connections, govern the rate of application requests, control clients’ request rates, and manage the number of total connections allowed to hit the server or load balancer sitting behind it. Of course these limits must be defined incrementally to avoid shutting down legitimate traffic during peak usage.
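One common way to govern per-client request rates is a token bucket. The sketch below is our illustration of that general technique, not a description of what any specific DDoS device does. The class names, the `rate` and `burst` parameters, and the per-IP keying are all assumptions.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows `rate` requests per second on average, with bursts of up to `burst`."""

    def __init__(self, rate: float, burst: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.burst = burst
        self.tokens = burst          # start full so normal clients are not penalized
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ClientRateLimiter:
    """Keeps one bucket per client IP (hypothetical keying choice)."""

    def __init__(self, rate: float, burst: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate, self.burst, self.clock = rate, burst, clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_ip: str) -> bool:
        bucket = self.buckets.get(client_ip)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, self.clock)
            self.buckets[client_ip] = bucket
        return bucket.allow()
```

Tuning `rate` and `burst` incrementally, as described above, is what keeps a limiter like this from shutting out legitimate users at peak. The per-client table here grows without bound, which a production implementation would have to cap.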
Speed is the name of the game for DDoS defense devices, so make sure yours have sufficient headroom to handle your network pipe. Over-provision to ensure they can handle bursts and keep up with the increasing bandwidth you are sure to bring in over time.
CDN/Web Protection Services
Another popular option is to front-end web applications with a content delivery network or web protection service. This tactic only protects the web applications you route through the CDN, but can scale to handle very large DDoS attacks in a cost-effective manner. However, if the attacker targets other addresses or ports on your network, you’re out of luck – they aren’t protected. DNS servers, for instance, aren’t protected.
We find CDNs effective for handling network-based DDoS in smaller environments with a small external web presence. There are plenty of other benefits to a CDN, including caching and shielding your external IP addresses. But for stopping DDoS attacks a CDN is a limited answer.
The next level up the sophistication (and cost) scale is an external scrubbing center. These services allow you to redirect all your traffic through their network when you are attacked. The switch-over tends to be based on either a proprietary switching protocol (if your perimeter devices or DDoS Defense appliances support the carrier’s signaling protocol) or a BGP request. Once the determination has been made to move traffic to the scrubbing center, there will be a delay while the network converges, before you start receiving clean traffic through a tunnel from the scrubbing center.
The biggest question with a scrubbing center is when to move the traffic. Do it too soon and your resources stay up, but at significant cost. Do it too late and you can suffer additional downtime. Finding that balance is a company-specific decision based on the perceived cost of downtime, compared to the cost and value of the service.
Another blind spot for scrubbing is hit-and-run attacks, when an attacker blasts a site briefly to take it down. Once the victim moves the traffic over to a scrubbing center, the attacker stops, not even trying to take down a scrubber. But the attack has already achieved its goals: disrupted availability and increased latency.
These factors have pushed scrubbing centers to advocate for an always-on approach, where the customer runs all traffic through the scrubbing center all the time. Obviously there is a cost, but if you are a frequent DDoS target or cannot afford downtime for any reason, it may be worth it.
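The downtime-versus-service-cost tradeoff can be roughed out with simple arithmetic. The sketch below is a deliberately crude model, and every number in it is made up for illustration. It assumes a flat annual fee per option and a fixed expected outage per attack, and it ignores per-incident fees, reputational damage, and added latency.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Option:
    name: str
    annual_fee: float                 # hypothetical flat yearly cost of the option
    downtime_hours_per_attack: float  # hypothetical expected outage per attack


def expected_annual_cost(option: Option, attacks_per_year: float,
                         downtime_cost_per_hour: float) -> float:
    """Service fee plus the expected cost of the downtime you still suffer."""
    downtime = attacks_per_year * option.downtime_hours_per_attack * downtime_cost_per_hour
    return option.annual_fee + downtime


def cheapest(options: Iterable[Option], attacks_per_year: float,
             downtime_cost_per_hour: float) -> Option:
    return min(options, key=lambda o: expected_annual_cost(
        o, attacks_per_year, downtime_cost_per_hour))


# Illustrative, invented figures only.
OPTIONS = [
    Option("do nothing", annual_fee=0, downtime_hours_per_attack=8),
    Option("on-demand scrubbing", annual_fee=50_000, downtime_hours_per_attack=1),
    Option("always-on scrubbing", annual_fee=200_000, downtime_hours_per_attack=0),
]
```

With these invented figures and downtime valued at 10,000 per hour, `cheapest(OPTIONS, 0.5, 10_000)` picks doing nothing, one attack per year favors on-demand scrubbing, and twenty attacks per year favors always-on. The point is the shape of the decision, not the specific thresholds.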
All of the above
As we stated in Defending Against DoS attacks, the best answer is often all of the above. Your choice of network-based DoS mitigations inevitably involves trade-offs. It is not good to over-generalize, but most organizations are best served by a hybrid approach, involving both an on-premise appliance and a contract with a CDN or anti-DoS service provider to handle more severe volumetric attacks. It is rarely cost-effective to run all traffic through a scrubbing center constantly, and many DoS attacks target the application layer – in which case you need a customer premise device anyway.
Other Protection Tactics
Given that many DDoS attacks also target DNS (as described in the Attacks post), you will want to make sure your internal DNS infrastructure is protected by front-ending your DNS servers with a DDoS defense device. You will also want some due diligence on your external DNS provider to ensure they have sufficient protections against DDoS, as they will be targeted along with you, and you could be impacted if they fall over.
You don’t want to contribute to the problem yourself, so as a matter of course you should make sure your own NTP servers aren’t responding to public NTP requests (as described by US-CERT). You will want to remediate compromised devices as quickly as practical for many reasons, not least to ensure they don’t blast others with your resources and bandwidth.
The Response Process
A strong underlying process is your best defense against a DDoS attack. Tactics change as attack volumes increase, but if you don’t know what to do when your site goes down, it will be out for a while.
The good news is that the DoS defense process is quite similar to general incident response. We have already published a ton of research on this topic, so check out both our Incident Response Fundamentals series and our React Faster and Better paper. If your incident handling process isn’t where it needs to be yet, start there.
Building off your existing IR process, think about what you need to do as a set of activities: before, during, and after an attack:
- Before: Before an attack, spend time figuring out attack indicators and ensuring you perform sufficient monitoring to provide both adequate warning and enough information to identify the root cause of attacks. You might see increasing bandwidth volumes or a spike in DNS traffic. Perhaps your applications get flaky and fall down, you see server performance issues, or your CDN alerts you to a possible attack. Unfortunately many DDoS attacks come out of nowhere, so you may not know you are under attack until you are down.
- During: How can you restore service as quickly as possible? By identifying the root cause accurately and remediating effectively. So you need to notify the powers that be, assemble your team, and establish responsibilities and accountability. Then focus on identifying root cause, attack vectors, and adversaries to figure out the best way to get the site back up. Restoring service depends on the mitigations in place, discussed above. Optimally your contracted CDN and/or anti-DoS service provider already has a team working on the problem by this point. In case you don’t have one, you can hope the attack doesn’t last long or your ISP can help you. Good luck.
- After: Once the attack has been contained, focus shifts to restoring normal operations, moving traffic back from the scrubbing center, and perhaps loosening anti-DoS/WAF rules. Keep monitoring for trouble. Try to make sure this doesn’t happen again. This involves asking questions… What worked? What didn’t? Who needs to be added to the team? Who just got in the way? This analysis needs to objectively identify the good, the bad, and the ugly. Dig into the attack as well. What controls would have blunted its impact? Would running all your traffic through a scrubbing provider have helped? Did network redirection work quickly enough? Did you get the right level of support from your service provider? Then update your process as needed and implement new controls if necessary.
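The monitoring described in the "Before" step can start with something as simple as comparing each new measurement against a rolling baseline. The sketch below is one assumed approach rather than a recommended detection algorithm. The window size, the 3x threshold, and the `SpikeDetector` name are illustrative choices, and the samples could be bandwidth readings or DNS query counts.

```python
from collections import deque
from statistics import mean


class SpikeDetector:
    """Flags a sample that exceeds `factor` times the mean of recent normal samples."""

    def __init__(self, window: int = 60, factor: float = 3.0, min_samples: int = 10):
        self.samples = deque(maxlen=window)  # rolling window of recent normal readings
        self.factor = factor                 # assumption: 3x baseline counts as a spike
        self.min_samples = min_samples       # do not alert until a baseline exists

    def observe(self, value: float) -> bool:
        is_spike = False
        if len(self.samples) >= self.min_samples:
            baseline = mean(self.samples)
            is_spike = baseline > 0 and value > self.factor * baseline
        if not is_spike:
            # Keep suspected attack traffic out of the baseline.
            self.samples.append(value)
        return is_spike
```

A detector like this only helps with attacks that ramp up. As noted above, many DDoS attacks come out of nowhere, so it complements rather than replaces alerts from your CDN or service provider.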
As we wrap up this series on network-based DDoS, let’s revisit a few key points.
- Today’s DoS attacks encompass network attacks, application attacks, and magnification techniques to confuse defenders and exploit weaknesses in defenses.
- Organizations need a multi-faceted approach to defend against DDoS, which likely involves both deploying DDoS defense equipment on-site and contracting with a service provider (either a scrubbing center or a content delivery network) to handle excessive traffic.
- DoS mitigations do not work in isolation – on-premise devices and services are interdependent for adequate protection, and should communicate with each other to ensure an efficient and transparent transition to the scrubbing service when necessary.
Of course there are trade-offs with DDoS defense, as with everything. Selecting an optimal mix of defensive tactics requires some adversary analysis, an honest and objective assessment of just how much downtime is survivable, and what you are willing to pay to restore service quickly. If a few hours of downtime are survivable, defensive tactics can be very different from situations where no downtime is ever acceptable – which demands more expenditure and much more sophisticated defenses.