Defending Against DoS Attacks: the Process
As we have mentioned throughout this series, a strong underlying process is your best defense against a Denial of Service (DoS) attack. Tactics change and attack volumes increase, but if you don’t know what to do when your site goes down, it will be down for a while.

The good news is that the DoS defense process is a close relative of your general incident response process. We have already done a ton of research on the topic, so check out both our Incident Response Fundamentals series and our React Faster and Better paper. If your incident handling process isn’t where it needs to be, you should start there.

Building off the IR process, think about what you need to do as a set of activities before, during, and after the attack:

Before: Before an attack you spend time figuring out the triggers for an attack, and performing persistent monitoring so you have both sufficient warning and enough information to identify the root cause of the attack. This must happen before the attack, because you only get one chance to collect that data: while things are happening. In Before the Attack we defined a three-step process for these activities: define, discover/baseline, and monitor.

During: How can you contain the damage as quickly as possible? By identifying the root cause accurately and remediating effectively. This involves identifying the attack (Trigger and Escalate), identifying and mobilizing the response team (Size up), and then containing the damage in the heat of battle. During the Attack summarizes these steps.

After: Once the attack has been contained, focus shifts to restoring normal operations (Mop up) and making sure it doesn’t happen again (Investigation and Analysis). This involves a forensics process and some introspection, described in After the Attack.

But there are key differences when dealing with DoS, so let’s amend the process a bit.
We have already talked about what needs to happen before the attack, in terms of controls and architectures to maintain availability in the face of DoS attacks. That may involve network-based approaches, a focus on the application layer, or more likely both.

Before we jump into what needs to happen during the attack, let’s mention the importance of practice. You practice your disaster recovery plan, right? You should practice your incident response plan too, and run a subset of that practice specifically for DoS attacks. The time to discover the gaping holes in your process is not when the site is melting under a volumetric attack. That doesn’t mean you need to blast yourself with 80Gbps of traffic either. But do practice handoffs with the service provider, tuning the anti-DoS gear, and making sure everyone knows their roles and accountabilities for the real thing.

Trigger and Escalate

There are a number of ways you can detect a DoS attack in progress. You could see increasing traffic volumes or a spike in DNS traffic. Perhaps your applications get a bit flaky and fall down, or you see server performance issues. You might get lucky and have your CDN alert you to the attack (you set the CDN to alert on anomalous volumes, right?). Or, more likely, you’ll just lose your site. Increasingly these attacks come out of nowhere, in a synchronized series of activities targeting your network, DNS, and applications. We are big fans of setting thresholds and monitoring everything, but DoS is a bit different in that you may not see it coming despite your best efforts.

Size up

Now your site and/or servers are down, and all hell is likely breaking loose. You need to notify the powers that be, assemble the team, and establish responsibilities and accountabilities. Your responders will also start digging into the attack. They’ll need to identify the root cause, attack vectors, and adversaries, and figure out the best way to get the site back up.
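To make the threshold idea from Trigger and Escalate concrete, here is a minimal sketch of baseline-driven alerting. It assumes you already collect per-metric samples during normal operations (the "discover/baseline" step). The metric names, the sample values, and the three-standard-deviation cutoff are all illustrative assumptions, not recommendations from this series, and a real deployment would more likely lean on the alerting built into your monitoring tools or CDN.

```python
from statistics import mean, stdev


def is_anomalous(baseline, current, sigmas=3.0):
    """Return True when `current` exceeds the baseline mean by more than
    `sigmas` sample standard deviations (an assumed, tunable cutoff)."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    threshold = mean(baseline) + sigmas * stdev(baseline)
    return current > threshold


def check_metrics(baselines, observed, sigmas=3.0):
    """Return the sorted names of observed metrics that exceed their threshold.

    Metrics with no recorded baseline are skipped rather than guessed at.
    """
    return sorted(
        name
        for name, value in observed.items()
        if name in baselines and is_anomalous(baselines[name], value, sigmas)
    )


# Hypothetical baseline samples gathered during normal operations.
baselines = {
    "dns_queries_per_sec": [950, 1000, 1020, 980, 1050],
    "http_requests_per_sec": [4800, 5000, 5100, 4950, 5150],
}

# Hypothetical current readings: DNS traffic has spiked, HTTP has not.
observed = {"dns_queries_per_sec": 9000, "http_requests_per_sec": 5050}

alerts = check_metrics(baselines, observed)
print(alerts)  # ['dns_queries_per_sec']
```

A check like this only helps with attacks that ramp up visibly. As noted above, many DoS attacks arrive with no warning, so treat it as one trigger among several rather than a guarantee of early detection.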
Restore

There is considerable variability in what comes next. It depends on what network and application mitigations are in place. Optimally, your contracted CDN and/or anti-DoS service provider already has a team working on the problem. If it’s an application attack, with a little tuning your anti-DoS appliance can hopefully block it. But hope isn’t a strategy, so you need a plan B, which usually entails redirecting your traffic to a scrubbing center, as we described in Network Defenses.

The biggest decision you’ll face is when to actually redirect the traffic. If the site is totally down, that decision is easy. If it’s an application performance issue (caused by an application or network attack), you need more information, particularly an idea of whether the redirection will even help. In many cases it will, since the service provider will then see the traffic, and they likely have more expertise and can diagnose the issue more effectively. But there will be a lag as the network converges after the changes.

Finally, there is the issue of targeted organizations that have no contract with a scrubbing center. In that case, your best bet is to cold call an anti-DoS provider and hope they can help you. These folks are in the business of fighting DoS, so they will likely be able to help, but do you want to take a chance on that? We don’t, so it makes sense to at least have a conversation with an anti-DoS provider before you are attacked, if only to understand their process and how they can help. Talking to a service provider doesn’t mean you need to contract for their service. It means you know who to call and what to do under fire.

Mop up

You have weathered the storm and your sites are operating normally again. In terms of mopping up, you’ll shunt traffic away from the scrubbing center and perhaps loosen up the anti-DoS appliance/WAF rules. You will keep monitoring for more signs of trouble, and you will probably want to grab a couple of days of sleep to catch up.
Investigate and Analyze

Once you are well rested, don’t fall into the trap of