A few hours after this post goes live, the Verizon Enterprise risk team will release their 2013 Data Breach Investigations Report. This is a watershed year for the report, as they are now up to 19 contributing organizations including law enforcement agencies, multiple computer emergency response teams (CERTs), and even potential competitors. The report covers 47,000 incidents, among which there were 621 confirmed data disclosures. This is the best data set since the start of the report, so it provides the best insight into what is going on out there.

We were fortunate enough to get a preview of the report and permission to post this a few hours before the report is released. In the next 24-72 hours you will see a ton of articles; as analysts we aren’t here to make a story or nab a headline, but to help you get your job done. We offer a very brief overview of the interesting things we saw in the report, but our main focus for this post is to save you a little time in using the results to improve security in your own organization.

The best part this year is that the data reflects a more balanced demographic than in the past, and the Verizon risk team did a great job of breaking things out so you can focus on the pieces that matter to you. The report does an excellent job of showing how different demographics face different security risks, from different attackers, using different attack techniques. Instead of a bunch of numbers jumbled together, you can focus on the incidents most likely to affect your organization based on your size and industry.

You probably knew that from the beginning, but now you have numbers to back you up.

But first:

If you are an information security professional, you must read this report. Don’t make decisions based on news articles, this post, or any other secondary analysis. It’s a quick read, and well worth your time, even if you only skim it.

Got it? There is a ton of good analysis in the report, and no outside summaries will cover the important things you need for making your own risk decisions. Not even ours. We could easily write a longer analysis of the DBIR than the DBIR itself.

Key context

Before we get any deeper, Verizon made two laudable decisions when compiling the report that might cause some hand wringing among those who don’t understand why:

  • They almost completely removed references to lost record counts such as the number of credit card numbers lost. The report is much more diverse this year, and record counts (which are never particularly useful in breach analysis) were just being misused and misunderstood. Only 15% of confirmed incidents had anything close to a measurable lost records count, so it made no sense to mention counts.
  • The report focuses on the 621 confirmed data loss incidents, not the 47,000 total incidents. Another great decision – most organizations have different definitions of ‘incident’, which made data normalization a nightmare. This is the Data Breach Investigations Report, not an analysis of every infected desktop on your network.

These two great decisions make the report much more focused and useful for making risk decisions.

A third piece of context is usually lost in much of the press coverage:

  • When the DBIR says something like “password misuse was involved in an incident”, it means it was one of multiple factors in the incident – not necessarily the root cause. Later in the report they tie in the first step in the chain of attacks used, but you can’t read, “76% of network intrusions exploited weak or stolen credentials” as “76% of incidents were the result of weak or stolen credentials”. Attacks use chains of techniques, and credentials are only one factor. Context really is king because your goal is to break the attack chain at the most efficient and cost-effective point.
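To see why those two readings differ, here is a minimal Python sketch. The incidents and action names are made up for illustration and are not DBIR data. Each incident is an ordered chain of actions; counting every action that appears anywhere in a chain gives very different numbers from counting only the first step.

```python
from collections import Counter

# Hypothetical incidents, each an ordered attack chain (illustrative only, not DBIR data).
incidents = [
    ["phishing", "malware", "stolen_credentials"],
    ["phishing", "stolen_credentials"],
    ["sql_injection", "stolen_credentials"],
    ["stolen_credentials"],
]

def involvement_rate(chains):
    """Share of incidents in which each action appears anywhere in the chain."""
    counts = Counter(action for chain in chains for action in set(chain))
    return {action: n / len(chains) for action, n in counts.items()}

def first_step_rate(chains):
    """Share of incidents in which each action is the first step of the chain."""
    counts = Counter(chain[0] for chain in chains)
    return {action: n / len(chains) for action, n in counts.items()}

involved = involvement_rate(incidents)
first = first_step_rate(incidents)

print(involved["stolen_credentials"])        # involved in every sample chain
print(first.get("stolen_credentials", 0.0))  # but the opening move in only one of four
```

In this toy sample the involvement rates add up to well over 100%, because one incident contributes to several actions at once. That is the sense in which the report's statistics are multi-factor.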

The last piece of context is an understanding of what happens when 19 organizations participate. Some use VERIS (the open incident recording methodology published by Verizon) and others use their own frameworks. The Verizon risk team converts between methodologies as needed, and usually excludes data if there isn’t enough to cover the core needed to merge the data sets.

This means they sometimes have more or less detail on incidents, and they are clear about this in the report.

There is no way to completely avoid survey bias in a sample set like this – incidents must be detected to be reported, and a third party response team or law enforcement must be engaged for Verizon to get the data. This is why, for example, lost and stolen devices are practically nonexistent in this report. You don’t call Verizon or Deloitte for a forensics investigation when a salesperson loses a laptop.

Then again, we know of approximately zero cases where a lost device resulted in fraud. They definitely incur costs due to loss reporting and customer notification, but we can’t find any ties to fraud.

There is one choice we disagree with, and one feature we wish they would drop but which they probably have to keep:

  • The DBIR includes many incidents of ATM skimming and other physical attacks that don’t involve network intrusion. These are less useful to the infosec audience, and we believe the banking community already has these numbers from other places. Tampering with ATMs in order to install skimmers is the vast majority of the ‘Physical’ threat action, which represents 35% of the breaches in the DBIR.
  • Year-over-year trends are nearly worthless now, due to the variety of contributors. It is a very different sample set from last year, the year before, or previous years. Perhaps if they filtered out only Verizon incidents, they could offer more useful trends. But people love these trend charts, despite the big changes in the sample set.

ATM skimming attacks are still data breaches, but the security controls to mitigate them are managed outside information security in most financial institutions. For the most part this doesn’t negatively affect the data too much, but we still think these incidents should be broken out from the rest of the report, much as record counts were removed. To compensate, Verizon does include a number of charts focusing on network intrusions, which are the pieces for information security pros to focus on.

In summary, this report:

  • Focuses on incidents with confirmed data loss.
  • Is based on investigations from 19 external investigation and response organizations.
  • Skews towards the kinds of incidents you would call an outside organization (such as Verizon or law enforcement) to help with.
  • Is multi-variate, so you need to understand the role of attack chains when analyzing the statistics.
  • Contains too many skimming/ATM attacks, but does filter them out in the most important places.

Finally, try to avoid printing the charts. The colors are really close and don’t come out well unless you do full color prints. I know, a pesky practical issue, but Rich is from Boulder and likes to save trees.

How to use the report

The first thing to understand is the two ways to use a report like this – which many of you already know, but we like to pretend we are getting paid by the word.

  1. Use the numbers to inform your executives. Focus on filtering the data for your organization, then cherry-pick from the charts and analysis to assemble your own internal briefing document. Don’t be an idiot and send the entire thing out to non-tech management – it is unlikely even your CIO wants to read most of the report – so focus on the bits that matter to your organization. If they do read the whole thing they will probably ask you to solve ATM skimming. Even if you’re a CISO at a hospital or a power company. Joy!
  2. Within your security and risk team, use it to evaluate your risk based on your industry, size, and the kinds of data you deal with. Then extract the relevant attack trends and techniques, match them against your environment, and see where you can best break attack chains.
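The second use can be sketched in a few lines of Python. Everything here is hypothetical: the `Incident` fields, the sample records, and the `relevant_first_actions` helper are illustrative assumptions, not the VERIS schema or DBIR data. The idea is to filter incidents down to your own industry and size, then rank the initial attack techniques to see where breaking the chain is likely to pay off most.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Incident:
    industry: str
    org_size: str   # "small" or "large" (hypothetical buckets)
    chain: tuple    # ordered actions; the first element is the initial technique

# Made-up sample records for illustration only.
SAMPLE = [
    Incident("retail", "small", ("brute_force", "malware")),
    Incident("retail", "small", ("brute_force", "stolen_credentials")),
    Incident("retail", "large", ("sql_injection", "malware")),
    Incident("manufacturing", "large", ("phishing", "malware", "stolen_credentials")),
    Incident("manufacturing", "large", ("phishing", "stolen_credentials")),
    Incident("manufacturing", "large", ("sql_injection",)),
]

def relevant_first_actions(incidents, industry, org_size):
    """Rank initial techniques among incidents matching the given industry and size."""
    matching = [i for i in incidents if i.industry == industry and i.org_size == org_size]
    return Counter(i.chain[0] for i in matching).most_common()

ranking = relevant_first_actions(SAMPLE, "manufacturing", "large")
print(ranking)
```

With real data you would substitute the figures from the report's industry and size breakouts for the sample records. The point is only that the filtering happens before you draw conclusions.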

Here are some specific tidbits and areas to focus on:

  • There are several box charts to show 3 variables at the same time (e.g., industry, size, and number of incidents). These are generally the best places to tease out and quickly visualize data. Make sure you double check which variables are represented because it can be easy to misread the labels – especially which filters were applied.
  • Figure 3 breaks out network intrusions by industry. Financial services isn’t even in the top 5 but retail is. It also breaks out espionage vs. financial crime vs. activism. This is the key chart for pegging your industry and nature of attacks.
  • Figures 6 and 7 demonstrate that only a few combinations of actor, action, asset, and attribute make up the vast majority of incidents. The key takeaway is that attackers are predictable in the aggregate – not that you can completely ignore outliers.
  • Figure 8 is one of the most useful high-level charts. It shows the differences in attack targets and attack techniques based on the nature of an attack (espionage vs. financial, etc.). It supports many of our industry’s assumptions with data that spans all kinds of attacks, which is awesome.
  • Table 1 summarizes a bunch of threat actor information very well. You can skip a bunch of the other sections and focus on that to start if you fall into one of the included categories.
  • Broad Street is a malware propagation taxonomy developed by Microsoft, which Verizon adopted this year. It appears on page 32 and shows how (among the incidents they could classify) the malware hits an organization and spreads. This is one of the single most useful diagrams in the report, because it shows how attacks chain and propagate.
  • Save yourself some time. Phishing works, almost without fail, and is the top initial attack technique for espionage. User education will only reduce it so far, so design controls around the understanding that phishing succeeds to some degree. See pages 37-38 for the details. If you know your desktops are going to fall to malware from clicks and attachments, perhaps you need to figure out how to design the network so that lateral movement is more difficult, and assume that credentials will be replayed on your network.
  • Figure 35 shows what data is targeted by what threat actor. Very useful, if not surprising.
  • Breach timelines (how long it takes from penetration until you discover an incident) vary widely by industry. Page 51 has the details.
  • Mobile/cloud breaches are mostly hype for now. At least according to this data. Part of the problem, though, is that this data ignores stolen and lost user devices… (See ‘Error’ below, and note the report’s own caveat: “It is important to disclose that physical threats are extremely common but underrepresented in this report. For instance, stolen user devices are less likely to receive a forensic investigation to confirm data compromise or fall under the jurisdiction of our law enforcement contributors.”)

In the end, you will get a better idea of how your industry is successfully attacked. The next step is to map this to your own organization and figure out how to focus your security most effectively and efficiently to stop those kinds of attacks. It won’t be perfect but you have the numbers to justify your decisions and present them to the budget authorities.

For example, if you are in an industry targeted for espionage, you know phishing will be the primary vector (for now, at least). You also know that education alone won’t fix it (again, per the report), and that phishing will be used to drop malware, steal credentials, and then snag data from internal systems. So you can better justify shifting security spending to break this chain, and away from other important projects which are less relevant to your primary risks.

We know, wishful thinking.

A few other points:

  • It is important to note that ‘Error’ is a huge contributor to the 47,000 security incidents (48%), yet it is considered the cause of only 2% of breaches. “Error consists mainly of lost devices, publishing goof-ups, and mis-delivered e-mails, faxes, and documents that potentially expose sensitive information.” Again, people might not call a forensics specialist to confirm that someone published a document on a public web server, or sent a sensitive spreadsheet to an auto-completed email address instead of to the rightful recipient. That doesn’t mean it’s not a mandatory reporting issue if it contains PII, and things can certainly get awkward if pre-release financials slip out.
  • Reading the DBIR leads you to believe that all the internal discovery methods (HIDS, NIDS, Log Review, etc.) are essentially useless (less than 1% each) except for “Financial Audit” and “Reported by user”. This is a scary thing to see continuing, and we need to change this trend. Log review, for example, can be very effective if targeted toward the small subset of assets that are most important (back to bonsai focus rather than just spreading manure over the whole field).
  • Mandiant’s M-Trends 2013 report shows a dramatic improvement in self-discovery of breaches: A whopping 37% of their clients self-discovered breaches in 2012, compared to 6% in 2011. That is an interesting tidbit which we think relates to the organizational maturity of those kinds of organizations, but it doesn’t show up as well in the DBIR.
  • We don’t see NGOs broken out as a demographic, but based on other data we think they are increasingly being targeted, depending on what they do. It would be interesting to see whether future reports support this belief.
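To put the ‘Error’ figures from the first point above in perspective, here is some rough arithmetic in Python using only the numbers quoted in this post. The percentages are rounded, so the resulting counts are approximations, not figures taken from the report.

```python
TOTAL_INCIDENTS = 47_000    # total incidents quoted in this post
CONFIRMED_BREACHES = 621    # confirmed data disclosures quoted in this post

# 'Error' share of all incidents vs. its share of confirmed breaches.
error_incidents = round(0.48 * TOTAL_INCIDENTS)
error_breaches = round(0.02 * CONFIRMED_BREACHES)

print(error_incidents, error_breaches)
```

That works out to roughly 22,560 error incidents against about a dozen confirmed error breaches, which is consistent with the point that most errors never reach a forensics team.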

Hopefully these tidbits will help you take your first pass through the report. We found it very useful, and although it largely validates institutional beliefs, it does so with usable data.

Finally, all these reports are snapshots in time. They can help you orient your defenses but don’t assume attackers will sit back idly – especially if you are in a targeted sector. But every attack path we slow down increases their costs, which is always a good thing.