It’s About The Fraud, Not The Breaches

By Rich

Thanks in large part to the data loss database, there’s recently been some great work on analyzing breaches. I’ve used it myself to produce some slick-looking presentation graphs and call attention to the ever-growing data breach epidemic.

But there’s one problem. Not a little problem, but a big honking leviathan lurking in the deep with a malevolent gleam in its black eyes.

Breach notification statistics don’t tell us anything, at all, about fraud or the real state of data breaches.

The statistics we’re all using are culled from breach notifications: the public declarations made by organizations (or the press) after an incident occurs. All a notification says is that information was lost, stolen, or simply misplaced. Notifications are a tool to warn individuals that their information was exposed, and that perhaps they should take some extra precautions to protect themselves. At least that’s what the regulations say, but the truth is they are mostly a tool to shame companies into following better security practices, while giving exposed customers an excuse to sue them.

But notifications don’t tell us a damn thing about how much fraud is out there, and which exposures result in losses.

(Okay, the one exception is that any notification results in losses for the business that goes through the process.)

In other words, we don’t know which of the myriad exposures we read about daily in the press result in damages to those whose records were lost. Notifications are also self-reported, and I know for a fact there are incidents where companies did not disclose because they didn’t think they’d get caught.

For example, based on the statistics, nearly a third of all breach notifications are the result of lost laptops, computers, and portable media (around 85 million records, out of around 316 million total lost records). About 51 million of those records were the result of two incidents (the VA in the US, and HMRC in the UK). The resulting fraud?

Unknown. No idea. Zip. Nada. In all those cases, I don’t know of a single one where we can tie the fraud to the lost data.

In some cases we really can track back the fraud. TJX is a great example, and the losses may be in the many tens of millions of dollars. ChoicePoint is another example, with 800 cases of identity theft resulting from 163,000 violated records (a number that’s probably really around 500,000, but ChoicePoint limited the scope of their investigation).
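To put rough numbers on the gap between exposure and known fraud, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above; the variable and function names are mine, and it assumes each identity theft case maps to one exposed record, which may not hold. The lost-device fraud count is deliberately left as `None`, since that number simply isn't known.

```python
# Back-of-the-envelope arithmetic using only figures quoted in this post.
# All names are illustrative; none come from a real breach database.

TOTAL_LOST_RECORDS = 316_000_000    # approximate total lost records
LOST_DEVICE_RECORDS = 85_000_000    # lost laptops, computers, portable media
VA_AND_HMRC_RECORDS = 51_000_000    # the two largest lost-device incidents

CHOICEPOINT_FRAUD_CASES = 800
CHOICEPOINT_REPORTED_RECORDS = 163_000
CHOICEPOINT_ESTIMATED_RECORDS = 500_000  # the post's rough guess, not an official count

LOST_DEVICE_FRAUD_CASES = None      # unknown: no fraud has been tied to these records


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def fraud_rate(fraud_cases, exposed_records):
    """Return fraud cases per 100 exposed records, or None if fraud is unknown.

    Assumes one fraud case corresponds to one exposed record.
    """
    if fraud_cases is None:
        return None
    return percent(fraud_cases, exposed_records)


device_share = percent(LOST_DEVICE_RECORDS, TOTAL_LOST_RECORDS)
two_incident_share = percent(VA_AND_HMRC_RECORDS, LOST_DEVICE_RECORDS)
choicepoint_reported_rate = fraud_rate(CHOICEPOINT_FRAUD_CASES, CHOICEPOINT_REPORTED_RECORDS)
choicepoint_estimated_rate = fraud_rate(CHOICEPOINT_FRAUD_CASES, CHOICEPOINT_ESTIMATED_RECORDS)
lost_device_rate = fraud_rate(LOST_DEVICE_FRAUD_CASES, LOST_DEVICE_RECORDS)

print(f"Lost devices as a share of all lost records: {device_share:.1f}%")
print(f"VA + HMRC as a share of lost-device records: {two_incident_share:.1f}%")
print(f"ChoicePoint fraud rate (reported records):   {choicepoint_reported_rate:.2f}%")
print(f"ChoicePoint fraud rate (estimated records):  {choicepoint_estimated_rate:.2f}%")
print("Lost-device fraud rate:",
      "unknown" if lost_device_rate is None else f"{lost_device_rate:.2f}%")
```

On these figures, lost devices account for roughly 27% of lost records, and ChoicePoint's known fraud rate lands somewhere between about 0.16% and 0.5% depending on which record count you believe. The lost-device rate stays unknown, and that missing numerator is the whole problem.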

What we need are fraud statistics, not self-reported breach notification statistics. We do the best with what we have, but according to the notification stats we should all be encrypting laptops before we secure our web applications, yet the few fraud statistics available support the contrary conclusion.

In other words, we do not have the metrics we need to make informed risk management decisions.

This also creates a vicious feedback loop. Notifications result in measurable losses to businesses, driving security spending to prevent the incidents that cause notifications, which may not be the incidents that represent the highest-priority security or loss risks.

When you read these things, especially on the slides shoved down your throat by desperate vendors (it’s usually slide 2 or 3), ask yourself if each one is an exposure, or actual fraud.


[...] do not correlate with fraud. Something else we’ve discussed here before. In short, there isn’t necessary any correlation between a “breach” notification and any ac…. Thus the value of breach notification statistics is limited. A lost backup tape may contain 10 [...]

By The Breach Reporting Dillema

Ahh, this old chestnut.

Let’s look at some pseudo-equations from security textbooks to answer this:

Risk = (Threat x Vulnerability x Asset Value)/Countermeasures

Or in the context of fraud (with apologies to Cendrowski et al):

Fraud = Motive x Opportunity x Rationale

Remove opportunity = remove vulnerabilities.
Remove rationale = reduce perceived transfer value of the data.

As an aside, Richard, don’t you think the key to determining asset value in the context of fraudulent data breaches is to ask "what is the price put on this data by its most malicious user?", thereby assessing the maximum "opportunistic transfer value", or using the "highest bidder" scenario? After all, in an uncontrolled environment, information ultimately will "seek" the highest bidder, regardless of their legitimacy!

So, to measure the threat of fraud, we have to assess the motive. To assess the motive, we have to measure intention. To measure intention, we have to look at audit context (what happened before and after the breach), possibly in the context of historical user activity. To measure that, we can either trawl through audit logs saying "ooh look, they logged on late at night and printed 100 pages, then emailed themselves their mailbox file", or we can employ some sort of heuristic analysis or similar smartness, and instead of wading through audit logs we have to wade through (slightly fewer) false positives.

Surely this is a more sensible way of looking at the problem?

By Paul Owen

[...] fraud with real world incidents stand alone on the peak of the rainbow bridge to metrics nirvana. I’ve written about our need for fraud statistics, not breach statistics, but often feel like I’m just banging my head against the hard, thick walls of big [...]

By New Identity Theft Stats


No, I agree completely, but the current breach statistics ONLY refer to loss of PII. Breach notification stats have absolutely nothing to do with loss of intellectual property.

This post only refers to breach notification stats, which have no correlation with real breaches. Make sense?

Two different problems, and this post only refers to the one where fraud stats really are more important.

By rmogull

Your view is too narrow. You make it sound like the only data worth stealing is information which can be used to defraud an individual.

Company secrets, financial information, plans, developments, code, news, conversations, etc. aren’t used for fraud. However, they can be used for gain and/or profit at the cost of the original owner.

If you just want "fraud" statistics, call up various law enforcement agencies and ask them how many complaints have been filed for "computer" or "Internet" fraud.

From a COMPUSEC perspective, it is still about the "breach".
Metrics are there for a risk assessment. Which again, is why it is about the breach. The "breach" (albeit in slightly different quantifiers) is a major factor for the "cost" numbers and "probability" numbers.

Breach also has other factors. If data is removed from the backend of a secure database system onto a laptop which is then carried outside of the building, is this a breach? How does it change your RA? The cost of loss changes little, but your probability is quite a bit different.

To predict the probability, I’m not going to use metrics from "Fraud"; I am going to use numbers from "Breaches".

Also, there are multipliers you have to concern yourself with. A lost laptop could result in the loss of one bit of data or a billion bits. To prevent this, are you considering the number of bits the criminal obtains, or will you work on ensuring he cannot "BREACH" whatever contains it?

By Aodhhan

[...] Rich Mogull. URL: (click HERE) Rich has talked in the past about how breach reporting is not as useful as reporting on how [...]

By Carnival of the Security Catalyst Community - Apri

One other aspect to consider is that one generally does not change one’s identity information. Hence, that information has long-term value, while our quest for metrics has a short attention span. I can steal your identity information today, and then steal your identity next year. Fine by me. Trying to tie the occurrence of fraud, and the time proximity of that occurrence, to a data leak reinforces the short attention span and assumes that attackers will immediately make use of all the data they steal.

By ds

As long as in some countries (read: the US) one can borrow someone else’s identity by using an SSN and some supporting data (e.g. mother’s maiden name), legislators will have to provide special treatment for "loss" of data (records reported "missing" under breach notification regulation), in order to provide a preemptive solution to *potential* identity theft. We can argue about the cost per "lost record", but the fact is that corporate America will pay whether or not fraud took place. This dictates how the market accepts breach notification statistics (enterprises and consumers) or uses them (vendors). The potential damage dictates the risk. Different identity systems would adopt different risk factoring methods: in the US the risk of identity theft as a result of lost records is higher than in other countries that use physical identification (government-issued photo identification).

Now, we do have some statistics that show the results of targeted attacks (e.g. hacking through web applications). Maybe the industry needs a different factor.

To summarize: I agree that not all breaches are created equal. We need to factor *some* kind of breaches differently. I also agree with you that breach notification statistics don’t tell us anything about fraud. I disagree that it matters the same in all cases or all countries.

By Sharon

The link between breaches and fraud is indeed interesting. A datapoint: New England grocery chain Hannaford Brothers says a security breach has exposed 4.2 million of its customers’ credit- and debit-card numbers to scammers, with 1,800 fraud cases reported so far (as of March 18, 2008). However, even without fraud, "imprudent curiosity" (as experienced by Britney Spears, Barack Obama and others recently) can still cause consumer outrage. Privacy violations in addition to fraud are a very nasty outcome. Maybe that is just my European heritage speaking.

By Dominique Levin


Brilliant article once again.

This sums up quite nicely my argument against Risk assessment methodology.

Risk assessment comes down to "the chance something will happen" times "the cost of the something".

This essentially boils down to a fairly educated guess times another fairly educated guess.

Actually… "the cost of the something" is probably easy to work out because the company can make a pretty good estimation for "somethings" like "our competitors find out about a new product release before it happens" and "our competitors find out how much we pay for raw material A" or "another company gets hold of our customer list".
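The "educated guess times educated guess" point above can be made concrete with a small Python sketch. The ranges below are entirely hypothetical and chosen only for illustration. They assume, in line with the comment, that the cost estimate is fairly tight while the probability estimate is loose.

```python
def expected_loss_range(prob_low, prob_high, cost_low, cost_high):
    """Multiply a probability range by a cost range.

    Returns (lowest, highest) expected loss. Both inputs are estimates,
    so the result is only as good as the weaker of the two guesses.
    """
    if not 0.0 <= prob_low <= prob_high <= 1.0:
        raise ValueError("probabilities must satisfy 0 <= low <= high <= 1")
    if not 0.0 <= cost_low <= cost_high:
        raise ValueError("costs must satisfy 0 <= low <= high")
    return prob_low * cost_low, prob_high * cost_high


# Hypothetical inputs, not taken from any real assessment:
# a 1% to 10% chance of the incident, costing 400,000 to 500,000.
low, high = expected_loss_range(0.01, 0.10, 400_000, 500_000)
spread = high / low  # how many times larger the worst case is than the best case

print(f"Expected loss between {low:,.0f} and {high:,.0f} ({spread:.1f}x spread)")
```

With these made-up inputs the cost estimate varies by only 1.25x, yet the expected loss spans roughly 12.5x. Nearly all of that spread comes from the probability guess, which is the number that shared loss data would help pin down.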

So, our industry really needs to know how data is lost and what happens with it after. This can only be done in an "I’ll show you mine if you show me yours" sharing climate.

By Allen Baranov
