I received a ton of great responses to my initial post looking for survey input on what people want to see in a data security survey. The single biggest request is to research control effectiveness: which tools actually prevent incidents.
Surveys are hard to build, and while I have been involved with a bunch of them, I am definitely not about to call myself an expert. There are people who spend their entire careers building surveys. As I sit here trying to put the question set together, I’m struggling to find the best approach to assessing outcome effectiveness, and figure it’s time to tap the wisdom of the crowd.
To provide context, this is the direction I’m headed in the survey design. My goal is to have the core question set take about 10-15 minutes to answer, which limits what I can do a bit.
Section 1: Demographics
The basics, much of which will be anonymized when we release the raw data.
Section 2: Technology and process usage
I’ll build a multi-select grid to determine which technologies are being considered or used, and at what scale. I took a similar approach in the Project Quant for Patch Management survey, and it seemed to work well. I also want to capture a little of why someone implemented a technology or process. Rather than listing all the elements, here is the general structure:
- Technology/Process
- Not Considering
- Researching
- Evaluating
- Budgeted
- Selected
- Internal Testing
- Proof of Concept
- Initial Deployment
- Protecting Some Critical Assets
- Protecting Most Critical Assets
- Limited General Deployment
- General Deployment
And to capture the primary driver behind the implementation:
- Technology/Process
- Directly Required for Compliance (but not an audit deficiency)
- Compliance Driven (but not required)
- To Address Audit Deficiency
- In Response to a Breach/Incident
- In Response to a Partner/Competitor Breach or Incident
- Internally Motivated (to improve security)
- Cost Savings
- Partner/Contractual Requirement
I know I need to tune these better and add some descriptive text, but as you can see I’m trying to characterize not only what people have bought, but what they are actually using, as well as to what degree and why. Technology examples will include things like network DLP, Full Drive Encryption, Database Activity Monitoring, etc. Process examples will include network segregation, data classification, and content discovery (I will tweak the stages here, because ‘deployment’ isn’t the best term for a process).
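Just to make the grid concrete, here’s a rough sketch of how a single response row from these two grids might be encoded once the data comes back, so adoption stage and primary driver can be cross-tabulated per technology or process. The field names and structure are my own shorthand, not final survey wording:

```python
# Sketch only: my own encoding of the Section 2 grids, not the survey's schema.
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Stage(Enum):
    NOT_CONSIDERING = "Not Considering"
    RESEARCHING = "Researching"
    EVALUATING = "Evaluating"
    BUDGETED = "Budgeted"
    SELECTED = "Selected"
    INTERNAL_TESTING = "Internal Testing"
    PROOF_OF_CONCEPT = "Proof of Concept"
    INITIAL_DEPLOYMENT = "Initial Deployment"
    PROTECTING_SOME_CRITICAL = "Protecting Some Critical Assets"
    PROTECTING_MOST_CRITICAL = "Protecting Most Critical Assets"
    LIMITED_GENERAL = "Limited General Deployment"
    GENERAL = "General Deployment"

class Driver(Enum):
    REQUIRED_FOR_COMPLIANCE = "Directly Required for Compliance"
    COMPLIANCE_DRIVEN = "Compliance Driven (but not required)"
    AUDIT_DEFICIENCY = "To Address Audit Deficiency"
    OWN_INCIDENT = "In Response to a Breach/Incident"
    PARTNER_INCIDENT = "In Response to a Partner/Competitor Breach or Incident"
    INTERNAL = "Internally Motivated (to improve security)"
    COST_SAVINGS = "Cost Savings"
    CONTRACTUAL = "Partner/Contractual Requirement"

@dataclass
class GridRow:
    technology: str            # e.g. "Network DLP", "Database Activity Monitoring"
    stage: Stage
    driver: Optional[Driver]   # driver only makes sense once something is implemented

row = GridRow("Network DLP", Stage.INITIAL_DEPLOYMENT, Driver.OWN_INCIDENT)
```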
Section 3: Control effectiveness
This is the tough one, where I need the most assistance and feedback (and I already appreciate those of you with whom I will be discussing this stuff directly). I’m inclined to structure this in a similar format, but instead of checkboxes use numerical input.
My concern with numerical entry is that I think a lot of people won’t have the numbers available. I could instead use a multi-select with None, Some, or Many, but I really hate that level of fuzziness and hope we can avoid it. Or I can do a combination, with both numerical entry and ranges as options. We’ll also need a time scale: per day, week, month, or year.
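If we do mix numeric entry, fuzzy ranges, and different time scales, the answers will have to be normalized to a common rate before they can be compared. A minimal sketch of what I mean, where the range midpoints are invented placeholders rather than anything I’ve settled on:

```python
# Sketch only: normalize mixed count/period answers to incidents per year.
PER_YEAR = {"day": 365, "week": 52, "month": 12, "year": 1}

# Made-up midpoints for the fuzzy None/Some/Many fallback.
RANGE_MIDPOINTS = {"None": 0, "Some": 5, "Many": 50}

def annualize(count, period):
    """Convert a reported count such as '3 per week' into incidents per year."""
    return count * PER_YEAR[period]

def annualize_fuzzy(label, period):
    """Fallback when a respondent only gives a fuzzy range, not a number."""
    return RANGE_MIDPOINTS[label] * PER_YEAR[period]

print(annualize(3, "week"))              # 156 incidents/year
print(annualize_fuzzy("Some", "month"))  # 60 incidents/year
```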
Finally, one of the tougher areas is that we need to characterize the type of data, its sensitivity/importance, and the potential (or actual) severity of the incidents. This partially kills me, because there are fuzzy elements here I’m not entirely comfortable with, so I will try and constrain them as much as possible using definitions. I’ve been spinning some design options, and trying to capture all this information without taking a billion hours of each respondent’s time isn’t easy. I’m leaning towards breaking severity out into four separate meta-questions, and dropping the low end to focus only on “sensitive” information – which if lost could result in a breach disclosure or other material business harm.
- Major incidents with Personally Identifiable Information or regulated data (PII, credit cards, healthcare data, Social Security Numbers). A major incident is one that could result in a breach notification, material financial harm, or high reputation damage. In other words something that would trigger an incident response process, and involve executive management.
- Major incidents with Intellectual Property (IP). A major incident is one that could result in material financial harm due to loss of competitive advantage, public disclosure, contract violation, etc. Again, something that would trigger incident response, and involve executive management.
- Minor incidents with PII/regulated data. A minor incident would not result in a disclosure, fines, or other serious harm. Something managed within IT, security, and the business unit without executive involvement.
- Minor incidents with IP.
Within each of these categories, we will build our table question to assess the number of incidents and false positive/negative rates (a rough sketch of how these counts might roll up follows the list):
- Technology
- Incidents Detected
- Incidents Blocked
- Incidents Mitigated (incident occurred but loss mitigated)
- Incidents Missed
- False Positives Detected
- Per Day
- Per Month
- Per Year
- N/A
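Here is the rough roll-up I have in mind for those counts. The formulas assume “detected” and “missed” are disjoint (together making up all real incidents), which is my working assumption rather than settled survey wording:

```python
# Sketch only: derived rates from one row of the incident grid.
from dataclasses import dataclass
from typing import Optional

@dataclass
class IncidentCounts:
    detected: int
    blocked: int
    mitigated: int
    missed: int
    false_positives: int

    def miss_rate(self) -> Optional[float]:
        """Share of real incidents the control never saw at all."""
        real = self.detected + self.missed
        return self.missed / real if real else None

    def false_positive_rate(self) -> Optional[float]:
        """Share of alerts that turned out not to be incidents."""
        alerts = self.detected + self.false_positives
        return self.false_positives / alerts if alerts else None

# Hypothetical row: network DLP against major PII incidents, annualized.
dlp_pii_major = IncidentCounts(detected=12, blocked=9, mitigated=2, missed=3, false_positives=40)
print(dlp_pii_major.miss_rate())            # 0.2
print(dlp_pii_major.false_positive_rate())  # ~0.77
```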
There are some other questions I want to work in, but these are the meat of the survey and I am far from convinced I have it structured well. Parts are fuzzier than I’d like, I don’t know how many organizations are mature enough to even address outcomes, and I have a nagging feeling I’m missing something important.
So I could really use your feedback. I’ll fully credit everyone who helps, and you will all get the raw data to perform your own analyses.
Reader interactions
4 Replies to “How to Survey Data Security Outcomes?”
Pablo-
This is excellent. You totally nailed the problem I’ve been struggling with. Those scoping questions may be more subjective than I like, but they are probably required to get the proper context for the effectiveness questions.
So does that mean you also suggest changes in how we try and capture the numerator?
In terms of control effectiveness, I would suggest incorporating another section, aside from ‘number of incidents’, where you ask about unknowns and things they sense are all over the place but have no way of knowing/controlling.
I’ll break out my comment in two parts: 1- “philosophical remarks” and 2- suggestions on how to implement that in your survey
1- “philosophical remarks”
If you think about it, effectiveness is the ability to illustrate/detect risks and prevent bad things from happening. So, in theory, we could think of it as a ratio of “bad things understood/detected” over “all existing bad things that are going on or could go on” (by ‘bad things’ I mean sensitive data being sent to wrong places/people, being left unprotected, etc. – with ‘wrong/bad’ being a highly subjective concept)
So in order to have a good measure of effectiveness we need both the ‘numerator’ (which ties to your question on ‘number of incidents’) and also a ‘denominator’.
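Just to make the numerator/denominator point concrete, a trivial sketch with invented numbers:

```python
# Sketch only: the effectiveness ratio described above.
def effectiveness(bad_things_detected, bad_things_total):
    """'Bad things understood/detected' over 'all existing bad things'."""
    return bad_things_detected / bad_things_total if bad_things_total else None

print(effectiveness(40, 1000))  # 0.04: a large, unknown denominator swamps the numerator
```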
The ‘denominator’ could be hard to get at, because, again, things are highly subjective, and what constitutes ‘sensitive’ changes in the view of not only the security folks, but more importantly, the business. (BTW, I have a slight suggestion on your categories that I include at the bottom of this post)
However, I believe it is important that we get a sense of this ‘denominator’, or at least the perception of it. My own personal opinion, from speaking to select CISOs, is that they feel things are ‘all over the place’ (i.e., the denominator is quite large).
2- Suggestions on how to implement that in your survey
To get at the denominator, I suggest including questions such as
a) Out of the universe of all sensitive data that you believe is in use in your organization, how much of it do you feel is effectively protectable by current technologies?
– most of it
– a majority of it
– some of it
– a small percentage of it
– very little of it
b) do you feel you have a good enough sense of all the information deemed critical by the business that requires data-centric protection?
– highly confident
– somewhat confident
– somewhat unconfident
– highly unconfident
These are just some ideas to illustrate my point
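One hypothetical way to use an answer from question (a), with weights that are entirely my own invention, would be to scale the incidents a respondent actually observes by how much of the sensitive-data universe they believe their controls can even see:

```python
# Sketch only: inflate observed incidents to account for the unmonitored remainder.
COVERAGE_GUESS = {
    "most of it": 0.9,
    "a majority of it": 0.7,
    "some of it": 0.5,
    "a small percentage of it": 0.25,
    "very little of it": 0.1,
}

def adjusted_incident_estimate(observed_incidents, coverage_answer):
    """Rough estimate of total incidents, assuming uniform incident density."""
    return observed_incidents / COVERAGE_GUESS[coverage_answer]

print(adjusted_incident_estimate(40, "some of it"))  # ~80 estimated total incidents
```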
But SmithWill, how the heck do we effectively survey that?
My goal with what I proposed above is to get an idea of which specific technologies and processes are working, which I don’t see in what you suggest. That could be my own myopia and lack of sleep, to be honest.
Can you give an example (even if very short) of how you think we should survey what you suggest? (serious question, I really want to get some valuable/useful results here).
Outcomes should match expectations and policies. Period. It’s as simple as defining which well-defined and approved business applications map to which communication ports at the perimeter. Next, define authorized user access internally and externally. If an unapproved external network is connected to your systems, you’ve got some work to do. Same with internal users poking around aimlessly in the cloud. Lastly, approved content: what is allowed into the network? Web content? Email, and so forth. It’s a matter of matching reality with policy, in SPITE OF all the security contraptions wired together. UTM, SIEM, SPUDS, BUDS, and Fairy Dust can only do so much. At the end of the day a survey should outline all the aforementioned details, and management should be able to fairly quickly and easily say yea or nay to what goes and stays.
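For what it’s worth, that “match reality with policy” check boils down to a simple comparison. A bare-bones sketch with invented applications and ports, just to illustrate the idea:

```python
# Sketch only: compare observed perimeter traffic against the approved policy.
APPROVED = {                      # policy: business application -> allowed ports
    "corporate email": {25, 443},
    "web portal": {443},
}

observed_ports = {25, 443, 3389}  # e.g. pulled from firewall or flow logs

allowed = set().union(*APPROVED.values())
unapproved = observed_ports - allowed
if unapproved:
    print(f"Traffic outside policy on ports: {sorted(unapproved)}")  # [3389]: work to do
```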