Wednesday, May 26, 2010

DB Quant: Discovery And Assessment Metrics (Part 2) Identify Apps

By Adrian Lane

Now that we know where the databases are located, we need to find sensitive data inside them, determine how applications connect to databases, and identify what database features and functions the applications depend on. Applications are often inflexible, requiring particular user accounts or connection types to function properly. They may even be coded to use database features that are considered vulnerabilities by the security team. Data discovery is key, because it’s necessary to know the type and location of sensitive data before controls can be established. The entire scanning process requires special access provided by the owners of the databases, as well as of the platforms and networks that support them.

For some of you in small and medium businesses, especially in cases where you are the sole database administrator, these granular steps will seem like overkill. For mid-to-large enterprises, with hundreds of databases supporting thousands of applications with sensitive data scattered throughout them, these steps are necessary for forming security policies and meeting compliance. Also consider that some of the automated scanning tools behave like a virus or an attacker, requiring both credentials to access the DB and coordination with security countermeasures and staff.

As a reminder, the process is as follows:

  1. Plan
  2. Setup
  3. Identify Dependent Applications
  4. Identify Database Owners
  5. Discover Data
  6. Document


Plan

Variable | Notes
Time to assemble list of databases | Feeds from the Enumerate Databases step
Time to define data types of interest | The sensitive data you want to discover, such as credit card numbers
Time to map locations and schedule scans | Databases will reside on different domains, subnets, etc. This is the time to develop a scanning plan based on location


Setup

Variable | Notes
Capital and time to acquire tools for discovery automation | Optional – DB discovery tools from previous phase may provide this
Time to define patterns, expressions, and signatures | e.g., what sensitive data looks like
Time to contact business units & network staff |
Time to configure discovery tool | Optional
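
As a rough illustration of what "patterns, expressions, and signatures" can mean for a data type like credit card numbers, here is a minimal Python sketch. The regular expression, the function names, and the sample rows are assumptions made for this example, not the rules of any particular discovery tool. It pairs a loose pattern match with a Luhn checksum, which tends to cut down on false positives from other long digit strings.

```python
import re

# Loose pattern: 13 to 16 digits, optionally separated by single spaces or hyphens.
# This is an illustrative assumption, not a complete card-number grammar.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk the digits right to left, doubling every second one.
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_candidates(text: str) -> list:
    """Return digit strings in text that match the pattern and pass Luhn."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


if __name__ == "__main__":
    # Hypothetical column values pulled from a table during a scan.
    sample_rows = [
        "order 1001 paid with 4111 1111 1111 1111",  # widely used test number
        "tracking id 1234567890123456",              # 16 digits, fails Luhn
    ]
    for row in sample_rows:
        print(row, "->", find_card_candidates(row))
```

Adjusting rules and repeating scans, as listed under Discover Data, would in this sketch amount to tightening or loosening `CARD_PATTERN` and re-running it over the same columns.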

Identify Dependent Applications

Variable | Notes
Time to schedule and perform review/run scan |
Time to identify applications using the database | Based on connections and/or service account credentials
Time to catalog application dependencies and connection types | Most items can be discovered without DB credentials
Time to repeat steps | As needed
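
One way to picture "identify applications using the database" is as a grouping exercise over observed connections. The sketch below assumes session records have already been exported into dictionaries, whether from a network monitor or from the database's own session view. The field names `account`, `program`, and `client_host`, along with the sample values, are hypothetical.

```python
from collections import defaultdict


def catalog_applications(sessions):
    """Group observed sessions by (service account, client program).

    Returns a dict mapping each (account, program) pair to the sorted list
    of client hosts it was seen connecting from.
    """
    hosts_by_app = defaultdict(set)
    for session in sessions:
        key = (session["account"], session["program"])
        hosts_by_app[key].add(session["client_host"])
    return {key: sorted(hosts) for key, hosts in hosts_by_app.items()}


if __name__ == "__main__":
    # Hypothetical connection records; real field names depend on the source.
    observed = [
        {"account": "svc_billing", "program": "billing-api", "client_host": "app01"},
        {"account": "svc_billing", "program": "billing-api", "client_host": "app02"},
        {"account": "svc_report", "program": "report-job", "client_host": "batch01"},
    ]
    for (account, program), hosts in catalog_applications(observed).items():
        print(f"{program} as {account} from {', '.join(hosts)}")
```

Each resulting pair is a candidate dependent application to confirm with the business unit; the time spent on that confirmation is what the variables above are meant to capture.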

Identify Database Owners

Variable | Notes
Time to identify database owners | The real-world owner, not just the DBA account name
Time to obtain access and credentials | Usually a dedicated account is established for this analysis

Discover Data

Variable | Notes
Time to schedule and run scan | For automated scans
Time to compile table/schema locations | For manual discovery
Time to examine schema and data | For manual discovery
Time to adjust rules and repeat scans | For automated scans


Document

Variable | Notes
Time to filter results and compile report | Gather data names, types, and location
Time to generate report(s) |

Other Posts in Project Quant for Database Security

  1. An Open Metrics Model for Database Security: Project Quant for Databases
  2. Database Security: Process Framework
  3. Database Security: Planning
  4. Database Security: Planning, Part 2
  5. Database Security: Discover and Assess Databases, Apps, Data
  6. Database Security: Patch
  7. Database Security: Configure
  8. Database Security: Restrict Access
  9. Database Security: Shield
  10. Database Security: Database Activity Monitoring
  11. Database Security: Audit
  12. Database Security: Database Activity Blocking
  13. Database Security: Encryption
  14. Database Security: Data Masking
  15. Database Security: Web App Firewalls
  16. Database Security: Configuration Management
  17. Database Security: Patch Management
  18. Database Security: Change Management
  19. DB Quant: Planning Metrics, Part 1
  20. DB Quant: Planning Metrics, Part 2
  21. DB Quant: Planning Metrics, Part 3
  22. DB Quant: Planning Metrics, Part 4
  23. DB Quant: Discovery Metrics, Part 1, Enumerate Databases

—Adrian Lane

Quick Wins with DLP Presentation

By Rich

Yesterday I gave this presentation as a webcast for McAfee, but somehow my last 8 slides got dropped from the deck. So, as promised, here is a PDF of the slides.

McAfee is hosting the full webcast deck over at their blog. Since we don’t host vendor materials here at Securosis, here is the subset of my slides. (You might still want to check out their full deck, since it also includes content from an end user.)

Presentation: Quick Wins with DLP


Gaming the Tetragon

By Mike Rothman

Rich highlighted a great post from Rocky DeStefano of Visible Risk in today’s Incite:

Blame the addicts – When I was working at Gartner, nothing annoyed me more than those client calls where all they wanted me to do was read them the Magic Quadrant and confirm that yes, that vendor really is in the upper right corner. I could literally hear them checking their “talked to the analyst” box. An essential part of the due diligence process was making sure their vendor was a Leader, even if it was far from the best option for them. I guess no one gets fired for picking the upper right. Rocky DeStefano nails how people see the Magic Quadrant in his Tetragon of Prestidigitation post. Don’t blame the analyst for giving you what you demand – they are just giving you your fix, or you would go someplace else. – RM

Rocky is dead on – there are a number of constituencies that leverage information like the Magic Quadrant, and they all have different perspectives on the report. I don’t need to repeat what Rocky said, but I want to add a little more depth about each of the constituencies and provide some anecdotes from my travels.

To be clear, Gartner (and Forrester, for that matter) place all sorts of caveats on their vendor rankings. They say not to use them to develop a short list, and they want clients to call to discuss their specific issues. But here’s the rub: They know far too many organizations use the MQ as a crutch, either to support their own laziness and stupidity or to play the game and support decisions they’ve already made.

Institutionally they don’t care. As Rich pointed out, (most of) the analysts hate it. But the vendor rankings represent enough revenue that they don’t want to mess with them. Yes, that’s a cynical view, but at the end of the day both of the big IT research shops are public companies and they have to cater to shareholders. And shareholders love licensing 10-page documents for $20K each to 10 vendors.

Rocky uses 3 cases to illuminate his point. The first is the veteran information security professional. Those folks (if they have a clue) know that they’ve got to focus their short list on vendors close to the Leader Quadrant. If not, they’ll spend more time justifying another lesser-ranked vendor than implementing the technology. It’s just not worth the fight. So they don’t. They pick the best vendor from the leader quadrant and move on.

This leads us to the second case, the executive, who basically doesn’t care about the technology, but has a lot of stuff on his/her plate and figures if a vendor is a leader, they must have lots of customers calling Gartner and their stuff can’t be total crap. Most of the time, they’d be right.

And the third case is vendors. Rocky makes some categorizations about the different quadrants, which are mostly accurate. Vendors in the “niche” space (bottom left) don’t play in the large enterprise market, or shouldn’t. Those in the “challenger” quadrant (top left) are usually big companies with products they bundle into broad suites, so the competitiveness of a specific offering is less important.

Those in the “visionary” sector (bottom right) delude themselves into thinking they’ve got a chance. They are small, but Gartner thinks they understand the market. In reality it doesn’t matter because the vast majority of the market – dumb and/or lazy information security professionals – see the MQ like this:

Dumb and Lazy is no way to go through life...

In most enterprise accounts the only vendors with a chance are the ones in the leader quadrant, so placement in this quadrant is critical. I’ve literally had CEOs and Sales VPs take out a ruler and ask why our arch-nemesis was 2mm to the right of our dot. 2 frackin millimeters. You may think I’m kidding, but I’m not.

So many of the high-flying vendors make it their objective to spend whatever resources it takes to get into the leader quadrant. They have customers call into Gartner with inquiries about their selection process (even though the selection is already made) to provide data points about the vendor. Yes, they do that, and the vendors provide talking points to their clients. They show up at the conferences and take full advantage of their 1-on-1 meeting slots. They buy strategy days.

To be clear, you cannot buy a better placement on the MQ. But you can buy access, which gives a vendor a better opportunity to tell their story, which in many cases results in better placement. Sad but true. Vendors can game the system to a degree.

Which is why Rich, Adrian, and I made a solemn blood oath that we at Securosis would never do a vendor ranking. We’d rather focus our efforts on the folks who want advice on how to do their job better. Not those trying to maximize their Tetris time.

—Mike Rothman

Code Re-engineering

By Adrian Lane

I just ran across a really interesting blog post by Joel Spolsky from last April: Things You Should Never Do, Part 1. Actually, the post pissed me off. This is one of those hot-button topics that I have had to deal with several times in my career, and have had to manage in the face of entrenched beliefs. His statement is that you should never rewrite a code base from scratch. The reasoning is “No major firm has ever successfully survived a product rewrite. Just look at Netscape … ” Whatever.

I am a fixer. I was the guy who was able to make code reliable. I was the guy who found and fixed the obscure bugs. As I progressed in my career and started to manage teams of developers, more often than not I was handed the really crummy re-engineering projects because I could fix the problems and make customers happy. Sometimes success is its own penalty.

I have inherited code so bad that bug fixes cost 4x in time and usually created new bugs in the process. I have inherited huge bodies of Java code written entirely as if Java were a 3G procedural language – ignoring the object-oriented paradigm completely. I have been tasked with fixing code that – for a simple true/false comparison – made 12 comparisons, 8 database insertions, and 7 deletions – causing a 180x performance penalty. I have inherited code so bad it broke the compiler. I have inherited code so bad that you could not change a back-end database query without breaking the GUI! It takes a real gift for bad programming to do these things.

There are times when the existing code – all or part – simply needs to be thrown away. There are times that code is so tightly intertwined that you cannot simply fix one piece at a time. And in some cases there are really good business reasons, like your major customers say your code is crap and needs to be thrown away. Bad code can bleed a company to death with lost sales, brand impairment, demoralization, and employee turnover.

That said, I agree with Joel’s basic premise that re-writing your product can kill your company. And I even agree about a lot of the social behaviors he describes that create failure. There is absolutely no reason to believe that the people who developed bad code the first time will not do the same thing the next time. But I don’t agree that you should never rewrite. I don’t agree that it has never been done successfully. I know because I have done it successfully. Twice. Out of three attempts, but hey, I got the important projects right.

We tend not to hear about successful rewrites because the companies that carried them off really don’t want everyone knowing that previous versions were terrible. They would rather focus on happy customers and competitive products. It’s very likely that companies who need to rewrite code will screw up a second time. Honestly, there are a lot more historic rewrite flameouts than success stories. Companies know what they want to fix in the code, but they don’t understand what they need to fix in the company. I contend this is because there are company behaviors that promote failure, and if they did it once, they are likely to do it again. And again. Until, mercifully, the company goes down in flames. There are a lot of reasons why re-architecture and re-implementation projects fail. In no particular order …

  1. Big eyes: You are the chief developer and you hate your current product. You have catalogued everything that is wrong with it and how you would fix it. You have extensive lists of features you would like to implement. You have a grand vision of how this product should function, how it should be architected, and how it will be implemented. This causes your re-engineering effort to fail because you think that you are going to build perfect software, tackle every problem, and build every feature, in the first revision. And you commit to do so, just to get the project green-lighted.
  2. Resources: Your current product sucks. It really sucks. It has atrocious quality and low performance, and is miserable to manage. It’s so freaking bad that customers ask for their money back, and sales falter. This causes your re-engineering effort to fail because there is simply not enough time, and not enough revenue to pay for your rebuild. Not with customers breathing down management’s neck, and investors looking for the quick “liquidity event”. So marketing keeps on marketing, sales keeps on selling, and you keep on supporting the old mess you have.
  3. Bad blood: When your car gets old and dies, you don’t expect someone to give you a new one for free. When your crappy old code no longer supports your customers, in essence you need to pay for new code. Yes, it is unfortunate that you bought a lemon last time, but you need to make additional investments in time and development resources, and fix the problems that led you down the wrong path. Your project fails because management is so bitter about the failure that they muck around with development practices, apply more pressure, and try to get more involved with day-to-day development, when the opposite is needed.
  4. Expectations: Not only is the development team excited at not having to work on the atrocious code you have now, but they are really looking forward to working on a product that has semi-modern design. The whole department is buzzing, and so is management! This causes your re-engineering effort to fail because the Chickens think that not only are you going to deliver perfect software, but you are also going to deliver every feature and function of the old crappy product, plus a handful of new and extraordinary features. And it’s unlikely that management will let you adjust the ship date to accommodate the new demands. If they do, the temptation is to keep working until it’s perfect, but nothing is perfect – this is a good way to keep coding while Rome burns.
  5. Sales: In all the excitement, Sales bragged to a couple major customers about what an amazing new product the development team is building, and it solves all the problems you have today. The customers think this is great, and say “Call us when it’s ready.” Sales grind to a halt. This causes your re-engineering effort to fail because you now have two months to develop what you estimated would take 18.
  6. People: The people who were terrible coders are still on the team because nobody wanted to fire them. The managers who forced releases out the door early, before implementation and QA were completed, are still with the company. The executive who threatens employees’ jobs if they fail to deliver on time is still with the company. The Product Manager who fails to do market research to validate bright ideas is still at the company. The engineering ‘leaders’ with no clue about process or leadership skills are still leading when they should be coding. The effort failed before it began.

Re-engineering efforts can fail for a whole new set of reasons, in addition to whatever wrecked the initial project. And unfortunately rewrites always begin at a disadvantage, because management is already miffed that the last development project failed. Building software is risky, but re-engineering can work. If you want to get it right the second time, you need to perform the same critical evaluation of people and processes that you hopefully did with technology. You will end up overhauling much of the organization, including management, to avoid the technical and leadership failures of the past. If all of this has not scared you off, consider code re-engineering.

—Adrian Lane

Incite 5/26/2010: Funeral for a Friend

By Mike Rothman

I don’t like to think of myself as a sentimental guy. I have very few possessions that I really care about, and I don’t really fall into the nostalgia trap. But I was shaken this week by the demise of a close friend. We were estranged for a while, but about a year ago we got back in touch and now that’s gone.

Lots of miles on this leather...

I know it’s surprising, but I’m talking about my baseball glove, a Wilson A28XX, vintage mid-1980’s. You see, I got this glove from my Dad when I entered little league, some 30+ years ago. It was as big as most of my torso when I got it. The fat left-handed kid always played first base, so I had a kick-ass first baseman’s glove and it served me well. I stopped playing in middle school (something about being too slow as the bases extended to 90 feet), played a bit of intramural in college, and was on a few teams at work through the years.

A few of my buddies here in ATL are pretty serious softball players. They play in a couple leagues and seem to like it. So last year I started playing for my temple’s team in the Sunday morning league with lots of other old Jews. I dug my glove out of the trunk, and amazingly enough it was still very workable. It was broken in perfectly and fit my hand like a glove (pun intended). It was like a magnet – if the ball was within reach, that glove swallowed it and didn’t give it up.

But the glove was showing signs of age. I had replaced the laces in the webbing a few times over the years, and the edges of the leather were starting to fray. Over this weekend the glove had a “leather stroke”, when the webbing fell apart. I could have patched it up a bit and probably made it through the summer season, but I knew the glove was living on borrowed time.

So I made the tough call to put it down. Well, not exactly down, since the leather is already dead, but I went out and got a new glove. Like with a trophy wife, my new glove is very pretty. A black leather Mizuno. No scratches. No imperfections. It even has a sort-of new-car smell. I’ll be breaking it in all week and hopefully it’ll be ready for practice this weekend.

For an anti-nostalgia guy, this was actually hard, and it will be weird taking the field with a new rig. I’m sure I’ll adjust, but I won’t forget.

– Mike

Photo credits: “Leather and Lace” originally uploaded by gfpeck

Incite 4 U

I want to personally thank Rich and the rest of the security bloggers for really kicking it into gear over the past week. Where my feed reader had been barren of substantial conversations and debate for (what seemed like) months, this week I saw way too much to highlight in the Incite. Let’s keep the momentum going. – Mike.

  1. Focus on the problem, not the category – Stepping back from my marketing role has given me the ability to see how ridiculous most of security marketing is. And how we expect the vendors to lead us practitioners out of the woods, and blame them when they find another shiny object to chase. I’m referring to NAC (network access control), and was a bit chagrined by Joel Snyder’s and Shimmy’s attempts to point the finger at Cisco for single-handedly killing the NAC business. It’s a load of crap. To be clear, NAC struggled because it didn’t provide must-have capabilities for customers. Pure and simple. Now clearly Cisco did drive the hype curve for NAC, but amazingly enough end users don’t buy hype. They spend money to solve problems. It’s a cop-out to say that smaller vendors and VCs lost because Cisco didn’t deliver on the promise of NAC. If the technology solved a big enough problem, customers would have found these smaller vendors and Cisco would have had to respond with updated technology. – MR

  2. I can haz your ERP crypto – Christopher Kois noted on his blog that he had ‘broken’ the encryption in Microsoft Dynamics GP, the accounting package in the Dynamics suite that came from the Great Plains acquisition. While encrypting data fields in the database, he noticed odd behavior when altering encrypted data: if he changed a single character, only two bytes of encrypted data changed. With most block ciphers, if you change a single character in the plaintext, you get radically different output. Through trial and error he figured out the encryption used was a simple substitution cipher – and without too much trouble Kois was able to map the substitution keys. While Microsoft Dynamics does run on MS SQL Server, there are some components that still rely upon Pervasive SQL. Christopher’s discovery does not mean that MS SQL Server is secretly using the ancient Caesar Cipher, but rather that some remaining portion of Great Plains does. It does raise some interesting questions: how do you verify sensitive data has been removed from Pervasive? If the data remains in Pervasive, even under a weak cipher, will your data discovery tools find it? Does your discovery tool even recognize Pervasive SQL? – AL

  3. Blame the addicts – When I was working at Gartner, nothing annoyed me more than those client calls where all they wanted me to do was read them the Magic Quadrant and confirm that yes, that vendor really is in the upper right corner. I could literally hear them checking their “talked to the analyst” box. An essential part of the due diligence process was making sure their vendor was a Leader, even if it was far from the best option for them. I guess no one gets fired for picking the upper right. Rocky DeStefano nails how people see the Magic Quadrant in his Tetragon of Prestidigitation post. Don’t blame the analyst for giving you what you demand – they are just giving you your fix, or you would go someplace else. – RM

  4. Compliance and security: brothers in arms – It’s amazing to me that we are still gnashing our teeth over the fact that senior management budgets for compliance and doesn’t give a rat’s ass about security. Also nice to see Anton emerge from his time machine trip back to 2005, and realize that compliance doesn’t provide value. Continuing the riff on AndyITGuy’s rant about compliance vs. security, we have to eliminate that line of thinking. Compliance and security are not at odds. It’s not an either/or proposition. Smart practitioners buy solutions to security problems, which can be positioned and paid for out of the compliance budget. When pitching any security project to senior management, you’ve got no shot unless you can either show how it increases the top line (pretty much impossible), decreases spend (hard), or helps meet a compliance mandate. So stop thinking compliance is the enemy. It’s your friend – your rich friend who needs to pay for all your security stuff. – MR

  5. Give me a Q and an A, and that spells FAIL? – They say it takes years to build credibility, and a minute to lose it. IBM is dealing with some of that in the security space after distributing infected USB sticks at a trade show. I don’t even know what to say. No one thought to actually test the batch of tchotchkes? Even if only to make sure the right content was there? But let’s not focus on the sheer idiocy of IBM here, but on what to do to protect yourself and your organization. First, turn off AutoRun – since that is how most USB stick malware will get executed. Second, don’t be a USB whore. Just because it’s shiny and has the logo of your favorite vendor doesn’t mean you should stick it in your machine. Have a little self-respect, will ya? Maybe the AV vendors (all of whom have detected this malware since 2008) can position themselves as the morning-after pill for promiscuous USB use. Or device control software can be positioned as a USB condom. Ah, the possibilities are endless. – MR

  6. That’s not a bus – it’s a steamroller – I talked with a client this week, who was struggling to maintain security controls while adopting cloud computing. They are moving to a hosted email system, but in the process may lose their DLP solution. It’s a problem they could have planned around, but decisions were made without security involved at the right level. Another client had a similar issue where they traded off their DLP so they could switch to cloud-based web content security. DanO over at Techdulla raises some similar issues as he reminds us that no matter what we think, security folks will likely be held responsible, even if the data has been shipped to the cloud. Steamrollers and buses are easy to dodge, but only if you keep your eyes open and spot them early enough. – RM

  7. Digging a grave for Brightmail? – We don’t publish a lot on anti-spam technologies, even though email and content security are core coverage areas for us. It is just not very interesting to discuss the meaningful differences between 99.2% and 99.6% effectiveness, or what it means when vendors swap positions from month to month depending upon the spam technique du jour. It is an unending cat-and-mouse game between spammers and new techniques to detect and block email spam, and things fluctuate very quickly. That said, every now and again we run across something interesting, such as Symantec Brightmail Gateway Decertified by ICSA labs, because they dropped below 97% effectiveness. It is actually big news when a major email security vendor “stops meeting one or more requirements, or is no longer in daily testing”. But as I have not seen an EOL announcement, it looks like this is a rather passive-aggressive way of notifying customers that they are moving away from supporting Brightmail anti-spam and moving forward with MessageLabs’ service-based solution. Unless of course the Brightmail anti-spam team was on vacation this week, and they accidentally fell below 97% success, but I am betting this was at least half intentional. It’s still somewhat surprising, as I assumed there were still a handful of ASPs using the Brightmail engine. It will be interesting to see how Symantec covers this in press releases in the coming weeks. – AL

  8. Roland Garros it ain’t: the IT certification racket – I’ve ranted about the value of certifications a lot. But I never miss the opportunity to poke fun at the entire certification value chain. Case in point: Etherealmind’s observation about the joke of CCIE certification. Most vendor certifications fall into the same category. It’s about passing the test so the VAR can say they have X% staff certified, or the IT shop can show proficiency with their key vendors. Unfortunately most of the training isn’t designed to actually teach anything; it’s designed to get students past the test. As Rich said about security awareness training, we need a “no security professional left behind” program to ensure folks doing things are actually competent. I know – details, details. But it won’t happen – as long as hiring managers focus on who has the paper rather than what they know, it’ll be the same old, same old. – MR
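
The observation in item 2, that a one-character change in the plaintext altered only a small, fixed part of the ciphertext, is easy to reproduce with a toy example. The sketch below is not the Dynamics GP scheme; the substitution table, seed, and sample strings are assumptions for illustration. Because the Python standard library has no block cipher, SHA-256 (a hash, not a cipher) stands in purely to show what well-diffused output looks like by comparison.

```python
import hashlib
import random


def make_substitution_table(seed):
    """Build a fixed byte-for-byte substitution table (a toy cipher)."""
    values = list(range(256))
    random.Random(seed).shuffle(values)
    return dict(enumerate(values))


def substitute(plaintext, table):
    """'Encrypt' by replacing each byte independently using the table."""
    return bytes(table[b] for b in plaintext)


def differing_positions(a, b):
    """Count positions at which two equal-length byte strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


if __name__ == "__main__":
    table = make_substitution_table(seed=2010)  # arbitrary seed for the demo
    first, second = b"PASSWORD1", b"PASSWORD2"  # differ in one character

    weak = differing_positions(substitute(first, table), substitute(second, table))
    diffused = differing_positions(
        hashlib.sha256(first).digest(), hashlib.sha256(second).digest()
    )
    print("substitution cipher bytes changed:", weak)
    print("SHA-256 digest bytes changed (stand-in for diffusion):", diffused)
```

In this toy each character maps to one byte, so the exact count differs from what Kois reported; the point is the locality of the change, which is what gives a substitution cipher away.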

—Mike Rothman

Tuesday, May 25, 2010

Understanding and Selecting SIEM/LM: Data Collection

By Adrian Lane

The first four posts in our SIEM series dealt with understanding what SIEM is, and what problems it solves. Now we move into how to select the right product/solution/service for your organization, and that involves digging into the technology behind SIEM and log management platforms. We start with the foundation of every SIEM and Log Management platform: data collection. This is where we collect data from the dozens of different types of devices and applications we monitor. ‘Data’ has a pretty broad meaning – here it typically refers to event and log records but can also include flow records, configuration data, SQL queries, and any other type of standard data we want to pump into the platform for analysis.

It may sound easy, but being able to gather data from every hardware and software vendor on the planet in a scalable and reliable fashion is incredibly difficult. With over 20 vendors in the Log Management and SIEM space, and each vendor using different terms to differentiate their products, it gets very confusing. In this series we will define vendor-neutral terms to describe the technical underpinnings and components of log data collection, to level-set what you really need to worry about. In fact, while log files are what is commonly collected, we will use the term “data collection”, as we recommend gathering more than just log files.

Data Collection Overview

Conceptually, data collection is very simple: we just gather the events from different devices and applications on our network to understand what is going on. Each device generates an event each time something happens, and collects the events into a single repository known as a log file (although it could actually be a database). There are only four components to discuss for data collection, and each one provides a pretty straightforward function. Here are the functional components:

Fig 1. Agent data collector

Fig 2. Direct connections to the device

Fig 3. Log file collection

  1. Source: There are many different sources – including applications, operating systems, firewalls, routers & switches, intrusion detection systems, access control software, and virtual machines – that generate data. We can even collect network traffic, either directly from the network or from routers that support NetFlow-style feeds.
  2. Data: This is the artifact telling us what actually happened. The data could be an event, which is nothing more than a finite number of data elements to describe what happened. For example, this might record someone logging into the system or a service failure. Minimum event data includes the network address, port number, device/host name, service type, operation being performed, result of the operation (success or error code), user who performed the operation, and timestamp. Or the data might just be configuration information or device status. In practice, event logs are pretty consistent across different sources – they all provide this basic information. But each offers additional data, including context. Additional data types may include things such as NetFlow records and configuration files. In practice, most of the data gathered will be events and logs, but we don’t want to arbitrarily restrict our scope.
  3. Collector: This connects to a source device, directly or indirectly, to collect the events. Collectors take different forms: they can be agents residing on the source device (Fig. 1), remote code communicating over the network directly with the device (Fig. 2), an agent writing to a dedicated log repository (Fig. 3), or receivers accepting a log file stream. A collector may be provided by the SIEM vendor or a third party (normally the vendor of the device being monitored). Further, the collector functions differently, depending upon the idiosyncrasies of the device. In most cases the source need only be configured once, and events will be pushed directly to the collector or into a neutral log file read by it. In some cases, the collector must continually request data be sent, polling the source at regular intervals.
  4. Protocol: This is how the collector communicates with the source. This is an oversimplification, of course, but think of it as a language or dialect the two agree upon for communicating events. Unfortunately there are lots of them! Sometimes the collector uses an API to communicate directly with the source (e.g., OPSEC LEA APIs, MS WMI, RPC, or SDEE). Sometimes events are streamed over networking protocols such as SNMP, NetFlow, or IPFIX. Sometimes the source drops events into a common file/record format, such as syslog, Windows Event Log, or syslog-ng, which is then read by the collector. Additionally, third party applications such as Lasso and Snare provide these features as a service.
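
To make the "minimum event data" from the Data item concrete, here is a minimal sketch of an event record in Python. The class name, field names, and sample values are assumptions made for illustration; they do not come from any particular SIEM product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict


@dataclass(frozen=True)
class Event:
    """Minimum event fields listed in the Data item (names are illustrative)."""
    network_address: str
    port: int
    host_name: str
    service_type: str
    operation: str
    result: str          # "success" or a source-specific error code
    user: str
    timestamp: datetime
    # Source-specific context that varies from device to device.
    extra: Dict[str, str] = field(default_factory=dict)


def is_failure(event: Event) -> bool:
    """Treat any result other than 'success' as a failed operation."""
    return event.result.lower() != "success"


# Hypothetical sample: someone logging into a system.
login = Event(
    network_address="192.0.2.10",
    port=22,
    host_name="db01",
    service_type="sshd",
    operation="login",
    result="success",
    user="alice",
    timestamp=datetime(2010, 5, 26, 12, 0, tzinfo=timezone.utc),
)
```

Keeping the shared fields fixed and pushing everything else into `extra` is one way to hold the common core steady while each source adds its own context.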

Data collection is conceptually simple, but the thousands of potential variations make implementation a complex mess. It resembles a United Nations meeting: you have a whole bunch of people talking in different languages, each with a particular agenda of items they feel are important, and different ways they want to communicate information. Some are loquacious and won’t shut up, while others need to be poked and prodded just to extract the simplest information. In a nutshell, it’s up to the SIEM and Log Management platforms to act as the interpreters, gathering the information and putting it into some useful form.


Each model for data collection has trade-offs. Agents can be a powerful proxy, allowing the SIEM platform to use robust (sometimes proprietary) connection protocols to safely and reliably move information off devices; in this scenario device setup and configuration is handled during agent installation. Agents can also take full advantage of native device features, and can tune and filter the event stream. But agents have fallen out of favor somewhat. SIEM installations cover thousands of devices, which means agents can be a maintenance nightmare, requiring considerable time to install and maintain. Further, agents’ processing and data storage requirements on the device can affect stability and performance. Finally, most agents require administrative access, which creates an additional security concern on each device.

Another common technique streams events to log files, such as syslog or the Windows Event Log. These may reside on the device, be streamed to another server, or be sent directly to the log management system. The benefit of this method is that data arrives already formatted using a common protocol and layout. Further, if the events are collected in a file, this removes concerns about synchronization issues and uncollected events lost prior to collection – both problems when working directly with some devices. Unfortunately general-purpose logging systems require some data normalization, which can lose detail.
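
As a rough illustration of that normalization step, the sketch below parses one BSD-style syslog line into a handful of common fields. The regular expression, the field names, and the sample line are assumptions for the example; real syslog feeds vary far more than this pattern allows.

```python
import re
from typing import Dict, Optional

# Assumed layout: "<PRI>Mon DD HH:MM:SS host tag[pid]: message"
SYSLOG_LINE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<tag>[^\[:\s]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)


def normalize(line: str) -> Optional[Dict[str, object]]:
    """Map one syslog line onto common fields; return None if it does not fit."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None
    pri = int(match.group("pri"))
    return {
        # The syslog priority value encodes facility * 8 + severity.
        "facility": pri // 8,
        "severity": pri % 8,
        "timestamp": match.group("timestamp"),
        "host": match.group("host"),
        "service": match.group("tag"),
        # Everything else stays as free text.
        "message": match.group("message"),
    }


# Hypothetical sample line.
record = normalize(
    "<38>May 26 12:00:01 db01 sshd[4321]: "
    "Accepted password for alice from 192.0.2.10 port 51234 ssh2"
)
```

Note what happens to the user name and source address: they remain buried in the free-text `message` field. That is the loss of detail in miniature, since anything the common layout has no slot for must be re-parsed later or is effectively gone.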

Some older devices, especially dedicated control systems, simply do not offer full-feature logging, and require API-level integration to collect events. These specialized devices are much more difficult to work with, and require dedicated full-time connections to collect event trails, creating both a maintenance nightmare and a performance penalty on the devices. In these cases you do not have a choice, but need a synchronous connection in order to capture events.

Understand that data collection is not an either/or proposition. Depending on the breadth of your monitoring efforts, you may need to use every technique on some subset of device types and applications. Go into the project with your eyes open, recognizing the different types of collection, and the associated nuances and complexity of each.

In the next post we’ll talk about what to do with all this collected data: prepare it for analysis, which means normalization.

—Adrian Lane

A Phish Called Tabby

By Mike Rothman

Thanks to Aza Raskin, this week we learned of a new phishing attack, dubbed “tabnabbing” by Brian Krebs. It opens a tab (unbeknownst to the user), changes the favicon, and does a great job of impersonating a web page – or a bank account, or any other phishing target. Through the magic of JavaScript, the tabs can be controlled and the attack made very hard to detect since it preys on the familiarity of users with common webmail and banking interfaces.

So what do you do? You can run NoScript in your Firefox browser to prevent the JavaScript from running (unless you idiotically allowed JavaScript on a compromised page). Another option is leveraging a password manager. Both Rich and I have professed our love for 1Password on the Mac. 1Password puts a button in your browser, and when logging in brings up a choice of credentials for that specific domain to automatically fill in the form. So when I go to Gmail, logging in is as easy as choosing one of the 4 separate logins I use on google.com domains.

Now if I navigate to the phishing site, which looks exactly like Gmail, I’d still be protected. 1Password would not show me any stored logins for that domain, since presumably the phisher must use a different domain. This isn’t foolproof because the phisher could compromise the main domain, host the page there, and then I’m hosed. I could also manually open up 1Password and copy/paste the login credentials, but that’s pretty unlikely. I’d instantly know something was funky if my logins were not accessible, and I’d investigate. Both of these scenarios are edge cases and I believe in a majority of situations I’d be protected.
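
The domain check that provides this protection can be sketched in a few lines of Python. This is not how 1Password is implemented; the vault contents, function name, and URLs are hypothetical, and the only point is that a lookalike page on another domain gets no saved logins offered.

```python
from typing import Dict, List
from urllib.parse import urlsplit

# Hypothetical vault: registered domain -> names of saved logins.
VAULT: Dict[str, List[str]] = {
    "google.com": ["personal", "work"],
}


def logins_for(url: str, vault: Dict[str, List[str]]) -> List[str]:
    """Return saved logins whose domain matches the page's host."""
    host = (urlsplit(url).hostname or "").lower()
    matches: List[str] = []
    for domain, logins in vault.items():
        # Accept the domain itself or a true subdomain, never a lookalike.
        if host == domain or host.endswith("." + domain):
            matches.extend(logins)
    return matches


real = logins_for("https://mail.google.com/mail/", VAULT)
fake = logins_for("https://google.com.evil.example/login", VAULT)
```

Here `real` contains both saved logins while `fake` is empty, no matter how convincing the phishing page looks. As noted above, the check does nothing if the phisher manages to host the page on the genuine domain.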

I’m not familiar with password managers on Windows, but if they have similar capabilities, we highly recommend you use one. So not only can I use an extremely long password on each sensitive site, I get some phishing protection as a bonus. Nice.

—Mike Rothman

Thoughts on Diversity and False Diversity

By Rich

Mike Bailey highlights a key problem with web applications in his post on diversity. Having dealt with these issues as a web developer (a long time ago), I want to add a little color.

We tend to talk about diversity as being good, usually with biological models and discussions of monoculture. I think Dan Geer was the first to call out the dangers of using only a single computing platform, since one exploit then has the capability of taking down your entire organization.

But the heterogeneous/homogeneous tradeoffs aren’t so simple. Diversity reduces the risk of a catastrophic single point of failure, but it does so by increasing the attack surface and potential points of failure.

Limited diversity is good for something like desktop operating systems. A little platform diversity can keep you running when something very bad hits the primary platform and takes those systems down. The trade-off is that you now have multiple profiles to protect, with a greater number of total potential vulnerabilities. For example, the Air Force standardized their Windows platforms to reduce patching costs and time.

What we need, on the OS side, is limited diversity. A few standard platform profiles that strike the balance between reducing the risk that a single problem will take us completely down, while maintaining manageability through standardization.

But back to Mike’s post and web applications…

With web applications what we mostly see is false diversity. The application itself is a monolithic entity, but use of multiple frameworks and components only increases the potential attack surface.

With desktop operating systems, diversity means a hole in one won’t take them all down. With web applications, use of multiple languages/frameworks and even platforms increases the number of potential vulnerabilities, since exploitation of any one of those components can generally take down/expose the entire application.

When I used to develop apps, like every web developer at the time, I would often use a hodgepodge of different languages, components, widgets, etc. Security wasn’t the same problem then that it is now, but early on I learned that the more different things I used, the harder it was to maintain my app over time. So I tended towards standardization as much as possible. We’re doing the same thing with our sooper sekret project here at Securosis – sticking to as few base components as we can, which we will then secure as well as we can.

What Mike really brings to the table is the concept of how to create real diversity within web applications, as opposed to false diversity. Read his post, which includes things like centralized security services and application boundaries. Since with web applications we don’t control the presentation layer (the web browser, which is a ‘standard’ client designed to accept input from nearly anything out there), new and interesting boundary issues are introduced – like XSS and CSRF.

Adrian and I talk about this when we advise clients to separate out encryption from both the application and the database, or use tokenization. Those architectures increase diversity and boundaries, but that’s very different than using 8 languages and widgets to build your web app.


Monday, May 24, 2010

DB Quant: Discovery And Assessment Metrics (Part 1) Enumerate Databases

By Adrian Lane

Now that we’ve finished detailing the general planning metrics, it’s time to move on to measuring the costs associated with discovery and assessment. First on that list is enumeration of databases, which in layman’s terms means finding all the databases you need to secure. Unlike the Planning phase, the majority of these tasks are technical, so the work will be performed by IT and DBA groups. Larger organizations commonly rely upon tools to automate discovery and assessment, so we will include these as optional costs for the analysis.

As a reminder, the process we specified is as follows:

  1. Plan
  2. Setup
  3. Enumerate
  4. Document


Plan

Variable | Notes
Time to define scope and requirements | What databases/networks are in scope; what information to collect
Time to identify supporting tools to automate discovery |
Time to identify business units & network staff | Who owns the resources and provides information
Time to map domains and schedule scans |


Setup

Variable | Notes
Capital and time costs to acquire tools for discovery automation | Optional
Time to contact business units & network staff |
Time to configure discovery tool | Optional
Time to contact database owners and obtain credentials and access | As needed, depending on the tool and process selected


Enumerate

Variable | Notes
Time to run active scan |
Time to manually discover databases | Optional, if automated tool not used. This could be a technical process, or manual contact with business units
Time to run scan/passive scan | Automated port scan; network flow analysis
Time to contact business units | ID databases discovered
Time to manually login, confirm scan, & filter results | Optional
Time to repeat steps | As needed


Document

Variable | Notes
Time to save scan results |
Time to generate report(s) |
Time to generate baseline of databases for future comparisons | Cataloged by type, version, location, and ownership
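
To show why a baseline cataloged by type, version, location, and ownership is worth generating, here is a small sketch that compares a later scan against it. The record layout, the "host:port" keys, and the sample entries are all assumptions for the example, not output from any discovery tool.

```python
from typing import Dict, List, NamedTuple, Tuple


class DbRecord(NamedTuple):
    db_type: str
    version: str
    owner: str


# Catalog keyed by location, written here as "host:port".
Catalog = Dict[str, DbRecord]


def compare(baseline: Catalog, scan: Catalog) -> Tuple[List[str], List[str], List[str]]:
    """Return locations that were added, removed, or changed since the baseline."""
    added = sorted(set(scan) - set(baseline))
    removed = sorted(set(baseline) - set(scan))
    changed = sorted(
        loc for loc in set(baseline) & set(scan) if baseline[loc] != scan[loc]
    )
    return added, removed, changed


# Hypothetical baseline and follow-up scan.
baseline: Catalog = {
    "db01:1521": DbRecord("Oracle", "10.2", "Finance"),
    "db02:1433": DbRecord("SQL Server", "2005", "HR"),
}
latest: Catalog = {
    "db01:1521": DbRecord("Oracle", "11.1", "Finance"),
    "db03:3306": DbRecord("MySQL", "5.1", "Marketing"),
}

added, removed, changed = compare(baseline, latest)
```

In this made-up case the comparison flags `db03:3306` as new, `db02:1433` as gone, and `db01:1521` as changed (a version upgrade), which are the kinds of differences a repeat enumeration is meant to surface.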

Other Posts in Project Quant for Database Security

  1. An Open Metrics Model for Database Security: Project Quant for Databases
  2. Database Security: Process Framework
  3. Database Security: Planning
  4. Database Security: Planning, Part 2
  5. Database Security: Discover and Assess Databases, Apps, Data
  6. Database Security: Patch
  7. Database Security: Configure
  8. Database Security: Restrict Access
  9. Database Security: Shield
  10. Database Security: Database Activity Monitoring
  11. Database Security: Audit
  12. Database Security: Database Activity Blocking
  13. Database Security: Encryption
  14. Database Security: Data Masking
  15. Database Security: Web App Firewalls
  16. Database Security: Configuration Management
  17. Database Security: Patch Management
  18. Database Security: Change Management
  19. DB Quant: Planning Metrics, Part 1
  20. DB Quant: Planning Metrics, Part 2
  21. DB Quant: Planning Metrics, Part 3
  22. DB Quant: Planning Metrics, Part 4

—Adrian Lane

FireStarter: The Only Value/Loss Metric That Matters

By Rich

As some of you know, I’ve always been pretty critical of quantitative risk frameworks for information security, especially the Annualized Loss Expectancy (ALE) model taught in most of the infosec books. It isn’t that I think quantitative is bad, or that qualitative is always materially better, but I’m not a fan of funny math.

Let’s take ALE. The key to the model is that your annual predicted losses are the losses from a single event, times the annual rate of occurrence. This works well for some areas, such as shrinkage and laptop losses, but is worthless for most of information security. Why? Because we don’t have any way to measure the value of information assets.
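
For reference, the arithmetic behind ALE is trivial, which is part of its appeal. A minimal sketch, with laptop-loss figures invented purely for illustration:

```python
def annualized_loss_expectancy(single_loss: float, annual_rate: float) -> float:
    """ALE = loss from a single event times the annual rate of occurrence."""
    if single_loss < 0 or annual_rate < 0:
        raise ValueError("loss and rate must be non-negative")
    return single_loss * annual_rate


# Hypothetical figures: 12 lost laptops a year at $1,500 replacement cost each.
laptop_ale = annualized_loss_expectancy(single_loss=1500.0, annual_rate=12)
```

The multiplication is the easy part. The calculation is only as good as the `single_loss` figure, and for information assets that is the number nobody knows how to measure.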

Oh, sure, there are plenty of models out there that fake their way through this, but I’ve never seen one that is consistent, accurate, and measurable. The closest we get is Lindstrom’s Razor, which states that the value of an asset is at least as great as the cost of the defenses you place around it. (I consider that an implied or assumed value, which may bear no correlation to the real value).

I’m really only asking for one thing out of a valuation/loss model:

The losses predicted by a risk model before an incident should equal, within a reasonable tolerance, those experienced after an incident.

In other words, if you state that X asset has $Y value, when you experience a breach or incident involving X, you should experience $Y + (response costs) losses. I added, “within a reasonable tolerance” since I don’t think we need complete accuracy, but we should at least be in the ballpark. You’ll notice this also means we need a framework, process, and metrics to accurately measure losses after an incident.
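
Stated as a check, the requirement looks something like the sketch below. The 25% tolerance and the dollar figures are placeholders of mine, not a recommendation; what counts as "reasonable" is exactly the part that still needs a framework.

```python
def within_tolerance(
    predicted_value: float,
    response_costs: float,
    actual_loss: float,
    tolerance: float = 0.25,
) -> bool:
    """Check whether measured losses land near $Y + (response costs)."""
    expected = predicted_value + response_costs
    if expected <= 0:
        raise ValueError("expected loss must be positive")
    # Relative error between what the model predicted and what was measured.
    return abs(actual_loss - expected) / expected <= tolerance


# Hypothetical incident: asset valued at $100k, $20k spent on response.
close_call = within_tolerance(100_000, 20_000, actual_loss=130_000)
way_off = within_tolerance(100_000, 20_000, actual_loss=400_000)
```

With these made-up numbers `close_call` is true and `way_off` is false. The check itself is easy to write; it presumes you can actually measure `actual_loss` after the incident, which is the other half of the problem.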

If someone comes into my home and steals my TV, I know how much it costs to replace it. If they take a work of art, maybe there’s an insurance value or similar investment/replacement cost (likely based on what I paid for it). If they steal all my family photos? Priceless – since they are impossible to replace and I can’t put a dollar sign on their personal value. What if they come in and make a copy of my TV, but don’t steal it? Er… Umm… Ugh.

I don’t think this is an unreasonable position, but I have yet to see a risk framework with a value/loss model that meets this basic requirement for information assets.


Friday, May 21, 2010

The Secerno Technology

By Adrian Lane

I ran long on yesterday’s Oracle Buys Secerno, but it is worth diving into Secerno’s technology to understand why this is a good fit for Oracle. I get a lot of questions about the Secerno product, from customers unclear how the technology works. Even other database activity monitoring vendors ask – some because they want to know what the product is really capable of, others who merely want to vent their frustration at me for calling Secerno unique. And make no mistake – Secerno is unique, despite competitor claims to the contrary.

Unlike every other vendor in the market, Secerno analyzes the SQL query construct. They profile valid queries, and accept only queries that have the right structure. This is not content monitoring, not traditional behavioral monitoring, not context monitoring, and not even attribute-based monitoring, but looking at the query language itself.

Consider that any SQL query (e.g., SELECT, INSERT, UPDATE, CREATE, etc.) has dozens of different options, allowing hundreds of variations. You can build very complex logic, including embedding other queries and special characters. Consider an Oracle INSERT operation as an example. The (pseudo) code might look like:

INSERT INTO Table.Column

Or it might look like …

INSERT INTO User.Table.@db_Link ColumnA, ColumnC
VALUE 'XYZ', 'PDQ' | SELECT * FROM SomeSystemTable ...
WHERE 1=1;

We may think of INSERT as a simple statement, but there are variations which are not simple at all. Actually they get quite complex, and enable me to do all sorts of stuff to confuse the query parser into performing operations on my behalf. There are ample opportunities for me to monkey with the WHERE clause, embed logic or reference other objects.

Secerno handles this by mapping every possible SQL query variation for the database platform it is protecting, but depending upon the application, only allows a small subset of known variations to be accepted. Everything else can be blocked. In the examples above, the first would be permitted while the latter blocked. Attackers commonly abuse query syntax to confuse the database query parser into doing something it is not supposed to do. The more obscure uses of the SQL query language are ripe targets for abuse. In essence you remove a lot of the possible attacks because you simply do not allow unacceptable query structures or variations. This is a different way to define acceptable use of the database.
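
To give a feel for structure-based filtering, here is a deliberately crude sketch in Python. It is not Secerno's method: instead of mapping the SQL grammar, it swaps literals for placeholders with regular expressions and compares the remaining "shape" against shapes learned from known-good traffic. The table, queries, and function names are invented for the example.

```python
import re
from typing import Iterable, Set

_STRING = re.compile(r"'(?:[^']|'')*'")      # quoted string literals
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")   # bare numeric literals
_SPACE = re.compile(r"\s+")


def fingerprint(sql: str) -> str:
    """Reduce a query to its shape by replacing literal values with '?'."""
    shape = _STRING.sub("?", sql)
    shape = _NUMBER.sub("?", shape)
    shape = _SPACE.sub(" ", shape).strip().rstrip(";").strip()
    return shape.upper()


def learn(queries: Iterable[str]) -> Set[str]:
    """Build the set of acceptable shapes from observed application queries."""
    return {fingerprint(q) for q in queries}


def is_allowed(sql: str, allowed: Set[str]) -> bool:
    """Accept a query only if its shape was seen during learning."""
    return fingerprint(sql) in allowed


# Hypothetical application traffic used for learning.
allowed = learn(["INSERT INTO orders (customer, total) VALUES ('alice', 10.50)"])

ok = is_allowed("insert into orders (customer, total) values ('bob', 99)", allowed)
blocked = is_allowed(
    "INSERT INTO orders (customer, total) VALUES ('x', 1); DROP TABLE orders",
    allowed,
)
```

The first query differs only in its values, so it passes; the second has a different shape, so it is rejected without anyone writing a rule about DROP TABLE. A product that understands the full grammar can make far finer distinctions than this regex toy, but the principle of judging form rather than content is the same.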

Secerno calls this a “Database Firewall”, which helps the general IT audience quickly get the concept, but I call this technology query White Listing, as it is a bit more accurate. Pick the acceptable queries and their variations, and block everything else. And it can ‘learn’ by looking at what the application sends the database – and if my memory serves me, can even learn appropriate parameters as well. It’s less about context and content, and more about form. Other vendors offer blocking and advertise “Database Firewall” capabilities. Some sit in front of the database like Secerno does, and others reside on the database platform. The real difference is not whether or not they block, but in how they detect what to block.

As with any technology, there are limitations. If Secerno is used to block queries, it can create a performance bottleneck. As with a network firewall, more rules mean more checking. You can quickly build a very detailed rule set that creates a performance problem. You need to balance the number of rules with performance. And just like a firewall or WAF, if your application changes queries on a regular basis, your rule set will need to adapt to avoid breaking the application.

The real question is “Is this technology better?” The answer depends upon usage. For detection of insider misuse, data privacy violation, or hijacked accounts, either stateful inspection or behavioral monitoring will be a better choice. For databases that support a lot of ad hoc activity, content inspection is better. But for web applications, especially those that don’t add/change their database queries very often, this query analysis method is very effective for blocking injection attacks. Over and above the analysis capabilities, the handful of customers I have spoken with deployed the platform very quickly. And from the demos I have seen, the product’s interface is on par with the rest of the DAM providers.

Secerno is not revolutionary and does not offer extraordinary advantages over the competition. It is a good technology and a very good fit for Oracle, because it fills the gaps in their security portfolio. Just keep in mind that each Database Activity Monitoring solution offers a different subset of available analysis techniques, deployment models, and supporting technologies – such as WAF, Assessment and Auditing. And each vendor provides a very different experience – in terms of user interface quality, ease of management, and deployment. DAM is a powerful tool for your arsenal, but you need to consider the whole picture – not just specific analysis techniques.

—Adrian Lane

The Laziest Phisher in the World

By Rich

I seriously got this last night and just had to share. It’s the digital equivalent of sending someone a letter that says, “Hello, this is a robber. Please put all your money in a self addressed stamped envelope and mail it to…”

Dear Valued Member,

Due to the congestion in all Webmail account and removal of all unused
Accounts,we would be shutting down all unused accounts, You will have to
confirm your E-mail by filling out your Login Info below after clicking
the reply botton, or your account will be suspended within 48 hours for
security reasons.

UserName: ..........................................
Date Of Birth: .....................................
Country Or Territory:...............................

After Following the instructions in the sheet,your account will not be
interrupted and will continue as normal.Thanks for your attention to this

We apologize for any inconvinience.

Webmaster Case number: 447045727401
Property: Account Security


Friday Summary: May 21, 2010

By Rich

For a while now I’ve been lamenting the decline in security blogging. In talking with other friends/associates, I learned I wasn’t the only one. So I finally got off my rear and put together a post in an effort to try kickstarting the community. I don’t know if the momentum will last, but it seems to have gotten a few people back on the wagon.

Alan Shimel reports he’s had about a dozen new people join the Security Blogger’s Network since my post (although in that post he only lists the first three, since it’s a couple days old). We’ve also had some old friends jump back into the fray, such as Andy the IT Guy, DanO, LoverVamp, and Martin.

One issue Alan and I talked about on the phone this week is that since Technorati dropped the feature, there’s no good source to see everyone who is linking to you. The old pingbacks system seems broken. If anyone knows of a good site/service, please let us know. Alan and I are also exploring getting something built to better interconnect the SBN. It’s hard to have a good blog war when you have to Tweet at your opponent so they know they’re under attack.

Another issue was highlighted by Ben Tomhave. A lot of people are burnt out, whether due to the economy, their day jobs, or general malaise and disenchantment with the industry. I can’t argue too much with his point, since he’s not the only semi-depressed person in our profession. But depression is a snowballing disorder, and maybe if we can bring back some energy people will get motivated again.

Anyway, I’m psyched to see the community gearing back up. I won’t take it for granted, and who knows if it will last, but I for one really hope we can set the clock back and party like it’s 2007.

On to the Summary:

Webcasts, Podcasts, Outside Writing, and Conferences

Favorite Securosis Posts

Other Securosis Posts

Favorite Outside Posts

Project Quant Posts

Research Reports and Presentations

Top News and Posts

Blog Comment of the Week

Remember, for every comment selected, Securosis makes a $25 donation to Hackers for Charity. This week’s best comment goes to Pablo, in response to How to Survey Data Security Outcomes?

In terms of control effectiveness, I would suggest to incorporate another section aside from ‘number of incidents’ where you question around unknowns and things they sense are all over the place but have not way of knowing/controlling.

I’ll break out my comment in two parts: 1 – “philosophical remarks” and 2 – suggestions on how to implement that in your survey

1 – “philosophical remarks”

If you think about it, effectiveness is the ability to illustrate/detect risks and prevent bad things from happening. So, in theory, we could think of it as a ratio of “bad things understood/detected” over “all existing bad things that are going on or could go on” (by ‘bad things’ I mean sensitive data being sent to wrong places/people, being left unprotected, etc. – with ‘wrong/bad’ being a highly subjective concept)

So in order to have a good measure of effectiveness we need both the ‘numerator’ (which ties to your question on ‘number of incidents’) and also a ‘denominator’

The ‘denominator’ could be hard to get at, because, again, things are highly subjective, and what constitutes ‘sensitive’ changes in the view of not only the security folks, but more importantly, the business. (BTW, I have a slight suggestion on your categories that I include at the bottom of this post)

However, I believe it is important that we get a sense of this ‘denominator’ or at least the perception of this ‘denominator’. My own personal opinion on this, by speaking to select CISOs is they feel things are ‘all over the place’ (i.e., the denominator is quite quite large).

2 – Suggestions on how to implement that in your survey

(We had to cut this quote for space, but they were great, practical suggestions – see the full comment at the original post).


Thursday, May 20, 2010

Quick Wins with DLP Webcast Next Week

By Rich

Next week I will be giving a webcast to complement my Quick Wins with Data Loss Prevention paper. This is a bit different than when I usually talk about DLP – it’s focused on showing immediate value, while also positioning for long term success.

Like the paper it’s sponsored by McAfee. We’re holding it at 11am PT on May 25, and you can register by clicking here.

Here’s the full description:

Quick Wins with DLP – How to Make DLP Work for You
Date: May 25, 2010
Time: 11am PDT / 2pm EDT

When used properly, Data Loss Prevention (DLP) provides rapid identification and assessment of data security issues not available with any other technology. However, when not optimized, two common criticisms of DLP are 1) its complexity and 2) the fear of false positives. Security professionals often worry that DLP is expensive and will fail to deliver the expected value.

A little knowledge and some planning go a long way towards a fast, simple, and effective deployment. By taking some straightforward best practice steps, you can realize significant immediate value and security gains without negatively impacting your productivity or wasting valuable resources.

In this webcast you will learn how to:

  • Establish a flexible incident management process
  • Integrate with major infrastructure components
  • Assess broad information usage
  • Set a foundation for future focused efforts and policy tuning

You will also hear how Continuum Health Partners safeguards highly sensitive patient data with McAfee DLP 9. Join us for this informative presentation.


  • Rich Mogull, Analyst & CEO, Securosis, LLC
  • Mark Moroses, Assistant CIO, Continuum Health Partners
  • John Dasher, Senior Director, Data Protection, McAfee


Privacy Is (Still) Personal

By Rich

I want to respond to something Adam wrote about Facebook over at Emergent Chaos, but first I’m going to excerpt my own article from TidBITS:

Privacy is Personal – In the Information Age, determining what you want others to know about you isn’t always a simple decision. Aside from the potential tradeoffs of avoiding particular features or services, we all have different thresholds for what we are comfortable sharing. It’s also extremely difficult to control our information even when we do make informed decisions, and often impossible to eradicate information that escaped our control before we realized the rules of the game had changed.

For example, I use both Amazon and Netflix, even though those services also collect personal information like my buying and viewing habits. I am trading my data (and money) for a combination of convenience and personalization. I’m less concerned with these services than Facebook since their privacy practices and policies are clearer, my information is compartmentalized within each service, and they have much more consistent and stable records.

On the other hand I have minimized my usage of Google services due to privacy concerns. Google’s reach is incredibly expansive, and despite their addition of Google Dashboard to help show some of what they record, and much clearer policies than Facebook, I’m generally uncomfortable with any single company or government having that much potential information on me. I fully understand this is a somewhat emotional response.

Facebook is building a similar Internet-wide ecosystem as they expand connections to external Web sites and services. In exchange for allowing them access to your information and activities, Facebook enables new kinds of services and personalization. The question each of us must answer is if those new services and personalization options are worth the privacy tradeoff.

Deciding where to draw your own privacy lines is a very personal, complex, and even sometimes arbitrary decision. I trust Amazon and Netflix to a certain extent based on their privacy policies, even though they sometimes make mistakes (I didn’t use Amazon for years after a policy change that they later reversed). Yet I’ve limited my usage of both Google and Facebook due to general concerns (Google) or outright distrust (Facebook).

Facebook, to me, is a tool to keep me connected to friends and family I don’t interact with on a daily basis. I restrict what information it has on me, and always assume anything I do on Facebook could be public. I’m willing to trade a little privacy for the convenience of being able to stay connected with an expanded social circle. I manage Facebook privacy by not using it for anything that’s actually private.

Adam has a lot in his article, and I think his criticisms of my original post come down to:

  1. Your perceptions of your own privacy change within different contexts and over time, so what you are okay with today may not be acceptable tomorrow.
  2. If you only use the service to post things you’d want public anyway, why use it at all?

I completely agree with Adam’s first point – what you share when you are 19 years old at college is very different than what you might want people to know about you once you are 35. Even things you might share at 35 as a member of the workforce might come back to haunt you when you are 55 and running for political office.

But I disagree that this means your only option is to completely opt out of all centralized social media services. I believe we as a society are reaching the point where some degree of social networking is the norm. Even “private” communications like email, IM, and SMS are open to potential disclosure and subsequent inclusion in public search results. The same used to be true of the written and spoken word, but clearly the scale and scope are dramatically larger in the Information Age. We are losing the insular layers that created our current social norms of privacy – which already vary around the world.

The last time society needed to adapt to such changes in privacy was with the Industrial Age and movement from rural to urban society. Before that, it was probably the change from hunter/gatherers to an agrarian society.

I see three possible scenarios that could develop:

  1. Society adopts a combination of laws and social mores to better protect privacy. It will be expected that you own your own data, and in the future retain a right to edit your past. Essentially, we work to protect our current expectations of privacy – which will require active effort, as the terrain has already shifted under us, and will continue to do so.
  2. Social expectations change. You’ll be able to run for political office and no one will care that you called some chick or dude hot and joined the “I love some stupid emo vampire” movement. We gain better abilities to protect our privacy, but at the same time society becomes more accepting of greater personal information being public – partially through sheer boredom at the inanity and popularity of our embarrassing peccadilloes.
  3. There is no privacy.

We have many years before these issues resolve, if ever, and it’s going to be a rough road no matter where we are headed. The end result probably won’t match any of my scenarios, but will instead be some mish-mash of those options and others I haven’t thought of. My rough guess is that society will slowly become more accepting of youthful indiscretions (or we won’t have anyone to hire or elect), but we will also gain more control over our personal information.

Privacy isn’t dead, but it is definitely changing. We all need to make personal decisions about the level of risk we are willing to accept in the midst of changing social norms, government/business influence, and degrees of control.