Pragmatic Data Security: Discover

In the Discovery phase we figure where the heck our sensitive information is, how it’s being used, and how well it’s protected. If performed manually, or with too broad an approach, Discovery can be quite difficult and time consuming. In the pragmatic approach we stick with a very narrow scope and leverage automation for greater efficiency. A mid-sized organization can see immediate benefits in a matter of weeks to months, and usually finish a comprehensive review (including all endpoints) within a year or less. Discover: The Process Before we get into the process, be aware that your job will be infinitely harder if you don’t have a reasonably up to date directory infrastructure. If you can’t figure out your users, groups, and roles, it will be much harder to identify misuse of data or build enforcement policies. Take the time to clean up your directory before you start scanning and filtering for content. Also, the odds are very high that you will find something that requires disciplinary action. Make sure you have a process in place to handle policy violations, and work with HR and Legal before you start finding things that will get someone fired (trust me, those odds are pretty darn high). You have a couple choices for where to start – depending on your goals, you can begin with applications/databases, storage repositories (including endpoints), or the network. If you are dealing with something like PCI, stored data is usually the best place to start, since avoiding unencrypted card numbers on storage is an explicit requirement. For HIPAA, you might want to start on the network since most of the violations in organizations I talk to relate to policy violations over email/web/FTP due to bad business processes. For each area, here’s how you do it: Storage and Endpoints: Unless you have a heck of a lot of bodies, you will need a Data Loss Prevention tool with content discovery capabilities (I mention a few alternatives in the Tools section, but DLP is your best choice). Build a policy based on the content definition you built in the first phase. Remember, stick to a single data/content type to start. Unless you are in a smaller organization and plan on scanning everything, you need to identify your initial target range – typically major repositories or endpoints grouped by business unit. Don’t pick something too broad or you might end up with too many results to do anything with. Also, you’ll need some sort of access to the server – either by installing an agent or through access to a file share. Once you get your first results, tune your policy as needed and start expanding your scope to scan more systems. Network: Again, a DLP tool is your friend here, although unlike with content discovery you have more options to leverage other tools for some sort of basic analysis. They won’t be nearly as effective, and I really suggest using the right tool for the job. Put your network tool in monitoring mode and build a policy to generate alerts using the same data definition we talked about when scanning storage. You might focus on just a few key channels to start – such as email, web, and FTP; with a narrow IP range/subnet if you are in a larger organization. This will give you a good idea of how your data is being used, identify some bad business process (like unencrypted FTP to a partner), and which users or departments are the worst abusers. Based on your initial results you’ll tune your policy as needed. Right now our goal is to figure out where we have problems – we will get to fixing them in a different phase. Applications & Databases: Your goal is to determine which applications and databases have sensitive data, and you have a few different approaches to choose from. This is the part of the process where a manual effort can be somewhat effective, although it’s not as comprehensive as using automated tools. Simply reach out to different business units, especially the application support and database management teams, to create an inventory. Don’t ask them which systems have sensitive data, ask them for an inventory of all systems. The odds are very high your data is stored in places you don’t expect, so to check these systems perform a flat file dump and scan the output with a pattern matching tool. If you have the budget, I suggest using a database discovery tool – preferably one with built in content discovery (there aren’t many on the market, as we’ll mention in the Tools section). Depending on the tool you use, it will either sniff the network for database connections and then identify those systems, or scan based on IP ranges. If the tool includes content discovery, you’ll usually give it some level of administrative access to scan the internal database structures. I just presented a lot of options, but remember we are taking the pragmatic approach. I don’t expect you to try all this at once – pick one area, with a narrow scope, knowing you will expand later. Focus on wherever you think you might have the greatest initial impact, or where you have known problems. I’m not an idealist – some of this is hard work and takes time, but it isn’t an endless process and you will have a positive impact. We aren’t necessarily done once we figure out where the data is – for approved repositories, I really recommend you also re-check their security. Run at least a basic vulnerability scan, and for bigger repositories I recommend a focused penetration test. (Of course, if you already know it’s insecure you probably don’t need to beat the dead horse with another check). Later, in the Secure phase, we’ll need to lock down the approved repositories so it’s important to know which security holes to plug. Discover: Technologies Unlike the Define phase, here we have a plethora of options. I’ll break this into

Read Post

FireStarter: Agile Development and Security

I am a big fan of the Agile project development methodology, especially Agile with Scrum. I love the granularity and focus the approach requires. I love that at any given point in time you are working on the most important feature or function. I love the derivative value of communication and subtle form of peer pressure that Scrum meetings produce. I love that if mistakes are made you do not go too far in the wrong direction, resulting in higher productivity and few software projects that are total disasters. I think Agile is the biggest advancement in code development in the last decade as it addresses issues of complexity, scalability, focus and bureaucratic overhead. But it comes with one huge caveat: Agile hurts secure code development. There, I said it. Someone had to. The Agile process, and even the scrum leadership model, hamstrings development in the area of building secure products. Security is not a freakin’ task card. Logic flaws are not well documented, discreet tasks to be assigned. Project managers (and unfortunately most ScrumMasters) learned security by skimming a ‘For Dummies’ book at Barnes & Noble while waiting for their lattes, but these are the folks making the choices as to what security should make it into the iterations. Just like general IT security, we end up wrapping the Agile process in a security blanket or bolting on security after the code is complete, because the process as we know it is not well suited to secure development. I know there will be several of you out there who saying “Prove it! Show us a study or research evidence that supports your theory.” I can’t. I don’t have meaningful statistical data to back up my claim. But that does not mean it’s not true, and there is anecdotal evidence to support what I am saying. For example: The average Sprint duration of two weeks is simply too short for meaningful security testing. Fuzzing & black box testing are infeasible with nightly builds or pre-release sanity checks. Trust assumptions between code modules or system functions where multiple modules process requests cannot be fully exercised and tested within the Agile timeline. White box testing can be effective, but face it – security assessments don’t fit into neat 4-8 hour windows. In the same way Agile products deviate from design and architecture specifications, they deviate from systemic analysis of trust and code dependancies. It’s a classic forest through the trees problem: efficiency and focus gained by skipping over big picture details necessarily come at the expense of understanding how the system and data are used as a whole. Agile’s great at dividing and conquering what you know, but not so great for dealing with the abstract. Secure code development is not like fixing bugs where you have a stack trace to follow. Secure code development is more about coding principles that lead to better security. In the same way Agile can’t help enforce code ‘style’, it won’t help with secure coding guidelines. (Secure) style verification is an advantage of peer programming and inherent in code reviews, but not intrinsic to Agile. The person on the Scrum team with the least knowledge of security, the Product Manager, prioritizes what gets done. Project managers as a general guideline don’t track security testing, and they are not incented to get security right. They are incented to get the software over the finish line. If they track bugs on the product backlog, they probably have a task card buried somewhere, but don’t understand the threats. Security personnel are chickens in the project and do not gate code acceptance they way they traditionally were able to do in waterfall testing, and may have limited exposure to developers. The fact that major software development organizations are modifying or wrapping Agile with other frameworks to compensate for security is evidence of the difficulties in applying security practices directly. The forms of testing that fit within Agile are more likely to get done. If they don’t fit, they are usually skipped (especially at crunch time), or they have to be scheduled outside the development cycle. It’s not just that the granular focus on tasks makes it harder to verify security at the code and system levels. It’s not just that the features are the focus, or that the wrong person is making security decisions. It’s not just that the quick turnaround in code production precludes some forms of testing known to be effective at identifying security issues. It’s not just that it’s hard to bucket security into discreet tasks. It’s all that and more. We’re not going to see a study that compares Waterfall with Agile for security benefits. Putting together similar development teams to create similar products under two development methodologies to prove this point is not practical. I have run Agile and Waterfall projects of a similar nature in parallel, and while Agile had overwhelming advantages in a number of areas, security was not one of them. If you are moving to Agile, great – but you will need to evolve your Agile process to accomodate security. What do you think? How have you successfully integrated secure coding practices with Agile? This is a FireStarter, so discuss in the comments. Share:

Read Post

You Have to Buy Data Security Tools

When Mike was reviewing the latest Pragmatic Data Security post he nailed me on being too apologetic for telling people they need to spend money on data-security specific tools. (The line isn’t in the published post). Just so you don’t think Mike treats me any nicer in private than he does in public, here’s what he said: Don’t apologize for the fact that data discovery needs tools. It is what it is. They can be like almost everyone else and do nothing, or they can get some tools to do the job. Now helping to determine which tools they need (which you do later in the post) is a good thing. I just don’t like the apologetic tone. As someone who is often a proponent for tools that aren’t in the typical security arsenal, I’ve found myself apologizing for telling people to spend money. Partially, it’s because it isn’t my money… and I think analysts all too often forget that real people have budget constraints. Partially it’s because certain users complain or look at me like I’m an idiot for recommending something like DLP. I have a new answer next time someone asks me if there’s a free tool to replace whatever data security tool I recommend: Did you build your own Linux box running ipfw to protect your network, or did you buy a firewall? The important part is that I only recommend these purchases when they will provide you with clear value in terms of improving your security over alternatives. Yep, this is going to stay a tough sell until some regulation or PCI-like standard requires them. Thus I’m saying, here and now, that if you need to protect data you likely need DLP (the real thing, not merely a feature of some other product) and Database Activity Monitoring. I haven’t found any reasonable alternatives that provide the same value. There. I said it. No more apologies – if you have the need, spend the money. Just make sure you really have the need, and the tool you are looking at really delivers the value, since not all solutions are created equal. Share:

Read Post

Network Security Fundamentals: Monitor Everything

As we continue on our journey through the fundamentals of network security, the idea of network monitoring must be integral to any discussion. Why? Because we don’t know where the next attack is coming, so we need to get better at compressing the window between successful attack and detection, which then drives remediation activities. It’s a concept I coined back at Security Incite in 2006 called React Faster, which Rich subsequently improved upon by advocating Reacting Faster and Better. React Faster (and better) I’ve written extensively on the concept of React Faster, so here’s a quick description I penned back in 2008 as part of an analysis of Security Management Platforms, which hits the nail on the head. New attacks are happening at a fast and furious pace. It is a fool’s errand to spend time trying to anticipate where the issues are. REACT FASTER first acknowledges that all attacks cannot be stopped. Thus, focus remains on understanding typical traffic and application usage trends and monitoring for anomalous behavior, which could indicate an attack. By focusing on detecting attacks earlier and minimizing damage, security professionals both streamline their activities and improve their effectiveness. Rich’s corollary made the point that it’s not enough to just react faster, but you need to have a plan for how to react: Don’t just react – have a response plan with specific steps you don’t jump over until they’re complete. Take the most critical thing first, fix it, move to the next, and so on until you’re done. Evaluate, prioritize, contain, fix, and clean. So monitoring done well compresses the time between compromise and detection, and also accelerates root cause analysis to determine what the response should involve. Network Security Data Sources It’s hard to argue with the concept of reacting faster and collecting data to facilitate that activity. But with an infinite amount of data to collect, where do we start? What do we collect? How much of it? For how long? All of these are reasonable questions that need answers as you construct your network monitoring strategy. The major data sources from your network security infrastructure include: Firewall: Every monitoring strategy needs to correspond to the most prevalent attack vectors, and that means from the outside in. Yes, the insider threat is real, but script kiddies are alive and well and that means we need to start by looking at our Internet-facing devices. First we pull log and activity information from our firewalls and UTM devices on the perimeter. We look for strange patterns, which usually indicate something is wrong. We want to keep this data long enough to ensure we have sufficient data in the event of a well-executed low and slow attack, which means months rather than days. IPS: The next layer in tends to be IPS, looking for patterns of traffic that indicate a known attack. We want the alerts first and foremost. But we also want to collect the raw IPS logs as well. Just because the IPS doesn’t think specific traffic is an attack doesn’t mean it isn’t. It could be a dreaded 0-day, so we want to pull all the data we can off this box as well, since the forensic analysis can pinpoint when attacks first surfaced and also provide guidance as to the extent of the compromise. Vulnerability scans: Are those devices vulnerable to a specific attack? Vulnerability scan data is one of the key inputs to SIEM/correlation products. The best way to reduce false positives is not to fire an alert if the target is not vulnerable. Thus we keep scan data on hand, and use it both for real-time analysis and also forensics. If an attack happens during a window of vulnerability (like while you debate the merits of a certain patch with the ops guys), you need to know that. Network Flow Data: I’ve always been a big fan of network flow analysis and continue to be mystified that market never took off, given the usefulness of understanding how traffic flows within and out of a network. All is not lost, since a number of security management products use flow data in their analyses and a few lower end management products use flow data as well. Each flow record is small, so there is no reason not to keep a lot of it. Again, we use this data to both pinpoint potential badness, and also replay attacks to understand how they spread within the organization. Device Change Logs: If your network devices get compromised, it’s pretty much game over. Traffic can be redirected, logging suppressed, and lots of other badness can result. So keep track of device configuration and more importantly when those changes happen – which helps isolate the root causes of breaches. Yes, if the logs are turned off, you lose visibility, which can itself indicate an issue. Through the wonders of SNMP, you should collect data from all your routers, switches, and other pipes. Content security: Now we can climb the stack a bit to pull information off the content security gateways, since a lot of attacks still show up via phishing emails and malware-laden web links. Again, we aren’t trying to pull this data in necessarily to stop an attack (hopefully the anti-spam box will figure out you aren’t interested in the little blue pill), but rather to gather more information about the attack vectors and how an attack proliferates through your environment. Reacting faster is about learning all we can about what is compromised and responding in the most efficient and effective manner. Keeping things focused and pragmatic, you’d like to gather all this data all the time across all the networks. Of course, Uncle Reality comes to visit and understandably, collection of everything everywhere isn’t an option. So how do you prioritize? The objective is to use the data you already have. Most organizations have all of the devices listed above. So all the data sources exist, and should be prioritized based on importance to the

Read Post

Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
    Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factors into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
    Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
    For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
    Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.