Off Topic: Every Time You Buy A Ringtone A Kitten Dies

My title, but must-read content at Daring Fireball. Remember, the media companies are trying to condition you into paying more for every piece of content you use. More money for every device, every viewing, every time you make a mix tape with those perfect songs to bring back your lost love. Heck, the RIAA is actively petitioning to pay artists less (if at all) for ringtones and other uses of the artists’ content, so it’s not like your favorite drug-deprived musician is missing out on getting a fix when you buy these things. Just say no, people. Fair use lets you rip your own songs and convert them. Exercise your rights or lose them. Now back to your regularly scheduled security paranoia…


The Pink Taco Claims Another Victim

Richard Stiennon was in town last night, so I took him out for everyone’s favorite local Mexican food. No, it’s not obscene, it’s a normal place with an amusing name. I just like dragging all my security friends there so they have to turn in expense reports with “Pink Taco” on them. Don’t worry Greg, we’ll get you a hat…


Understanding and Selecting a DLP Solution: Part 2, Content Awareness

Welcome to Part 2 of our series on helping you better understand Data Loss Prevention solutions. In Part 1 I gave an overview of DLP, and based on follow-up questions it’s clear that one of the most confusing aspects of DLP is content awareness.

Content awareness is a high-level term I use to describe the ability of a product to look into, and understand, content. A product is considered content aware if it uses one or more content analysis techniques. Today we’ll look at these different analysis techniques, how effective they may or may not be, and what kinds of data they work best with.

First we need to separate content from context. It’s easiest to think of content as a letter, and context as the envelope and environment around it. Context includes things like source, destination, size, recipients, sender, header information, metadata, time, format, and anything else aside from the content of the letter itself. Context is highly useful, and any DLP solution should include contextual analysis as part of the overall solution. But context alone isn’t sufficient.

One early data protection solution could track files based on which server they came from, where they were going, and what actions users attempted on the file. While it could stop a file from a server designated “sensitive” from being emailed out from a machine with the data protection software installed, it would miss untracked versions of the file, movement from systems without the software installed, and a whole host of other routes that weren’t even necessarily malicious. This product lacked content awareness, and its utility for protecting data was limited (it has since added content awareness, one reason I won’t name the product).

The advantage of content awareness is that while we use context, we’re not restricted by it. If I want to protect a piece of sensitive data I want to protect it everywhere- not only when it’s in a flagged envelope. I care about protecting the data, not the envelope, so it makes a lot more sense to open the letter, read it, and then decide how to treat it. Of course that’s a lot harder and more time consuming. That’s why content awareness is the single most important piece of technology in a DLP solution. Opening an envelope and reading a letter is a lot slower than just reading the label- assuming you can even understand the handwriting and language.

The first step in content analysis is capturing the envelope and opening it. I’ll skip the capturing part for now- we’ll talk about that later- and assume we can get the envelope to the content analysis engine. The engine then needs to parse the context (we’ll need that for the analysis) and dig into the content. For a plain text email this is easy, but when you want to look inside binary files it gets a little more complicated. All DLP solutions solve this using file cracking: the technology used to read and understand the file, even if the content is buried multiple levels down. For example, it’s not unusual for the file cracker to read an Excel spreadsheet embedded in a Word file that’s zipped. The product needs to unzip the file, read the Word doc, analyze it, find the Excel data, read that, and analyze it. Other situations get far more complex, like a .pdf embedded in a CAD file. Many of the products on the market today support around 300 file types, embedded content, multiple languages, double-byte character sets (for Asian languages), and can pull plain text from unidentified file types.
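To make file cracking concrete, here’s a minimal sketch in Python of the recursive approach described above- detect a container by its magic bytes, open it, and recurse into whatever is inside until you hit something analyzable. The zip-only handling and the function name are my illustrative assumptions; real products dispatch on hundreds of formats (OLE, PDF, CAD, and so on), not just zip-based ones.

```python
import io
import zipfile

def extract_text(data: bytes, depth: int = 0, max_depth: int = 5) -> list[bytes]:
    """Recursively 'crack' a blob: if it's a container we recognize, open it
    and recurse into each member; otherwise hand the raw bytes back for
    content analysis."""
    if depth > max_depth:
        return []  # guard against zip bombs and runaway nesting
    # Zip-based formats (.zip, plus .docx/.xlsx Office files) start with 'PK'
    if data[:2] == b"PK":
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                texts = []
                for name in archive.namelist():
                    texts.extend(extract_text(archive.read(name), depth + 1, max_depth))
                return texts
        except zipfile.BadZipFile:
            pass  # 'PK' prefix but not a real archive- treat as plain content
    return [data]
```

The depth guard matters in practice: without it, a maliciously nested archive could stall the analysis engine before any policy ever runs.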
Quite a few use the Autonomy or Verity content engines to help with file cracking, but all the serious tools have their own set of proprietary capabilities around that. Some tools can support analysis of encrypted data if they have the recovery keys for enterprise encryption, and most can identify standard encryption and use that as a contextual rule to block/quarantine content.

Rather than just talking about how hard this is and seeing how far I can drag out an analogy, let’s jump in and look at the different content analysis techniques used today:

1. Rules-Based/Regular Expressions: This is the most common analysis technique, available in both DLP products and other tools with DLP-like features. It analyzes the content for specific rules- such as 16-digit numbers that meet credit card checksum requirements, medical billing codes, and other textual patterns. Most DLP solutions enhance basic regular expressions with their own additional analysis rules (e.g. a name in proximity to an address near a credit card number). A minimal sketch of this technique appears after the list.

What content it’s best for: As a first-pass filter, or for easily identified pieces of structured data like credit card numbers, social security numbers, and healthcare codes/records.

Strengths: Rules process quickly and can easily be configured. Most products ship with initial rule sets. The technology is well understood and easy to incorporate into a variety of products.

Weaknesses: Prone to high false positive rates. Offers very little protection for unstructured content like sensitive intellectual property.

2. Database Fingerprinting: Sometimes called Exact Data Matching. This technique takes either a database dump or live data (via an ODBC connection) from a database and looks only for exact matches. For example, you could generate a policy to look only for credit card numbers in your customer base, thus ignoring your own employees buying online. More advanced tools look for combinations of information, such as the magic combination of first name or initial, with last name, with credit card or social security number, that triggers a California SB 1386 disclosure. Make sure you understand the performance and security implications of nightly extractions vs. live database connections. (This technique is also sketched after the list.)

What content it’s best for: Structured data from databases.

Strengths: Very low false positives (close to 0). Allows you to protect customer/sensitive data while ignoring other, similar data used by employees (like their personal credit cards for online orders).

Weaknesses: Nightly dumps won’t contain transaction data added since the last extraction. Live connections can affect database performance. Large databases will affect product performance.

3. Exact File Matching: With this technique you take a hash of a file and monitor for any files that match that exact fingerprint.
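To ground technique #1, here’s a minimal sketch of rules-based analysis- a regular expression that finds 16-digit numbers, backed by the Luhn checksum that valid credit card numbers satisfy. The pattern and function names are my own illustrative choices, not any product’s rule set; shipping products layer many more rules (proximity to names and addresses, for instance) on top of this kind of check.

```python
import re

CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")  # 16 digits, optional separators

def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, d in enumerate(int(c) for c in reversed(number)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str) -> list[str]:
    """First-pass filter: regex match, then checksum to cut false positives."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

Note the weakness called out above: even with the checksum, this flags every valid card number it sees, including an employee’s personal card- exactly the false positive problem database fingerprinting exists to solve.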
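Database fingerprinting (technique #2) can be sketched the same way, under the assumption- mine, for illustration- that the tool stores one-way hashes of values pulled from the customer database and checks regex candidates against that set, so only your customers’ numbers trigger the policy:

```python
import hashlib

def fingerprint(value: str) -> str:
    # One-way hash, so extracted card numbers never sit in the policy store as plaintext
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

# Built from the nightly dump or live ODBC pull described above
# (the value here is a hypothetical example)
customer_cards = {fingerprint(card) for card in ["4111111111111111"]}

def is_customer_card(candidate: str) -> bool:
    """Exact match only: an employee's personal card won't be in the set."""
    return fingerprint(candidate) in customer_cards
```

Combination rules (first name plus last name plus card number, for instance) work the same way, fingerprinting tuples of column values rather than single fields.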
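And exact file matching (technique #3) is the simplest of the three to sketch: hash each protected file once, then compare the hash of any file observed in motion. The file name below is a hypothetical placeholder.

```python
import hashlib
from pathlib import Path

def file_hash(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

# Register the files to protect (hypothetical example)
protected = {file_hash(Path("customer_list.xlsx"))}

def is_protected(path: Path) -> bool:
    # Matches only byte-identical copies; changing a single byte changes the hash
    return file_hash(path) in protected
```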


Article Published On TidBITS

Just a quick note that I just published an article over at TidBITS called The Ghost in My FileVault. It’s a tale of terror from a recent trip to Asia. Here’s an excerpt: All men have fears. Many fear those physical threats wired into our souls through millions of years of surviving this harsh world. Fears of heights, confinement, venomous creatures, darkness, or even the ultimate fear of becoming prey can paralyze the strongest and bravest of our civilization. These are not my fears. I climb, crawl, jump, battle, and explore this world; secure in my own skills. My fears are not earthly fears. My fears are not those of the natural world. This is a story of confronting my greatest terror, living to tell the tale, and wondering if the threat is really over. The tale starts, as they always do, on a dark and stormy night. If you don’t know TidBITS, and you use a Mac, you should go flog yourself. Go sign up for the weekly newsletter. (Here’s a quick link to my tutorial on using FileVault)


Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
    Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factor into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast, or to make a point (which is very, very rare).
    Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, at the discretion of the primary analyst. Updated research will be dated and given a version number.
    For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
    Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.