Big Data Holdup?

Computerworld UK ran an interesting article on how Deutsche Bank and HMRC are struggling to integrate Hadoop systems with legacy infrastructure. This is a very real problem for very large enterprises with significant investments in mainframes, Teradata, Grids, MPP, EDW, whatever. From the post:

Zhiwei Jiang, global head of accounting and finance IT at Deutsche Bank, was speaking this week at a Cloudera roundtable discussion on big data. He said that the bank has embarked on a project to analyse large amounts of unstructured data, but is yet to understand how to make the Hadoop system work with legacy IBM mainframes and Oracle databases. “We have been working with Cloudera since the beginning of last year, where for the next two years I am on a mission to collect as much data as possible into a data reservoir,” said Jiang.

I want to make two points. First, I don’t think this particular issue applies to most corporate IT. In fact, from my perspective, there is no holdup with large corporations jumping into big data. Most are already there. Why? Because marketing organizations have credit cards. They hire a data architect, spin up a cloud instance, and are off and running. Call it Rogue IT, but it’s working for them. They’re getting good results. They are performing analytics on data that was previously cost-prohibitive, and it’s making them better. They are not waiting around for corporate IT and governance to decide where data can go and who will enforce policies. Just like BYOD, they are moving forward, and they’ll ask forgiveness later.

Second, as for very large corporations integrating the old and the new, it’s smart to look to leverage existing data sets. To the firms referenced in the article, if analytic system integration is a requirement, this is a very real problem. Integration, or at the very least sharing data, is not an easy technical problem.

That said, my personal take on slowing adoption in order to integrate with legacy systems, unless you have compliance or governance constraints, is “Don’t do it.” If it’s purely a desire to leverage existing multi-million dollar investments, it may not be cost effective to do so. Commodity computing resources are incredibly cheap, and the software is virtually free. Copy the data and move on. Leveraging existing infrastructure is great, but it will likely save money to move data into NoSQL clusters, and extend capabilities on these newer platforms. But compliance, security and corporate governance of these systems – and the data they will house – is not well understood. Worse, extending security and corporate governance may not be feasible on most NoSQL platforms.

Quantify Me: Friday Summary: February 15, 2013

Rich here. There are very few aspects of my life I don’t track, tag, analyze, and test. You could say I’m part of the “Quantified Self” movement if it weren’t for the fact that the only movement I like to participate in involves sitting down, usually with a magazine or newspaper.

I track all my movements during the day with a Jawbone Up (when it isn’t broken). I track my workouts with a Garmin 910XT, which looks like a watch designed by a Russian gangster, but is really a fitness computer that collects my heart rate, GPS coordinates, foot-pod accelerometer data, and bike data; it can even tell me which swimming stroke I am using, for how long, and how far in my feeble attempts to avoid drowning. My bike trainer uses a Kurt Kinetic InRide power meter for those days my heart rate is lying to me about how hard I’m pushing. I track my sleep with a Zeo, test my blood with WellnessFX, and screen my genes with 23andMe.

I correlate most of my fitness data in TrainingPeaks, which uses math and data to track my fitness level and overall training stress, and optimize my workouts, whichever data collection device du jour I have with me. My swim coach (when I use him) uses video and an endless pool to slowly move me from “avoiding drowning in a forward direction” to “something that almost resembles swimming”. My bike is custom fit based on video, my ride style, and power output and balance measurements; the next one will probably be calibrated from computerized real-time analysis and those dot trackers used for motion capture films. Every morning I track my weight with a WiFi enabled scale that automatically connects to TrainingPeaks to track trends. I can access nearly all this data from my phone, and I am probably forgetting things.

Some days I wonder if this all makes a difference, especially when I think back to my hand-written running and lifting logs, and the early days using a basic heart rate monitor with no data recording. Or the earlier days when I’d just run for running’s sake, without so much as headphones on.

But when I sit back and crunch the numbers, I do find tidbits that affect the quality of my life and training.

I have learned that I tend to average three deep sleep cycles a night, but one is usually between 6 and 8 am, which is when I almost always wake up. Days I sleep in a bit and get that extra cycle correlate with a significant upswing in how well I feel, and my work productivity. When the kids are older I will most definitely adjust my schedule – getting that sleep even 1-2 days a week makes a big difference. I am somewhat biphasic, and if I’m up in the middle of the night for an hour or so I still feel good if I get that morning rest. With a new baby coming, I will really get to test this out.

I am naturally a sprinter. I knew this based on my athletic history, but genetics confirms it. I was insanely fast when I competed in martial arts, but always had stamina issues (keep the jokes to yourself). As I have moved into endurance sports this has been a challenge, but I can now tune my training to hit specific goals with great success and very little wasted effort.

I have learned that although I can take a ton of high-intensity training punishment, if I am otherwise stressed in life at the same time I get particular complications.

I am in the midst of tweaking my diet to fit my lifestyle and health goals. I have a genetic disposition to heart disease, and my numbers prove it, but I have managed to make major strides through diet. Without being able to make these changes and then test the results, I would be flying blind. I’m learning exactly what works for me. This helped me lose 10 pounds in less than a month with only minimal diet changes, for example, and drop my cholesterol by 40 points.

Not all of the data I collect is overly useful. I’m still seeing where steps-per-day fit in, but I think that is more a daily motivator to keep me moving. The genetics testing with 23andMe was interesting, but we’ll see whether it affects any future health decisions. Perhaps if I need to go on statins someday, since I don’t carry a genetic sensitivity that can really cause problems.

It’s obsessive (but not as obsessive as my friend Chris Hoff), but it does provide incredible control over my own health. Life is complex, and no single diet or fitness regimen works the same for everyone. From how I work out, to how I sleep, to what I eat, I am learning insanely valuable lessons that I then get to test and validate. I can’t emphasize enough how much more effective this is than the guesswork I had to live with before these tools became available.

I plan on living a long time, and being insanely active until the bitter end. I’m in my 40s, and can no longer do whatever I want and rely on youth to clean up my mistakes.

Data is awesome. Measure, analyze, correct, repeat. Without that cycle you are flying in the dark, and this is as true for security (or anything else, really) as it is for health.

On to the Summary:

Webcasts, Podcasts, Outside Writing, and Conferences

  • Rich’s password rant at Macworld.

Favorite Securosis Posts

  • Mike Rothman: RSA Conference Guide 2013: Cloud Security. Rich did a good job highlighting one of the major hype engines we’ll see at the RSA Conference. And he got to write SECaaS. Win!
  • Adrian Lane: LinkedIn Endorsements Are Social Engineering. As LinkedIn looks desperately for ways to be more than just contact management, Rich nails the latest attempt.
  • David Mortman: Directly Asking the Security Data.
  • Rich: The Increasing Irrelevance of Vulnerability Disclosure. Yep.

Other Securosis Posts

  • RSA Conference Guide 2013: Application Security.
  • I’m losing track – is this

I’m losing track—is this ANOTHER Adobe 0-day?

As reported on Tom’s Guide, FireEye reports they have discovered a PDF 0-Day that is currently being exploited in the wild:

According to the report, this exploit drops two DLLs upon successful exploitation, one of which displays a fake error message and opens a decoy PDF document. The second DLL drops the callback component which talks to a remote domain. “We have already submitted the sample to the Adobe security team,” the firm stated on Wednesday in this blog. “Before we get confirmation from Adobe and a mitigation plan is available, we suggest that you not open any unknown PDF files. We will continue our research and continue to share more information.”

And note that this is not just a Windows issue – Linux and OS X versions are also susceptible. So avoid opening unknown PDF files – that is the recommended work-around – while you wait for a patch. No kidding! Personally I just disabled Adobe Reader on my machine and I’ll consider re-enabling it at some point in the future. Some of you don’t have this option, so use caution.

RSA Conference Guide 2013: Application Security

So what hot trends in application security will you see at the RSA Conference? Mostly the same as last year’s trends, as lots of things are changing in security, but not much on the appsec front. Application security is a bit like security seasoning: Companies add a sprinkle of threat modeling here, a dash of static analysis there, marinate for a bit with some dynamic app testing (DAST), and serve it all up on a bed of WAF. The good news is that we see some growth in security adoption in every phase of application development (design, implementation, testing, deployment, developer education), with the biggest gains in WAF and DAST. Additionally, according to many studies – including the SANS application security practices survey – more than two-thirds of software development teams have an application security program in place.

The Big Money Game

With WhiteHat Security closing a $31M funding round, and Veracode racking up $30M themselves in 2012, there won’t be any shortage of RSA Conference party dollars for application security. Neither of these companies is early stage, and the amount of capital raised indicates they need fuel to accelerate expansion. In all seriousness, the investment sharks smell the chum and are making their kills. When markets start to get hot you typically see companies in adjacent markets reposition and extend into the hot areas. That means you should expect to see new players, expanded offerings from old players, and (as in all these RSA Guide sections) no lack of marketing to fan the hype flames (or at least smoke).

But before you jump in, understand the differences and what you really need from these services. The structure of your development and security teams, the kinds of applications you work with, your development workflow, and even your reliance on external developers will all impact what direction you head in. Then, when you start talking to company reps on the show floor, dig into their methodology, technology, and the actual people they use behind any automated tools to reduce false positives. See if you can get a complete sample assessment report, from a real scan; preferably provided by a real user, because that gives you a much better sense of what you can expect. And don’t forget to get your invite to the party.

Risk(ish) Quantification(y)

One of the new developments in the field of application security is trying out new metrics to better resonate with the keymasters of the moneybags. Application security vendors pump out a report saying your new code still has security bugs and you’re sitting on a mountain of “technical debt”, which basically quantifies how much crappy old code you don’t have time or resources to fix. Vendors know that Deming’s principles, the threat of a data breach, compliance requirements, and rampant fraud have not been enough to whip companies into action. The conversation has shifted to Technical Debt, Cyber Insurance, Factor Analysis of Information Risk (FAIR), the Zombie Apocalypse and navel gazing at how well we report breach statistics. The common thread through all these is providing a basis to quantify and evaluate risk/reward tradeoffs in application security.

Of course it’s not just vendors – security and development teams also use this approach to get management buy-in and better resource allocation for security. The application security industry as a whole is trying to get smarter and more effective in how it communicates (and basically sells) the application security problem. Companies are not just buying application security technologies ad hoc – they are looking to more effectively apply limited resources to the problem. Sure, you will continue to hear the same statistics and all about the urgency of fixing the same OWASP Top 10 threats, but the conversation has changed from “The End is Nigh” to “Risk Adjusted Application Security”. That’s a positive development.

(Please Don’t Ask Us About) API Security

Just like last year, people are starting to talk about “Big Data Security,” which really means securing a NoSQL cluster against attack. What they are not talking about is securing the applications sitting in front of the big data cluster. That could be Ruby, Java, JSON, Node.js, or any one of the other languages and platforms used to front big data. Perhaps you have heard that Java had a couple of security holes. Don’t think for a minute these other platforms are going to be more secure than Java. And as application development steams merrily on, each project leveraging new tools to make coding faster and easier, little (okay – no) regard is being paid to the security of these platforms. Adoption of RESTful APIs makes integration faster and easier, but unless carefully implemented they pose serious security risks. Re-architecture and re-design efforts to make applications more secure are an anomaly, not a trend.

This is a serious problem that won’t have big hype behind it at RSA because there is no product to solve this issue. We all know how hard it is to burn booth real estate on things that don’t end up on a PO. So you’ll hear how insecure Platform X is, and be pushed to buy an anti-malware/anti-virus solution to detect the attack once your application has been hacked. So much for “building security in”.

And don’t forget to register for the Disaster Recovery Breakfast if you’ll be at the show on Thursday morning. Where else can you kick your hangover, start a new one, and talk shop with good folks in a hype-free zone? Nowhere, so make sure you join us…

Don’t Bring BS to a Data Fight

Thanks to a heads-up from our Frozen Tundra correspondent, Jamie Arlen, I got to read this really awesome response by Elon Musk of Tesla refuting the findings of a NYT car reviewer, A Most Peculiar Test Drive. From Musk’s post:

After a negative experience several years ago with Top Gear, a popular automotive show, where they pretended that our car ran out of energy and had to be pushed back to the garage, we always carefully data log media drives. While the vast majority of journalists are honest, some believe the facts shouldn’t get in the way of a salacious story. The logs show again that our Model S never had a chance with John Broder.

Logs? Oh crap. You think the reviewer realized Tesla would be logging everything? Uh, probably not. Then Musk goes through all the negative claims and pretty much shows the reviewer to be either not very bright (to drive past a charging station when the car clearly said it needed a charge) or deliberately trying to prove his point, regardless of the facts.

I should probably just use Jamie’s words, as they are much better than mine. So courtesy of Jamie Arlen:

It’s one of those William Gibson moments. You know, where “the future is here, it’s just not evenly distributed yet.” As more “things in the world” get smart and connected, Moore’s Law type interactions occur. The technology necessary to keep a Tesla car running and optimized requires significant monitoring and logging of all control systems, which has an unpleasant side effect for the reviewer.

The kicker (for me) in all of this is the example that the NYT writer makes of himself: Sorry dude, the nerds have in fact inherited the earth – if you want to play a game with someone who excels in the world of high-performance cars and orbital launch systems simultaneously, you need to be at least as smart as your opponent. Mr. Broder – you’ve cast yourself as Vizzini and yes, Elon does make a dashing Dread Pirate Roberts.

Vizzini. Well played, Mr. Arlen. Well played.

But Jamie’s point is right on the money – these sophisticated vehicle control systems may be intended to make sure the systems are running as they should. But clearly a lot can be done with the data after something happens. How about placing a car at the scene of a crime? Yeah, the possibilities are endless, but I’ll leave those discussions to Captain Privacy. I’m just happy data won over opinion in this case.

UPDATE: It looks like we will get to have a little he said/she said drama here, as Rebecca Greenfield tells Broder’s side of the story in this Atlantic Wire post. As you can imagine, the truth probably is somewhere in the middle.

Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
  • Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments and input factor into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
  • Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
  • For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
  • Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.