Securosis

Research

Event-Driven AWS Security: A Practical Example

Would you like the ability to revert unapproved security group (firewall) changes in Amazon Web Services in 10 seconds, without external tools? That’s about 10-20 minutes faster than is typically possible with a SIEM or other external tools. If that got your attention, then read on… If you follow me on Twitter, you might have noticed I went a bit nuts when Amazon Web Services announced their new CloudWatch events a couple weeks ago. I saw them as an incredibly powerful too for event driven security. I will post about the underlying concepts tomorrow, but right now I think it’s better to just show you how it works first. This entire thing took about 4 hours to put together, and it was my first time writing a Lambda function or using Python in 10 years. This example configures an AWS account to automatically revert any Security Group (firewall) changes without human interaction, using nothing but native AWS capabilities. No security tools, no servers, nada. Just wiring together things already built into AWS. In my limited testing it’s effective in 10 seconds or less, and it’s only 100 lines of code – including comments. Yes, this post is much longer than the code to make it all work. I will walk you through setting it up manually, but in production you would want to automate this configuration so you can manage it across multiple AWS accounts. That’s what we use Trinity for, and I’ll talk more about automating automation at the end of the post. Also, this is Amazon specific because no other providers yet expose the needed capabilities. For background it might help to read the AWS CloudWatch events launch post. The short version is that you can instrument a large portion of AWS, and trigger actions based on a wide set of very granular events. Yes, this is an example of the kind of research we are focusing on as part of our cloud pivot. This might look long, but if you follow my instructions you can set it all up in 10-15 minutes. Tops. Prep Work: Turn on CloudTrail If you use AWS you should have CloudTrail set up already; if not you need to activate it and feed the logs to CloudWatch using these instructions. This only takes a minute or two if you accept all the defaults. Step 1: Configure IAM To make life easier I put all my code up on the Securosis public GitHub repository. You don’t need to pull that code – you will copy and paste everything into your AWS console. Your first step is to configure an IAM policy for your workflow; then create a role that Lambda can assume when running the code. Lambda is an AWS service that allows you to store and run code based on triggers. Lambda code runs in a container, but doesn’t require you to manage containers or servers for it. You load the code, and then it executes when triggered. You can build entirely serverless architectures with Lambda, which is useful if you want to eliminate most of your attack surface, but that’s a discussion for another day. IAM in Amazon Web Services is how you manage who can do what in your account, including the capabilities of Amazon services themselves. It is ridiculously granular and powerful, and so the most critical security tool for protecting AWS accounts. Log into the AWS console. Got to the Identity and Access Management (IAM) dashboard. Click on Policies, then Create Policy. Choose Create Your Own Policy. Name it lambda_revert_security_group. Enter a description, then copy and paste my policy from GitHub. My policy allows the Lambda function to access CloudWatch logs, write to the log, view security group information, and revoke ingress or egress statements (but not create new ones). Damn, I love granular policies! Once the policy is set you need to Create New Role. This is the role which the Lambda function will assume when it runs. Name it lambda_revert_security_group, assign it an AWS Lambda role type, then attach the lambda_revert_security_group policy you just created. That’s it for the IAM changes. Next you need to set up the Lambda function and the CloudWatch event. Step 2: Create the Lambda function First make sure you know which AWS region you are working in. I prefer us-west-2 (Oregon) for lab work because it is up to date and tends to support new capabilities early. us-east-1 is the granddaddy of regions, but my lab account has so much cruft after 6+ years that things don’t always work right for me there. Go to Lambda (under Compute on the main services page) and Create a Lambda function. Don’t pick a blueprint – hit the Skip button to advance to the next page. Name your function revertSecurityGroup. Enter a description, and pick Python for the runtime. Then paste my code into the main window. After that pick the lambda_revert_security_group IAM role the function will use. Then click Next, then Create function. A few points on Lambda. You aren’t billed until the function triggers; then you are billed per request and runtime. Lambda is very good for quick tasks, but it does have a timeout (I think an hour these days), and the longer you run a function the less sense it makes compared to a dedicated server. I actually looked at migrating Trinity to Lambda because we could offload our workflows, but at that time it had a 5-minute timeout, and running hour-long workflows at scale would likely have killed us financially. Now some notes on my code. The main function handler includes a bunch of conditional statements you can use to only trigger reverting security group changes based on things like who requested the change, which security group was changed, whether the security group is in a specified VPC, and whehter the security group has a particular tag. None of those lines will work for you, because they refer to specific identifiers in my account – you need to change them to work in your account. By default, the function will revert any security group change in your account. You need to cut and paste the line “revert_security_group(event)” into a conditional block to run only on matching conditions. The function only works for inbound rule changes. It

Share:
Read Post

Securing Hadoop: Architectural Security Issues

Now that we have sketched out the elements a Hadoop cluster, and what one looks like, let’s talk threats to the databases. We want to consider both the database infrastructure itself, as well as the data under management. Given the complexity of a Hadoop cluster, the task is closer to securing an entire data center than a typical relational database. All the features that provide flexibility, scalability, performance, and openness, create specific security challenges. The following are some specific threats to clustered databases. Data access & ownership: Role-based access is central to most database security schemes, and NoSQL is no different. Relational and quasi-relational platforms include roles, groups, schemas, label security, and various other facilities for limiting user access to subsets of available data. Most big data environments now offer integration with identity stores, along with role-based facilities to divide up data access between groups of users. That said, authentication and authorization require cooperation between the application designer and the IT team managing the cluster. Leveraging existing Active Directory or LDAP services helps tremendously with defining user identities, and pre-defined roles may be available for limiting access to sensitive data. Data at rest protection: The standard for protecting data at rest is encryption, which protects against attempts to access data outside established application interfaces. With Hadoop systems we worry about people stealing archives or directly reading files from disk. Encrypted files are protected against access by users without encryption keys. Replication effectively replaces backups for big data, but beware a rogue administrator or cloud service manager creating their own backups. Encryption limits how data can be copied from the cluster. Unlike 2012, where the lack of suitable encryption was a serious issue. Apache offers HDFS encryption as an option; this is a major advance, but remember that you can only encrypt HDFS, and you’ll need to fill the gaps with key management and key storage. Several commercial Hadoop vendors offer transparent encryption, and third parties have advanced the state of the art, with transparent encryption options for both both HDFS and non-HDFS on-disk formats, especially coupled with parallel progress in key management. Inter-node communication: Hadoop and the vast majority of distributions (Cassandra, MongoDB, Couchbase, etc.) don’t communicate securely by default – they use unencrypted RPC over TCP/IP. TLS and SSL are bundled in big data distributions, but not typically used between applications and databases – and almost never for inter-node communication. This leaves data in transit, and application queries, accessible for inspection and tampering. Client interaction: Clients interact with resource managers and nodes. While gateway services can be created to load data, clients communicate directly with both resource managers and individual data nodes. Compromised clients can send malicious data or links to either service. This facilitates efficient communication but makes it difficult to protect nodes from clients, clients from nodes, and even name servers from nodes. Worse, the distribution of self-organizing nodes is a poor fit for security tools such as gateways, firewalls, and monitors. Many security tools are designed to require a choke-point or span port, which may not be available in a peer-to-peer mesh cluster. Distributed nodes: One of the reasons big data makes sense is an old truism: “moving computation is cheaper than moving data”. Data is processed wherever resources are available, enabling massively parallel computation. Unfortunately this produces complicated environments with lots of attack surface. With so many moving parts, it is difficult to verify consistency or security across a highly distributed cluster of (possibly heterogeneous) platforms. Patching, configuration management, node identity, and data at rest protection – and consistent deployment of each – are all issues. Threat-response models One or more security countermeasures are available to mitigate each threat identified above. The following diagram shows which specific options you have at your disposal to help you choose a ‘preventative’ security measure. We don’t have room to go into much detail on the tradeoffs of each response – each area really deserves its own paper. But we do want to mention a couple areas where we have seen the most change since our original research four years ago. If your goal is to protect session privacy – either between clients and data nodes, or for inter-node communication – Transport Layer Security (TLS) is your first choice. This was unheard of in 2012, but since then about 25% of the companies we spoke with have implemented SSL or TLS for inter-node communication – not just between applications and name servers. Transport encryption protects all communications from access or modification by attackers. Some firms instead use network segmentation and firewalls to ensure that attackers cannot access network traffic. This approach is less robust but much easier to implement. Some clusters were deployed to third-party cloud services, where virtualized network services make sniffing nearly impossible; these companies typically chose not to encrypt internal cluster communications. Enforcing data usage is one of the areas where we have seen the most progress, thanks to database links into existing Active Directory and LDAP identity stores. This seems obvious now but was a rarity in 2012, when data architects were focused on scalability and getting basic analytics up and running. Fortunately support for linking identity stores to Hadoop clusters has advanced considerably, making it much easier to leverage existing roles and management infrastructure. But we also have other tools at our disposal. We don’t see it often, but a handful of organizations encrypt sensitive data elements at the application layer, so information is stored as encrypted elements. This way the application manages decryption and key management functions, and can offer additional controls over who can see which information. This is very secure, but must be designed in during application design and coded into the application from the beginning. Retrofitting application-layer encryption into an existing application and database stack is highly challenging at beast, which is why we also see wide usage of masking and redaction technologies – from both enterprise Hadoop vendors and third-party security vendors. These technologies offer fine control over which data is displayed to which users, and can be easily built into existing clusters to

Share:
Read Post
dinosaur-sidebar

Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
    Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factors into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
    Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
    For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
    Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.