Securosis

Research

Firestarter: 2019: Insert Winter is Coming Meme Here

In this year-end/start Firestarter the gang jumps into our expectations for the coming year. Spoiler alert: the odds are that some consolidation and contraction in security markets is impending… and not just because the Chinese are buying fewer iPhones.


Quick Wins with Data Guardrails and Behavioral Analytics

This is the third (and final) post in our series on Protecting What Matters: Introducing Data Guardrails and Behavioral Analytics. Our first post, Introducing Data Guardrails and Behavioral Analytics: Understand the Mission, introduced the concepts and outlined the major categories of insider risk. In the second post we defined the terms. As we wrap up the series, we’ll bring it together with a quick scenario showing how these concepts apply to a simple example.

Our example company is a small pharmaceutical company. As with all pharma companies, much of its value lies in intellectual property, which makes that the most significant target for attackers. Thanks to fast growth and a highly competitive market, the business isn’t waiting for perfect infrastructure and controls before launching products and forming partnerships. As a new company without legacy infrastructure (or mindset), it has built a majority of its infrastructure in the cloud and takes a cloud-first approach. In fact, the CEO has been recognized for their innovative use of cloud-based analytics to accelerate the process of identifying new drugs.

As excited as the CEO is about these new computing models, the board is very concerned about both external attacks and insider threats, as the company’s proprietary data resides in dozens of service providers. So the security team feels pressure to do something to address the issue. The CISO is very experienced, but is still coming to grips with the changes in mindset, controls, and operational motions inherent to a cloud-first approach. Defaulting to the standard data security playbook represents the path of least resistance, but she’s savvy enough to know that would create significant gaps in both visibility and control of the company’s critical intellectual property.

Using Data Guardrails and Data Behavioral Analytics presents an opportunity both to define a hard set of policies for data usage and protection, and to watch for anomalous behaviors potentially indicating malicious intent. So let’s see how she would lead her organization through a process to define Data Guardrails and Behavioral Analytics.

Finding the Data

As we mentioned in the previous post, what’s unique about data guardrails and behavioral analytics is combining content knowledge (classification) with context and usage. Thus, the first step is classifying the sensitive data within the enterprise. This involves undertaking an internal discovery of data resources. The technology to do this is mature and well understood, although the team needs to ensure discovery extends to cloud-based resources. Additionally, they need to talk to the senior leaders of the business to make sure they understand how business strategy impacts application architecture and therefore the location of sensitive data.

Internal private research data and clinical trials make up most of the company’s intellectual property. This data can be both structured and unstructured, complicating the discovery process. This is somewhat eased because the organization has embraced cloud storage to centralize the unstructured data and uses SaaS wherever possible for front office functions. Many of the emerging analytics use cases remain a challenge to protect, given the relatively immature operational processes in their cloud environments.

As with everything else in security, visibility comes before control, and this discovery and classification process needs to be the first thing done to get the data security process moving. To be clear, having a lot of the data in a cloud service addressable via an API doesn’t help keep the classification data current. This remains one of the bigger challenges to data security, and as such requires specific activities (and the associated resources allocated) to keep the classification up to date as the process rolls into production.

Defining Data Guardrails

As we’ve mentioned previously, guardrails are rule sets that keep users within the lines of authorized activity. Thus, the CISO starts by defining the authorized actions and then enforcing those policies where the data resides. For simplicity’s sake, we’ll break the guardrails into three main categories:

  • Access: These guardrails have to do with enforcing access to the data. For instance, files relating to recruiting participants in a clinical trial need to be heavily restricted to the group tasked with recruitment. If someone were to open up access to a broader group, or perhaps tag the folder as public, the guardrail would remove that access and restrict it to the proper group.
  • Action: She will also want to define guardrails on who can do what with the data. It’s important to prevent someone from deleting data or copying it out of the analytics application, so these guardrails ensure the integrity of the data by preventing misuse, whether intentional/malicious or accidental.
  • Operational: The final category of guardrails controls the operational integrity and resilience of the data. Enterprising data scientists can load up new analytics environments quickly and easily, but may not take the necessary precautions to ensure data backup or required logging/monitoring happens. Guardrails to implement automatic backups and monitoring can be set up as part of every new analytics environment.

The key in designing guardrails is to think of them as enablers, not blockers. The effectiveness of exception handling is typically the difference between success and failure in implementing guardrails. To illuminate this, let’s consider a joint venture the organization has with a smaller biotech company. A guardrail exists to restrict access to the data related to this product to a group of 10 internal researchers. Yet clearly researchers from the joint venture partner need access as well, so you’ll need to expand the access rules of the guardrail. But you also may want to enforce multi-factor authentication on those external users or possibly implement a location guardrail to restrict external access to only IP addresses within the partner’s network.

As you can see, you have a lot of granularity in how you deploy the guardrails. But stay focused on getting quick wins up front, so don’t try to boil the
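
To make the joint venture exception concrete, here is a minimal sketch of how such an access guardrail might be expressed as a rule. Everything in it is an assumption for illustration: the group names, the partner network range (a documentation-only address block), and the decision strings are hypothetical, and a real deployment would enforce this in the cloud provider's or identity provider's policy engine rather than in application code.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network


@dataclass
class AccessRequest:
    user: str
    groups: set            # group memberships, e.g. from the identity provider
    source_ip: str
    mfa_verified: bool


@dataclass
class AccessGuardrail:
    internal_group: str                                   # hypothetical group name
    partner_group: str                                    # hypothetical group name
    partner_networks: list = field(default_factory=list)  # CIDR strings

    def evaluate(self, req: AccessRequest) -> str:
        # Internal researchers keep the access they already had.
        if self.internal_group in req.groups:
            return "allow"
        # Partner researchers are the exception: allowed, but only with MFA
        # and only from the partner's own network.
        if self.partner_group in req.groups:
            if not req.mfa_verified:
                return "require_mfa"
            addr = ip_address(req.source_ip)
            if any(addr in ip_network(net) for net in self.partner_networks):
                return "allow"
            return "deny_location"
        # Everyone else stays outside the guardrail.
        return "deny"


guardrail = AccessGuardrail(
    internal_group="jv-researchers",
    partner_group="jv-partner",
    partner_networks=["203.0.113.0/24"],
)
decision = guardrail.evaluate(
    AccessRequest("partner-user", {"jv-partner"}, "203.0.113.9", mfa_verified=True)
)
```

Returning "require_mfa" rather than a flat denial keeps with the idea of guardrails as enablers: the partner researcher is told how to get back inside the lines instead of simply being blocked.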


Firestarter: re:Invent Security Review

It’s that time of year again. The time when Amazon takes over our lives. No, not the holiday shopping season but the annual re:Invent conference, where Amazon Web Services takes over Las Vegas (really, all of it) and dumps a firehose of updates on the world. Listen in to hear our take on new services like Transit Hub, Security Hub, and Control Tower.


DisruptOps: Something You Probably Should Include When Building Your Next Threat Models

We are working on our threat modeling here at DisruptOps and I decided to refresh my knowledge of different approaches. One thing that quickly stood out is that almost none of the threat modeling documentation or tools I’ve seen cover the CI/CD pipeline. Read the full post at DisruptOps


DisruptOps: Three of the Most Crucial Sections of the DevSecOps Roadmap

As I mentioned in the (DevSec)Ops vs. Dev(SecOps) post, we’ve been traveling around to a couple of DevOpsDays conferences doing the Quick and Dirty DevSecOps talk. One of the things I tend to start with early in the talk is that, like DevOps, DevSecOps is not a product, or something you can deploy and forget. It’s a cultural change. It’s a process. It’s a journey. Read the full post at DisruptOps


Protecting What Matters: Defining Data Guardrails and Behavioral Analytics

This is the second post in our series on Protecting What Matters: Introducing Data Guardrails and Behavioral Analytics. Our first post, Introducing Data Guardrails and Behavioral Analytics: Understand the Mission, introduced the concepts and outlined the major categories of insider risk. This post defines the concepts.

Data security has long been the most challenging domain of information security, despite being the centerpiece of our entire practice. We only call it “data security” because “information security” was already taken. Data security must not impede use of the data itself. By contrast it’s easy to protect archival data (encrypt it and lock the keys up in a safe). But protecting unstructured data in active use by our organizations? Not so easy. That’s why we started this research by focusing on insider risks, including external attackers leveraging insider access. Recognizing someone performing an authorized action, but with malicious intent, is a nuance lost on most security tools.

How Data Guardrails and Data Behavioral Analytics are Different

Both data guardrails and data behavioral analytics strive to improve data security by combining content knowledge (classification) with context and usage.

  • Data guardrails leverage this knowledge in deterministic models and processes to minimize the friction of security while still improving defenses. For example, if a user attempts to make a file in a sensitive repository public, a guardrail could require them to record a justification and then send a notification to Security to approve the request. Guardrails are rule sets that keep users “within the lines” of authorized activity, based on what they are doing.
  • Data behavioral analytics extends the analysis to include current and historical activity, and uses tools such as artificial intelligence/machine learning and social graphs to identify unusual patterns which bypass other data security controls. Analytics reduces these gaps by looking not only at content and simple context (as DLP might), but also adding in history of how that data, and data like it, has been used within the current context. A simple example is a user accessing an unusual volume of data in a short period, which could indicate malicious intent or a compromised account. A more complicated situation would identify sensitive intellectual property on an accounting team device, even though they do not need to collaborate with the engineering team. This higher order decision making requires an understanding of data usage and connections within your environment.

Central to these concepts is the reality of distributed data actively used widely by many employees. Security can’t effectively lock everything down with strict rules covering every use case without fundamentally breaking business processes. But with integrated views of data and its intersection with users, we can build data guardrails and informed data behavioral analytical models, to identify and reduce misuse without negatively impacting legitimate activity. Data guardrails enforce predictable rules aligned with authorized business processes, while data behavioral analytics look for edge cases and less predictable anomalies.

How Data Guardrails and Data Behavioral Analytics Work

The easiest way to understand the difference between data guardrails and data behavioral analytics is that guardrails rely on pre-built deterministic rules (which can be as simple as “if this then that”), while analytics rely on AI, machine learning, and other heuristic technologies which look at patterns and deviations.

To be effective both rely on the following foundational capabilities:

  • A centralized view of data. Both approaches assume a broad understanding of data and usage – without a central view you can’t build the rules or models.
  • Access to data context. Context includes multiple characteristics including location, size, data type (if available), tags, who has access, who created the data, and all available metadata.
  • Access to user context, including privileges (entitlements), groups, roles, business unit, etc.
  • The ability to monitor activity and enforce rules. Guardrails, by nature, are preventative controls which require enforcement capabilities. Data behavioral analytics can be used only for detection, but are far more effective at preventing data loss if they can block actions.

The two technologies then work differently while reinforcing each other:

Data guardrails are sets of rules which look for specific deviations from policy, then take action to restore compliance. To expand our earlier example:

  • A user shares a file located in cloud storage publicly. Let’s assume the user has the proper privileges to make files public. The file is in a cloud service so we also assume centralized monitoring/visibility, as well as the capability to enforce rules on that file.
  • The file is located in an engineering team’s repository (directory) for new plans and projects. Even without tagging, this location alone indicates a potentially sensitive file.
  • The system sees the request to make the file public, but because of the context (location or tag), it prompts the user to enter a justification to allow the action, which gets logged for the security team to review. Alternatively, the guardrail could require approval from a manager before allowing the action.

Guardrails are not blockers because the user can still share the file. Prompting for user justification both prevents mistakes and loops in security review for accountability, allowing the business to move fast while minimizing risk. You could also look for large file movements based on pre-determined thresholds. A guardrail would only kick in if the policy thresholds are violated, and then use enforcement actions aligned with business processes (such as approvals and notifications) rather than simply blocking activity and calling in the security goons.

Data behavioral analytics use historical information and activity (typically with training sets of known-good and known-bad activity) to produce artificial intelligence models which identify anomalies. We don’t want to be too narrow in our description, because there are a wide variety of approaches to building models. Historical activity, ongoing monitoring, and ongoing modeling are all essential – no matter the mathematical details. By definition we focus on the behavior of data as the core of these models, rather than user activity; this represents a subtle but critical distinction from User Behavioral Analytics (UBA). UBA tracks activity on a per-user basis. Data behavioral analytics (the acronym DBA is already taken, so we’ll
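
The contrast between the two approaches can be sketched in a few lines of code. This is only an illustration under stated assumptions: the repository path, tag name, and return values are hypothetical, and a plain z-score against a data set's own access history stands in for the much richer models (machine learning, social graphs) that real data behavioral analytics would use.

```python
from statistics import mean, stdev

# Hypothetical location of the engineering repository for new plans and projects.
SENSITIVE_PREFIXES = ("/engineering/plans/",)


def share_guardrail(path, tags, justification=None):
    """Deterministic rule: a public share of sensitive data needs a justification."""
    sensitive = path.startswith(SENSITIVE_PREFIXES) or "sensitive" in tags
    if not sensitive:
        return "allow"
    if justification:
        return "allow_and_log"          # the justification is logged for security review
    return "prompt_for_justification"   # the user can still proceed after answering


def volume_anomaly(history_mb, today_mb, threshold=3.0):
    """Heuristic: flag access volume far outside this data set's own history."""
    if len(history_mb) < 2:
        return False                    # not enough history to form a baseline
    mu, sigma = mean(history_mb), stdev(history_mb)
    if sigma == 0:
        return today_mb != mu
    return abs(today_mb - mu) / sigma > threshold


first_try = share_guardrail("/engineering/plans/roadmap.pdf", set())
second_try = share_guardrail("/engineering/plans/roadmap.pdf", set(), "partner review")
suspicious = volume_anomaly([100, 110, 90, 105, 95], 900)
```

Note that the history passed to `volume_anomaly` belongs to the data set, not to a user, which mirrors the distinction from UBA drawn above. The guardrail's answer is fully predictable from the rule; the analytic's answer depends on what the data's history looks like.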


Building a Multi-cloud Logging Strategy: Issues and Pitfalls

As we begin our series on Multi-cloud logging, we start with reasons some traditional logging approaches don’t work. I don’t like to start with a negative tone, but we need to point out some challenges and pitfalls which often beset firms on first migration to cloud. That, and it helps frame our other recommendations later in this series. Let’s take a look at some common issues by category.

Tooling

  • Scale & Performance: Most log management and SIEM platforms were designed and first sold before anyone had heard of clouds, Kafka, or containers. They were architected for ‘hub-and-spoke’ deployments on flat networks, when ‘Scalability’ meant running on a bigger server. This is important because the infrastructure we now monitor is agile – designed to auto-scale up when we need processing power, and back down to reduce costs. The ability to scale up, down, and out is essential to the cloud, but often missing from older logging products which require manual setup and lack full API enablement and auto-scale capability.
  • Data Sources: We mentioned in our introduction that some common network log sources are unavailable in the cloud. Conversely, as automation and orchestration of cloud resources are via API calls, API logs become an important source. Data formats for these new log sources may change, as do the indicators used to group events or users within logs. For example, servers in auto-scale groups may share a common IP address. But functions and other ‘serverless’ infrastructure are ephemeral, making it impossible to differentiate one instance from the next this way. So your tools need to ingest new types of logs, faster, and change their threat detection methods by source.
  • Identity: Understanding who did what requires understanding identity. An identity may be a person, service, or device. Regardless, the need to map it, and perhaps correlate it across sources, becomes even more important in hybrid and multi-cloud environments.
  • Volume: When SIEM first began making the rounds, there were only so many security tools and they were pumping out only so many logs. Between new security niches and new regulations, the array of log sources sending unprecedented amounts of logs to collect and analyze grows every year. Moving from traditional AV to EPP, for example, brings with it a huge log volume increase. Add in EDR logs and you’re really into some serious volumes. On the server side, moving from network and server logs to add application layer and container logs brings a non-trivial increase in volume. There are only so many tools designed to handle modern event rates (X billion events per day) and volumes (Y terabytes per day) without buckling under the load, and more importantly, there are only so many people who know how to deploy and operate them in production. While storage is plentiful and cheap in the cloud, you still need to get those logs to the desired storage from various on-premise and cloud sources – perhaps across IaaS, PaaS, and SaaS. If you think that’s easy, call your SaaS vendor and ask how to export all your logs from their cloud into your preferred log store (S3/ADLS/GCS/etc.). That old saw from Silicon Valley, “But does it scale?” is funny but really applies in some cases.
  • Bandwidth: While we’re on the topic of ridiculous volumes, let’s discuss bandwidth. Network bandwidth and transport layer security between on-premise and cloud, and between clouds, are non-trivial. There are financial costs, as well as engineering and operational considerations. If you don’t believe me, ask your AWS or Azure sales person how to move, say, 10 terabytes a day between those two. In some cases architecture only allows a certain amount of bandwidth for log movement and transport, so consider this when planning migrations and add-ons.

Structure

  • Multi-account Multi-cloud Architectures: Cloud security facilitates things like micro-segmentation, multi-account strategies, closing down all unnecessary network access, and even running different workloads in different cloud environments. This sort of segmentation makes it much more difficult for attackers to pivot if they gain a foothold. It also means you will need to consider which cloud native logs are available, what you need to supplement with other tooling, and how you will stitch all these sources together. Expecting to dump all your events into a syslog style service and let it percolate back on-premise is unrealistic. You need new architectures for log capture, filtering, and analysis. Storage is the easy part.
  • Monitoring “up the Stack”: As cloud providers manage infrastructure, and possibly applications as well, your threat detection focus must shift from networks to applications. This is both because you lack visibility into network operations, and because cloud network deployments are generally more secure, prompting attackers to shift focus. Even if you’re used to monitoring the app layer from a security perspective, for example with a big WAF in front of your on-premise servers, do you know whether your vendor has a viable cloud offering? If you’re lucky enough to have one that works in both places, and you can deploy in cloud as well, answer this (before you initiate the project): Where will those logs go, and how will you get them there?
  • Storage vs. Ingestion: Data storage in cloud services, especially object storage, is so cheap it is practically free. And long-term data archival cloud services offer huge cost advantages over older on-premise solutions. In essence we are encouraged to store more. But while storage is cheap, it’s not always cheap to ingest more data into the cloud, because some logging and analytics services charge based upon volume (gigabytes) and event rates (number of events) ingested into the tool/service/platform. Examples are Splunk, Azure Event Hubs, AWS Kinesis, and Google Stackdriver. Many log sources for the cloud are verbose – both in number of events and amount of data generated from each. So you will need to architect your solution to be economically efficient, as well as negotiate with your vendors over ingestion of noisy sources such as DNS and proxies, for example.

A brief side note on ‘closed’ logging pipelines: Some vendors want to own your logging pipeline on top of your analytics toolset. This may
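
The storage-versus-ingestion trade-off lends itself to a back-of-the-envelope model. The sketch below is an assumption-laden illustration, not any vendor's pricing: the source names, daily volumes, and both unit prices are made-up placeholders, and a given service may charge on volume, on event counts, or on both (set the unused price to zero).

```python
from dataclasses import dataclass


@dataclass
class LogSource:
    name: str
    gb_per_day: float
    events_per_day: int
    send_to_analytics: bool   # False means archive to object storage only


def daily_ingest_cost(sources, price_per_gb, price_per_million_events):
    """Sum volume- and event-based charges for sources sent to the analytics tool."""
    cost = 0.0
    for s in sources:
        if not s.send_to_analytics:
            continue   # archived sources incur no ingestion charge in this model
        cost += s.gb_per_day * price_per_gb
        cost += s.events_per_day / 1_000_000 * price_per_million_events
    return cost


# Placeholder numbers, chosen only to show the shape of the calculation.
sources = [
    LogSource("cloud-api", gb_per_day=10, events_per_day=5_000_000, send_to_analytics=True),
    LogSource("dns", gb_per_day=200, events_per_day=400_000_000, send_to_analytics=False),
]
filtered = daily_ingest_cost(sources, price_per_gb=2.0, price_per_million_events=0.5)

sources[1].send_to_analytics = True
unfiltered = daily_ingest_cost(sources, price_per_gb=2.0, price_per_million_events=0.5)
```

Comparing `filtered` with `unfiltered` for your own sources and contract rates is one way to decide which noisy feeds, such as DNS and proxies, to route straight to cheap storage and which to negotiate with the vendor.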


DisruptOps: The 4 Phases to Automating Cloud Management

A Security Pro’s Cloud Automation Journey

Catch me at a conference and the odds are you will overhear me saying “cloud security starts with architecture and ends with automation.” I quickly follow with how important it is to adopt a cloud native mindset, even when you’re bogged down with the realities of an ugly lift and shift before the data center contract ends and you turn the lights off. While that’s a nice quip, it doesn’t really capture anything about how I went from a meat and potatoes (firewall and patch management) kind of security pro to an architecture and automation cloud native. Rather than preaching from the mount, I find it more useful to describe my personal journey and my technical realizations along the way. If you’re a security pro, or someone trying to up-skill a security pro for cloud, odds are you will end up on a very similar path. Read the full post at DisruptOps


DAM Not Moving to the Cloud

I have concluded that nobody is using Database Activity Monitoring (DAM) in public Infrastructure or Platform as a Service. I never see it in any of the cloud migrations we assist with. Clients don’t ask about how to deploy it or if they need to close this gap. I do not hear stories, good or bad, about its usage. Not that DAM cannot be used in the cloud, but it is not.

There are certainly some reasons firms invest security time and resources elsewhere. What comes to mind are the following:

  • PaaS and use of Relational: There are a couple of trends which I think come into play. First, while user installed and managed relational databases do happen, there is a definite trend towards adopting RDBMS as a Service. If customers do install their own relational platform, it’s MySQL or MariaDB, for which (so far as I know) there are few monitoring options. Second, for most new software projects, a relational database is a much less likely choice to back applications – more often it’s a NoSQL platform like Mongo (self-managed) or something like Dynamo. This has reduced the total relational footprint.
  • CI/CD: Automated build and security test pipelines – we see a lot more application and database security testing in development and quality assurance phases, prior to production deployment. Many potential code vulnerabilities and common SQL injection attacks are being spotted and addressed prior to applications being deployed. And there may not be a lot of reconfiguration in production if your installation is defined in software.
  • Network Security: Between segmentation, firewalls/security groups, and port management you can really lock down the (virtual) network so only the application can talk to the database. That is difficult for anyone to end-run if properly set up.
  • Database Ownership: Some people cling to the misconception that the database is owned and operated by the cloud provider, so they will take care of database security. Yes, the vendor handles lots of configuration security and patching for you. Certainly much of the value of a DAM platform, namely security assessment and detection of old database versions, is handled elsewhere. Permission misuse is harder. Most IaaS clouds offer dynamic policy-driven IAM. You can set very fine-grained access controls on database access, so you can block many types of ad hoc and potentially malicious queries.

Maybe none of these reasons? Maybe all the above? I don’t really know. Regardless, DAM has not moved to the cloud. The lack of interest does not provide any real insights as to why, but it is very clear. I do still want some of DAM’s monitoring functions for cloud migrations, specifically looking for SQL injection attacks – which are still your issue to deal with – as well as looking for credential misuse, such as detecting too much data transfer or scraping. Cloud providers log API access to the database installation, and there are cloud-native ways to perform assessment. But on the monitoring side there are few other options for watching SQL queries.
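
As one illustration of the credential-misuse check described above, here is a minimal sketch that flags a database credential returning too many rows inside a sliding time window. It is not a DAM product feature: the class, its parameters, and the thresholds are hypothetical, and it assumes something in your stack (the application's data access layer or a proxy, for example) can report the credential, a timestamp, and the row count for each query, with timestamps arriving in non-decreasing order.

```python
from collections import defaultdict, deque


class TransferMonitor:
    """Track rows returned per credential over a sliding time window."""

    def __init__(self, max_rows, window_seconds):
        self.max_rows = max_rows
        self.window = window_seconds
        self.events = defaultdict(deque)   # credential -> deque of (timestamp, rows)
        self.totals = defaultdict(int)     # credential -> rows currently in the window

    def record(self, credential, timestamp, rows_returned):
        """Record one query result; return True if the credential is over the limit."""
        queue = self.events[credential]
        queue.append((timestamp, rows_returned))
        self.totals[credential] += rows_returned
        # Expire results that have aged out of the window.
        while queue and queue[0][0] <= timestamp - self.window:
            _, old_rows = queue.popleft()
            self.totals[credential] -= old_rows
        return self.totals[credential] > self.max_rows


monitor = TransferMonitor(max_rows=1000, window_seconds=60)
alerts = [
    monitor.record("app-service", 0, 400),
    monitor.record("app-service", 10, 400),
    monitor.record("app-service", 20, 400),   # 1,200 rows within 60 seconds
]
```

A fixed threshold like this is closer to a guardrail than to analytics; it catches bulk scraping by a stolen credential but says nothing about SQL injection, which still needs its own detection.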


DisruptOps: Consolidating Config Guardrails with Aggregators

In Quick and Dirty: Building an S3 guardrail with Config we highlighted that one of the big problems with Config is that you need to build it in all regions of all accounts separately. Your best bet to make that manageable is to use infrastructure as code tools like CloudFormation to replicate your settings across environments. We have a lot more to say on scaling out baseline security and operations settings, but for this post I want to highlight how to aggregate Config into a unified dashboard. Read the full post at DisruptOps


Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
  • Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factors into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
  • Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
  • For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
  • Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.