Threat modeling, the cloud, and shared responsibility

An interesting aspect of threat modeling for the cloud is that the model must take into account shared responsibility models that are specific to each cloud provider and service.

If a key output of any threat modeling exercise is a set of identified threats, then the ideal state for any threat is that you eliminate it completely by way of design, engineering, or otherwise. Of course, the value of threat modeling is not only that you identify threats you can eliminate, but also that you make thoughtful decisions about how to deal with the remaining threats that you cannot.

A model for this surely exists, but I wanted something very simple that could be used to classify unavoidable threats based on the two primary means of mitigating them: technical controls and monitoring. This has obvious utility when threat modeling applications built atop cloud platforms, but is similarly applicable when building atop any platform that you don’t fully control (hint: this is virtually all of them).


In this model, threats end up in one of three states:

  • Green, where both technical controls and monitoring are available. This is as good as it gets for a threat you can’t eliminate outright. Of course, if controls are available, it’s worth asking whether they can be implemented in such a way (e.g., by using restrictive defaults or policies) that the threat is eliminated and thus removed from this grid entirely.
  • Yellow, which is probably the most common. In this state, you’re able to rely on either technical controls or monitoring, but not both. The trick with relying solely on monitoring to mitigate a threat is that monitoring is only an effective mitigation when coupled with detection (knowing the threat occurred) and response (doing something about it).
  • Red, where neither is available. This should leave you questioning your design, your cloud provider, or both. In particular, threats in this state require putting significant trust in both your cloud provider and the security inherent to their platform, as well as your ability to engineer for safety.
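The three states above can be expressed as a tiny classifier. This is a minimal sketch, assuming that Green means both technical controls and monitoring are available, Yellow means exactly one of them is, and Red means neither is. The names `ThreatState` and `classify_threat` are hypothetical and exist only for illustration.

```python
from enum import Enum


class ThreatState(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def classify_threat(has_controls: bool, has_monitoring: bool) -> ThreatState:
    """Classify a threat that cannot be eliminated outright.

    Assumption: the grid has two axes, technical controls and monitoring.
    """
    if has_controls and has_monitoring:
        return ThreatState.GREEN   # both means of mitigation are available
    if has_controls or has_monitoring:
        return ThreatState.YELLOW  # exactly one is available
    return ThreatState.RED         # neither: you are trusting the platform


print(classify_threat(has_controls=False, has_monitoring=True))  # ThreatState.YELLOW
```

Keeping the inputs as two booleans is deliberate: anything more nuanced (partial controls, monitoring without response) probably belongs in the notes attached to the threat rather than in the classification itself.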


Roundup of threat modeling resources

I don’t know much about threat modeling, except that as long as I’ve been working in cybersecurity, we’ve been asking people about their threat model, telling them to do threat modeling, and in the worst cases using greedy threat models to convince folks that they should prioritize things that they probably should not.

I believe that, like most things in this industry, there’s a very approachable and “plenty good enough” version of threat modeling that would benefit a good enough percentage of organizations. In an effort to figure that out, I came across this handful of resources covering core threat modeling concepts and frameworks, real-world threat models, and tools you might find useful.

Learning about threat modeling

Adam Shostack

The canonical reference is Threat Modeling: Designing for Security by Adam Shostack. If you’re interested in threat modeling, you should read it. I’ve tried to read or page through a couple of other books, and frankly none of them compare.

Shostack also provides a beginner’s guide, which is less instructive and more of an overview of the purpose of threat modeling and a roundup of models and methodologies. This guide perfectly frames why we threat model by way of the specific questions that we should be trying to answer, using his Four Question Framework:

  • What are we working on?
  • What can go wrong?
  • What are we going to do about it?
  • Did we do a good job?

If you’re in search of “plenty good enough” threat modeling, answering these four questions relative to your subject system or application is a really great place to start.

Kevin Riggle, for Increment (Stripe)

The title and introduction to An introduction to approachable threat modeling say a lot:

Threat modeling is one of the most important parts of the everyday practice of security, at companies large and small. It’s also one of the most commonly misunderstood. Whole books have been written about threat modeling, and there are many different methodologies for doing it, but I’ve seen few of them used in practice. They are usually slow, time-consuming, and require a lot of expertise.

Riggle proposes a four-question framework based on Principals, Goals, Adversities, and Invariants:

  • What is the system, and who cares about it?
  • What does it need to do?
  • What bad things can happen to it through bad luck, or be done to it by bad people?
  • What must be true about the system so that it will still accomplish what it needs to accomplish, safely, even if those bad things happen to it?

The examples are few, clear, and illustrative.
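If you want to capture the answers to Riggle’s four questions somewhere more durable than a whiteboard, a simple record is probably enough. This is a minimal sketch, not anything from the article itself: the class name, field types, and the `unanswered` helper are all hypothetical, and the sample values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class ApproachableThreatModel:
    """One field per question: Principals, Goals, Adversities, Invariants."""
    principals: list[str] = field(default_factory=list)   # who cares about the system
    goals: list[str] = field(default_factory=list)        # what it needs to do
    adversities: list[str] = field(default_factory=list)  # bad luck and bad people
    invariants: list[str] = field(default_factory=list)   # what must stay true

    def unanswered(self) -> list[str]:
        """Return the names of the questions that have no answers yet."""
        return [name for name, answers in vars(self).items() if not answers]


# Hypothetical, partially completed model for a file-sharing service.
model = ApproachableThreatModel(
    principals=["customers", "support staff"],
    goals=["store and share files"],
)
print(model.unanswered())  # ['adversities', 'invariants']
```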

Additional learning resources

Threat modeling examples

I learn best by example, and I’ve found that threat modeling examples–the outputs of the process–can be hard to come by.

Related: I threw together a free threat modeling template based on some of these examples.

AWS KMS, by Costas Kourmpoglou / Airwalk Reply

This excellent AWS KMS Threat Model leverages security model and product documentation from AWS and identifies a tabular list of threats.


For decision-making purposes, this flowchart makes it easy to determine which key management option you should choose based on standard trust boundaries:


Google Cloud Storage, by NCC Group

One of the best cloud infrastructure (IaaS or PaaS) threat models I’ve seen is by Ken Wolstencroft and the folks at NCC Group: Threat Modelling Cloud Platform Services by Example: Google Cloud Storage. This is a good reference for anyone looking to implement a STRIDE-based threat model.


AWS ECS Fargate threat model by Sysdig

This ECS Fargate threat model by Sysdig provides output in the form of a table:

  • Assets
  • Threats
  • Security controls
  • Mitigations

Cloud environments are often a rat’s nest of services, so I like the idea of being able to “stack” threat models, and do things like identify a unique set of security controls or mitigations.
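To make the “stacking” idea concrete, here is a minimal sketch that combines per-service threat models and pulls out the unique set of security controls. It assumes each model is a list of rows with the four columns above. The `ThreatEntry` type, the helper functions, and every sample row are hypothetical and are not taken from the Sysdig or Airwalk Reply models.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ThreatEntry:
    """One row of a tabular threat model."""
    asset: str
    threat: str
    controls: frozenset[str] = field(default_factory=frozenset)
    mitigations: frozenset[str] = field(default_factory=frozenset)


def stack(*models: list[ThreatEntry]) -> list[ThreatEntry]:
    """Combine several per-service threat models into a single list of rows."""
    return [entry for model in models for entry in model]


def unique_controls(entries: list[ThreatEntry]) -> list[str]:
    """Return the de-duplicated, sorted set of controls across all rows."""
    return sorted({control for entry in entries for control in entry.controls})


# Hypothetical rows, for illustration only.
container_model = [
    ThreatEntry("container image", "malicious image pulled",
                frozenset({"image scanning", "least-privilege IAM"})),
]
key_model = [
    ThreatEntry("encryption key", "unauthorized key use",
                frozenset({"least-privilege IAM", "key policy"})),
]

print(unique_controls(stack(container_model, key_model)))
# ['image scanning', 'key policy', 'least-privilege IAM']
```

The same approach works for mitigations; controls that show up across several stacked models are good candidates to implement first.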

TLS v1.2 threat model

This may be one of the most concise representations of a threat model available, and is an excellent example of an ideal final output of a threat modeling exercise: a prose explanation of the purpose of the application, assumptions about operators and attackers, and how the application design addresses threats (which may include acknowledging a threat but not attempting to mitigate it at all).

The TLS protocol is designed to establish a secure connection between a client and a server communicating over an insecure channel. This document makes several traditional assumptions, including that attackers have substantial computational resources and cannot obtain secret information from sources outside the protocol. Attackers are assumed to have the ability to capture, modify, delete, replay, and otherwise tamper with messages sent over the communication channel.

See RFC 5246 - The Transport Layer Security (TLS) Protocol Version 1.2, Appendix F (Security Analysis) for the remainder of the analysis.

Enterprise AI search tools

Kane Narraway does a great job easing into a threat model for enterprise AI search by starting with key tradeoffs and decisions, and then presenting an organized threat model and corresponding diagram.

Other threat modeling examples

Notably, there are some threat models that have been codified as their own RFCs:

And these are collections of a wide variety of threat models and threat model formats:

Tools for threat modeling

And of course, everyone loves to software their way out of or through a process problem, so here’s some software that came recommended.

General purpose tools

End-to-end tools for threat modeling.

Diagramming and visualization tools

Useful for diagramming, visualizing, and collaborating on threat models.

Templates

Useful for capturing the output of threat modeling exercises.

Alternatives to threat modeling

One thing that’s apparent from all of the above is that there’s no established means of threat modeling, and certainly no “right” way. Considering threats and making purposeful decisions regarding whether or how you’ll mitigate them is better than doing nothing, and more important than any set of processes or artifacts. That said, using an established process and repeating it will mean that you mature your understanding of both the systems and the threats, so it’s worth doing consistently and well.

Mozilla Rapid Risk Assessment (RRA)

Of course, sometimes what you need is a fast, good enough risk assessment. The Mozilla Rapid Risk Assessment (RRA) process is intended to be very fast (~30 minutes).

One thing that I appreciate is the simplicity of their list of issues and corresponding impacts. When defining any type of risk or incident management policy, it’s common to get hung up on questions like “what’s the difference between medium and high impact?” This list is a convenient shortcut and will speed up risk classification and determination of impact.


Information asymmetry is the root cause of every breach

Information asymmetry is the root cause of every breach. Your adversary knows something that you do not.

For instance, your adversary might discover:

  • A remote management system exposed to the Internet that you haven’t adequately protected
  • An account that uses weak, legacy authentication (e.g., an account lacking multi-factor authentication)
  • A software vulnerability that you don’t know about, or know about but haven’t mitigated through patching or otherwise

Of course, this aspect of asymmetry doesn’t just apply to weaknesses on the target side. Your adversary likely has some level of specialization that works to their advantage, and that you can’t effectively predict. This might include deep understanding of:

  • One or more particular pieces of software
  • The inner workings of underlying compute and/or infrastructure platforms
  • The limitations of security controls or mitigations

Apex adversaries often have a deep bench, able to tap individuals or teams with specialization in myriad technical, operational, and other disciplines. You should assume that these adversaries either check every box on the lists above, or have the resources to do so.

Information asymmetry applies to intrusion operations, too.

Your adversary has the benefit of initiative. In addition to knowing technical, operational, and other aspects of intrusion operations, they are able to:

  • Plan and coordinate amongst themselves and their partners
  • Be as patient as needed
  • Balance things like the urgency of their operations against their risk tolerance (where risks are things like outright failure, attribution, and more)

So what’s the good news? If you can identify points of asymmetry, you can counter them.

We love to tell people to “think like an adversary”. Thinking like an adversary isn’t really a thing. It’s a few things, coupled with intent. But most of what adversaries are doing is looking for points of asymmetry and exploiting them.

Look at the lists above, and then consider how you might approach the same points of asymmetry, but to a different end. Selected examples:

  • Double down on understanding your attack surface, particularly the subset that is most exposed
  • Provide strong identity protection, at virtually any acceptable cost
  • Patch your software, focusing first on vulnerabilities that have been actively exploited (don’t forget that your security controls are attack surface, too!)

Identify areas where asymmetry exists, and do a little work to try to learn the things that an adversary would seek to learn.

Get there first.

The top initial access vectors in 2022, mapped to ATT&CK

This data is from 2022. For data from 2023 (the most recent), please go HERE.

A subset of security firms’ 2022 threat reports include insight into the initial access vectors leveraged most frequently in successful intrusions. This is a summary of findings based on their reporting.


Rank  MITRE ATT&CK Technique ID  Vector                             Percentage
1     T1566                      Phishing                           42.9%
2     T1190                      Exploit Public-Facing Application  31.7%
3     T1189                      Drive-by Compromise                9.5%
3     T1078                      Valid Accounts                     9.5%
4     T1133                      External Remote Services           4.8%
5     T1195                      Supply Chain Compromise            1.6%

Methodology

To determine the most prevalent initial access techniques leveraged by adversaries in 2022, I relied on data from the following reports:

Because not all of these reports use a standard taxonomy, reported vectors were mapped to the corresponding MITRE ATT&CK Initial Access parent technique.

As with all threat reports, the findings and prevalence are subject to each firm’s visibility and methodology.
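The normalization step described above can be sketched roughly as follows. This is a minimal illustration and not the actual process or data behind the table: the vendor labels in `LABEL_TO_TECHNIQUE` and the sample observations are hypothetical, while the ATT&CK technique IDs are the real Initial Access identifiers.

```python
from collections import Counter

# Hypothetical vendor labels mapped to ATT&CK Initial Access techniques.
LABEL_TO_TECHNIQUE = {
    "phishing": "T1566",
    "spearphishing attachment": "T1566.001",
    "exploited vulnerability": "T1190",
    "drive-by": "T1189",
    "stolen credentials": "T1078",
    "exposed remote service": "T1133",
    "supply chain": "T1195",
}


def parent_technique(technique_id: str) -> str:
    """Collapse a sub-technique such as T1566.001 to its parent, T1566."""
    return technique_id.split(".", 1)[0]


def vector_percentages(observations: list[str]) -> dict[str, float]:
    """Share of observations per parent technique, most common first."""
    counts = Counter(
        parent_technique(LABEL_TO_TECHNIQUE[label.lower()])
        for label in observations
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tid: round(100 * n / total, 1) for tid, n in counts.most_common()}


# Made-up observations, for illustration only.
sample = ["Phishing", "Spearphishing attachment",
          "Exploited vulnerability", "Stolen credentials"]
print(vector_percentages(sample))
# {'T1566': 50.0, 'T1190': 25.0, 'T1078': 25.0}
```

An unknown label raises a `KeyError` here on purpose: when reports use different taxonomies, a label that has not been mapped should be reviewed by hand rather than silently dropped.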

How to use this information

From my earlier thoughts on this matter:

A good use case for these types of lists–and a way to make them actionable–is to look at tactics starting with initial access and progressing through the intrusion lifecycle. For each tactic, look for common vectors and MITRE ATT&CK techniques (some of this is readily available in the source reports below). The goal is to see whether we can glean good enough insights and do it quickly, assess risks, and take preventative measures.

A company with a formal org chart is a company big enough to have an informal org chart that accurately describes how things actually get done.

From Byrne Hobart at The Diff, a list of scaling bottlenecks often encountered by startups: Recruiting, decision-making, management, and more.

[O]ne of the biggest scaling bottlenecks ends up being the related problems of asymmetric information and decision fatigue. In a small company with a flat corporate hierarchy, information travels fast. If people are working long hours in close proximity, it’s almost impossible for anything to stay secret. And if everyone’s either a founder or a direct report of one, there isn’t much room for politics. A company with a formal org chart is a company big enough to have an informal org chart that accurately describes how things actually get done. Whether this is described as “politics” or as “effective” partly depends on people’s relative positions in both. And that adds an inescapable tax to growth: more people means more conflicting interests, and more cases where the right choice for the company as a whole conflicts with the right choice for individuals.

On delegation in particular, very much a limiting factor for learning and growth:

Effective delegation can be best defined in negative terms: a manager is not delegating unless their subordinates at least occasionally make exactly the opposite decision from the one their manager would have made.