Observability as a function of your threat model

2 minute read

This model attempts to explain the relationship between visibility, observability, detection, and mitigation–four closely related concepts that are important to understand when implementing any information security or cybersecurity program.


Of course, having a model is not the same as having an implementation or an operational capability. And in implementing any program based on a model, it’s very easy to fall into the trap of using the model as a bingo card, treating each component equally and in doing so expending resources in a way that is inefficient or ineffective.

In considering how to approach these four components, it might help to bucket them based on:

  • Their primary inputs or drivers
  • Any limiting factors that they have in common


Visibility is a function of your attack surface. Some key inputs here include facilities, networks, infrastructure, and service providers. For each of these, you’re wise to invest in as much visibility as you can. Keep in mind that visibility is knowing that an asset exists; it doesn’t necessarily mean knowing everything about the asset or how it’s used, or actively monitoring it. You can get a lot of mileage out of good accounting related to identities, devices, applications, etc.
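
If it helps to make “good accounting” concrete, here’s a minimal sketch of reconciling asset exports from a few sources into a single visibility inventory. The inline data and column names are hypothetical stand-ins for whatever your identity provider, MDM, or cloud inventory actually exports.

```python
# Minimal sketch: merge asset exports from multiple sources into one visibility
# inventory, then flag assets that only one source knows about. The inline CSV
# data and column names are hypothetical stand-ins for real exports.
import csv
import io
from collections import defaultdict

exports = {
    "idp": "hostname\nweb-01\nlaptop-42\n",
    "mdm": "hostname\nlaptop-42\nlaptop-43\n",
    "cloud": "hostname\nweb-01\ndb-01\n",
}

inventory = defaultdict(set)  # asset -> sources that know about it
for source, data in exports.items():
    for row in csv.DictReader(io.StringIO(data)):
        asset = row["hostname"].strip().lower()
        if asset:
            inventory[asset].add(source)

for asset, seen_by in sorted(inventory.items()):
    marker = "" if len(seen_by) > 1 else "  <- visible to only one source"
    print(f"{asset}: {sorted(seen_by)}{marker}")
```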

Observability, detection, and mitigation are functions of your use cases. Use cases will include things like the needs of your operational teams, including technical and security operations. They may need more or less of these things in order to maintain systems, detect threats, investigate incidents, and generally mitigate threats or other types of problems.

Observability is the most interesting of the three, because there is so much debate (and marketing) surrounding the types and levels of observability that you need. Of course, a vendor that makes money when you feed it log, security, or other event data–or one who sells you products that produce those signals–is going to make a case that you need as much observability as you can afford. From a security standpoint, I would argue that observability should be driven largely by your threat model, and your understanding of where you need to be in order to observe, detect, and respond effectively to threats.

An implementation of this approach might look something like this:

1. Understand the threats that we face based on the technology that we use, the data that we have, and other factors.

   Sample resources: Industry threat reports are great sources of this information, including Red Canary’s Threat Detection Report, CrowdStrike’s Global Threat Report, and Mandiant’s M-Trends.

2. Identify the techniques that those threats leverage, which can be found using readily available industry reporting.

   Sample resources: Leverage the above coupled with other open source reporting, and use MITRE ATT&CK as a model for enumerating specific techniques: https://attack.mitre.org/techniques/enterprise/

3. From the relevant ATT&CK techniques, identify the data sources that lead to observability of those techniques (a rough sketch of this step follows the list). Pro tip: you’ll likely find that a small number of data sources have an outsized impact on observability coverage (i.e., a few data sources mean that you can see a great percentage of techniques).

   Sample resources: Each ATT&CK technique includes the specific data sources that can be used to observe it: https://attack.mitre.org/datasources/
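
As a rough sketch of that last step, assume you’ve already pulled the data sources for the ATT&CK techniques that matter to you (the mapping below is abbreviated and illustrative; derive the real one from the ATT&CK site or its STIX export). Ranking data sources by how many techniques they cover makes the “small number of data sources” effect obvious:

```python
# Minimal sketch: rank data sources by how many of your relevant ATT&CK
# techniques they help you observe. The mapping is abbreviated and illustrative;
# derive the real one from https://attack.mitre.org/ or the ATT&CK STIX data.
from collections import Counter

technique_data_sources = {
    "T1059 Command and Scripting Interpreter": ["Command Execution", "Process Creation"],
    "T1053 Scheduled Task/Job": ["Process Creation", "Scheduled Job Creation"],
    "T1078 Valid Accounts": ["Logon Session Creation", "User Account Authentication"],
    "T1566 Phishing": ["Application Log Content", "Network Traffic Content"],
}

coverage = Counter()
for technique, sources in technique_data_sources.items():
    for source in sources:
        coverage[source] += 1

total = len(technique_data_sources)
for source, count in coverage.most_common():
    print(f"{source}: observes {count}/{total} techniques ({count / total:.0%})")
```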

You can go wild with this implementation, but you can also keep it relatively simple and get great results. The guiding principles: ensure the broadest possible visibility, so that you understand your attack surface and the assets at risk; and implement the right level of observability to meet adversaries where they operate, which is where you need to be in order to detect, respond to, and mitigate related threats.

The future of cybersecurity might be insurance

3 minute read

For some time now, I’ve been considering a hypothesis that the future of cybersecurity is some form of vertically integrated set of products, services, and insurance. This won’t encompass emerging or niche cybersecurity products and services; rather, it will bring actuarial rigor to identifying and measuring the outcomes that cybersecurity vendors claim to provide, and so it will represent the subset of offerings that deliver consistent, provable value (i.e., things that reliably mitigate high-impact threats). The primary consumer benefit will be a faster path to a plenty-good-enough cybersecurity portfolio for a large percentage of organizations.

Enter the National Cybersecurity Strategy

With the release of the United States 2023 National Cybersecurity Strategy, there’s been much ado about this strategic objective:

STRATEGIC OBJECTIVE 3.6: EXPLORE A FEDERAL CYBER INSURANCE BACKSTOP

When catastrophic incidents occur, it is a government responsibility to stabilize the economy and provide certainty in uncertain times. In the event of a catastrophic cyber incident, the Federal Government could be called upon to stabilize the economy and aid recovery. Structuring that response before a catastrophic event occurs–rather than rushing to develop an aid package after the fact–could provide certainty to markets and make the nation more resilient. The Administration will assess the need for and possible structures of a Federal insurance response to catastrophic cyber events that would support the existing cyber insurance market. In developing this assessment, the Administration will seek input from, and consult with, Congress, state regulators, and industry stakeholders.

Equally important is the drive to improve development practices by limiting the degree to which vendors can absolve themselves of liability: “Any such legislation should prevent manufacturers and software publishers with market power from fully disclaiming liability by contract, and establish higher standards of care for software in specific high-risk scenarios.”

For those who have been following along at home, the Federal Government has been doing its homework and soliciting input on this for years. Notably, this research broadened with a call from the Government Accountability Office in June 2022, urging that “Treasury and Homeland Security jointly assess if a federal response is needed to address the situation.” This was followed by a formal request for comment from the United States Treasury in 2022, titled “Potential Federal Insurance Response to Catastrophic Cyber Incidents”. So, the announcements in the National Cybersecurity Strategy are important but not surprising.

Back to private industry

The cyber insurance industry has swung like a pendulum within the last decade. What was once cheap, plentiful, and underwritten with little rigor has led to substantial losses, and has relatively quickly become a web of sub-insurance requirements and narrowing coverage. So, there will be a lot of momentum and buzz about the prospect of the government stepping in to provide some stabilization. But there are also plenty of potential, or even likely, downsides, including:

  • Moral hazard - There’s been plenty of anecdotal correlation between the tightening market for cyber insurance and increased investment and attention by organizations in their cybersecurity posture. Knowing that the government will be there to bail them out–whether directly or through their insurers–may stymie this progress.
  • Absence of catastrophic economic risk - Directly from Lawfare: “There is no evidence that firms are halting online economic activity because of either low cyber insurance limits or the introduction of new war clauses. It is simply unthinkable that retail firms would shut down websites and rely on brick and mortar stores because of changes in cyber insurance coverage. The impact of the digital age—and reliance on the internet—is simply too strong.”

Clear opportunity

There’s plenty more to be considered. But the sheer magnitude and complexity of the economics of cybercrime–in particular how we insure against losses–and the expansive but not yet outcome-bound cybersecurity solution landscape are all strong indicators of opportunity. A marketplace for offerings with tightly aligned incentives will advance both the insurance and cybersecurity industries in a meaningful way, raising the standard of cybersecurity posture at large and, in turn, alleviating some of this public and private market pressure.

Publish your Twitter archive in seconds

1 minute read

A simple, elegant application from Darius Kazemi (GitHub, Website) that runs locally via your web browser and takes as input:

  • The URL where you plan to publish the archive
  • The ZIP file downloaded from Twitter in response to your archive request

The output is a static, searchable archive of your public Tweets and your replies to those Tweets that’s easy to publish on your own website in no time at all.


Since your Twitter archive contains more than just your public Tweets, the tool implements sensible and privacy-preserving defaults:

NO uploading happens in this entire process. Your data stays with you, on the computer where you are running this. When you attach your archive it stays on your computer, running in a local browser app.

The archive that it spits out will contain your original, public tweets, and any replies that you made to tweets you have authored (so, threads that you made).

The archive that it spits out will contain no DMs or even any public replies to other people’s content (these public replies lack context and I didn’t want to open that ethical can of worms here).

The archive will not contain any private media such as photos you sent via DM.

The archive will not contain “circles”, “moments”, “co-tweets”, “spaces”, or any other non-basic or non-public tweet content.

The source is available on GitHub: https://github.com/dariusk/twitter-archiver.

Incidents: An organizational Swiss Army knife

2 minute read

Incidents may be one of the best measures of maturity, effectiveness, and progress in any highly operational environment, including but not limited to security operations and technology operations (including site reliability engineering, or SRE). Beyond measurement, incident management done right is an invaluable tool that you can point at virtually any problem- or failure-prone system to make it better.

What you can learn from your incidents

If you have defined incident severity levels coupled with the most basic incident management practices–tracking and classifying incidents, and handling them with some consistency–your incidents will quickly become an invaluable way to learn and measure:

  • what’s going wrong
  • where it’s going wrong
  • how often it’s going wrong

If you’re doing more mature incident management, and in particular if you’re performing post-incident analysis, your incident data should help you understand:

  • why things are going wrong (root causes)
  • where things are going wrong repeatedly (patterns or hot spots)
  • signs that something is about to go wrong (leading indicators)
  • how adept you are at responding when things go wrong (performance)
  • whether you’re learning from and continuously improving based on all of the above (trending)

These are overly simplistic, but very representative of the things you can expect to learn from your incidents.

Continuous improvement through incident management

Start small, and report regularly on as many aspects of your incidents as you can. If all that you know is 1) that an incident occurred and 2) how to classify it based on defined severity levels, then you can start to report on this information, driving transparency and discussion. Make your first goal simply being able to account for incidents and to share high-level data with your team.
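
As a minimal sketch of that first reporting goal, assume an incident log where each record carries only an opened date and a severity level (the field names, severity labels, and dates below are hypothetical):

```python
# Minimal sketch: summarize a basic incident log by severity and by month.
# Field names, severity labels, and dates are hypothetical.
from collections import Counter
from datetime import date

incidents = [
    {"opened": date(2023, 1, 12), "severity": "SEV2"},
    {"opened": date(2023, 1, 30), "severity": "SEV3"},
    {"opened": date(2023, 2, 4), "severity": "SEV1"},
    {"opened": date(2023, 2, 17), "severity": "SEV3"},
]

by_severity = Counter(i["severity"] for i in incidents)
by_month = Counter(i["opened"].strftime("%Y-%m") for i in incidents)

print("Incidents by severity:", dict(by_severity))
print("Incidents by month:", dict(by_month))
```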

Over time, you can mature your overall incident management practices. As you begin to perform more frequent and more thorough post-incident analysis, you can do things like:

  • capture playbooks to make your response to classes of incidents more repeatable
  • set goals to address areas of importance, ranging from things like improving your ability to observe system state and detect incidents in the first place to improving the performance of your response teams
  • evaluate trends and set thresholds for different levels of escalation and prioritization

Using incidents to build resilience (in creative ways)

One neat thing about incidents is that they can and will be defined based on the types of things that you care about controlling. Some common types of incidents are security incidents (e.g., intrusions or insider threats) and operational incidents (e.g., outages or system degradation). In global organizations like airlines, incident management may involve detecting and responding to anything ranging from personnel availability to severe weather to geopolitical events.

Beyond those common cases, if you’re experiencing issues related to cost, you may declare an incident when certain cost-driving events, such as wild auto-scaling or excessive data ingestion, take place. If your business depends heavily on in-person presence, you may declare incidents based on weather events, a global pandemic, and more.
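
A minimal sketch of that cost example, assuming you already track a daily ingestion metric and have picked a threshold; the metric, threshold, and declare_incident stub are hypothetical, and in practice this would open a ticket or page someone via your incident tooling:

```python
# Minimal sketch: declare a cost incident when a tracked metric crosses a
# threshold. The metric, threshold, and declare_incident stub are hypothetical.
DAILY_INGEST_THRESHOLD_GB = 500.0

def declare_incident(title: str, severity: str) -> None:
    # Stub: replace with a call into your incident management tooling.
    print(f"[{severity}] Incident declared: {title}")

def check_ingest(observed_gb: float) -> None:
    if observed_gb > DAILY_INGEST_THRESHOLD_GB:
        declare_incident(
            f"Daily ingestion of {observed_gb:.0f} GB exceeded {DAILY_INGEST_THRESHOLD_GB:.0f} GB",
            severity="SEV3",
        )

check_ingest(780.0)  # e.g., today's value pulled from billing or SIEM metrics
```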

Because you can be flexible in how you define incidents and their severity while still being consistent in how they’re handled, organizations with great incident management practices will build valuable muscle for identifying and clearly defining critical events of all types, then leverage their incident-related systems and processes to develop organizational resilience.

Visibility, observability, detection, and mitigation in cybersecurity

2 minute read

The concepts of visibility, observability, detection, and mitigation are foundational to cybersecurity–security architecture and detection engineering in particular–and technology operations in general. They’re useful for communicating at almost every level, within technical teams but also to organizational peers and leadership.

I’ve found myself thinking about and explaining the relationship between them time and again, specifically as a model for how we enable and mature security operations.


Visibility

Visibility is knowing that an asset exists.

Of these terms, visibility is the lowest common denominator. For assets that are visible and that we’re responsible for protecting, we then want to consider gaining observability.

Observability

Observability indicates that we’re collecting some useful combination of activity, state, or outputs from a given system.

Practically speaking, it means that we’re collecting system, network, application, and other relevant logs, and ideally storing them in a central location such as a log aggregator or SIEM.
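
As one small illustration, Python’s standard logging module can forward application events to a syslog collector; the collector address below is a placeholder, and most SIEMs and log aggregators accept syslog or provide their own agents:

```python
# Minimal sketch: forward application log events to a central collector over
# syslog using the standard library. The collector address is a placeholder.
import logging
import logging.handlers

logger = logging.getLogger("app")
logger.setLevel(logging.INFO)
logger.addHandler(logging.handlers.SysLogHandler(address=("127.0.0.1", 514)))  # your collector here

# Structured-ish events are easier to write detections against later.
logger.info("user=%s action=%s result=%s", "alice", "login", "success")
```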

We can’t detect what we can’t see, so observability is a prerequisite for detection. And while it may not be required for all mitigations (e.g., you may purchase a point solution that mitigates a particular condition but provides no observability), there’s a strong argument to be made that observability is foundational to any ongoing security operations function, of which mitigation activities are a key part.

Lastly, observability as a metric may be the most important of all. As a security leader, I can think of few metrics that are more useful for gauging whether assets or attack surface are manageable–simply knowing about an asset means little if you have no insight into where, how, and by whom it’s being used.
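
The metric itself can be as simple as the share of visible assets that are actually reporting telemetry. A minimal sketch, with hypothetical asset sets standing in for your inventory and your SIEM or log platform:

```python
# Minimal sketch: observability coverage as the percentage of visible assets
# that are actually reporting telemetry. Both sets are hypothetical.
visible_assets = {"web-01", "web-02", "db-01", "laptop-42", "laptop-43"}
reporting_assets = {"web-01", "web-02", "db-01"}

coverage = len(visible_assets & reporting_assets) / len(visible_assets)
print(f"Observability coverage: {coverage:.0%}")
print("Visible but not observed:", sorted(visible_assets - reporting_assets))
```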

Read more in-depth thoughts on observability here.

Detection

Detection is the act of applying analytics or intelligence to observable events for the purpose of identifying things of interest. In the context of cybersecurity, we detect threats.

Detection happens in a number of ways (this is not an exhaustive list, and there are absolutely relationships between these):

  • Turnkey or out-of-the-box detection, typically in the form of an alert from a security product
  • Detection engineering, a process whose output is analytics that make detection repeatable, measurable, and scalable
  • Threat hunting, a process where the output is investigative leads that must be run to ground (and should feed detection engineering)
  • More . . .

There are plenty of ways to measure detection. We have outcome-based measures like mean time to detect (MTTD), respond, and recover (together, the MTTRs). And we have internal or maturity measures, like detection coverage.
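
To make the idea of applying analytics to observable events concrete, here’s a minimal sketch of a single detection analytic over centralized authentication events (flag a burst of failed logins for one account), along with a toy time-to-detect calculation. The event shapes, threshold, and timestamps are hypothetical:

```python
# Minimal sketch: flag a burst of failed logins for a single account, then
# compute a toy time-to-detect. Event shapes and thresholds are hypothetical.
from collections import defaultdict
from datetime import datetime, timedelta

# Ten failed logins for one account, five seconds apart.
events = [
    {"ts": datetime(2023, 3, 1, 9, 0, s), "user": "alice", "action": "login_failed"}
    for s in range(0, 50, 5)
]

FAILURE_THRESHOLD = 5          # fire after this many failures...
WINDOW = timedelta(minutes=1)  # ...within this window

recent_failures = defaultdict(list)  # user -> timestamps of recent failures
detections = []

for event in sorted(events, key=lambda e: e["ts"]):
    if event["action"] != "login_failed":
        continue
    window = [t for t in recent_failures[event["user"]] if event["ts"] - t <= WINDOW]
    window.append(event["ts"])
    recent_failures[event["user"]] = window
    if len(window) >= FAILURE_THRESHOLD:
        detections.append(
            {"user": event["user"], "first_failure": window[0], "detected_at": event["ts"]}
        )
        recent_failures[event["user"]] = []  # reset so one burst yields one alert

for d in detections:
    time_to_detect = d["detected_at"] - d["first_failure"]
    print(f"Possible brute force against {d['user']} (time to detect: {time_to_detect})")
```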

Mitigation

Mitigation is easiest to think of as a harm reduction measure taken in response to a detected threat.

Mitigations can be preventative, detective, or response measures. Mitigation in the form of early-stage prevention, such as patching a software vulnerability or wholesale blocking of undesirable data, is ideal but not always possible. There are plenty of cases where the best you can do is respond faster or otherwise contain damage.

In practice, mitigation is an exercise in looking at the things that happen leading up to and following threat detection:

  • Where did it come from?
  • What did it do?
  • What’s its next move?

And then figuring out how and at what points you’re able to disrupt the threat, in whole or in part.
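
A minimal sketch of walking through those questions for a detected threat, with an illustrative (not prescriptive) mapping of each stage to candidate disruption points:

```python
# Minimal sketch: for a detected threat, walk each stage and list candidate
# disruption points. The stages and options are illustrative, not prescriptive;
# a real version would be driven by your own playbooks.
detection = {
    "threat": "phishing-delivered credential theft",
    "where_it_came_from": "inbound email with a credential-harvesting link",
    "what_it_did": "user submitted credentials to an external site",
    "next_move": "attacker logs in with the stolen credentials",
}

disruption_points = {
    "where_it_came_from": ["block the sending domain", "strip or rewrite suspicious links"],
    "what_it_did": ["block the harvesting site at the DNS or proxy layer"],
    "next_move": ["reset the credentials", "revoke active sessions", "require MFA re-enrollment"],
}

print(f"Threat: {detection['threat']}")
for stage, options in disruption_points.items():
    print(f"- {detection[stage]}")
    for option in options:
        print(f"    candidate mitigation: {option}")
```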