Multicloud, and multiSIEM

tl;dr

  • Many companies of even modest size are multicloud
  • Every major cloud platform and security vendor has a SIEM or is building one
  • Because of this, an increasing percentage of companies are now “multiSIEM”
  • MultiSIEM is more of a situation we find ourselves in than a strategy
  • All SIEM-alike solutions—from narrower log aggregators like XDR, to traditional SIEMs, to newer data lake solutions that better balance scale, cost, and analytical capabilities—are all on a collision course

We are all multicloud

Early on, it used to be the case that we talked about multicloud like “build the same app and deploy it to multiple providers.” In practice, virtually no one does this. The reality is more like:

  • We use Google Workspace for email and productivity, and AWS for production.
  • We use Microsoft 365 for email and productivity, and Google Cloud for production.
  • We use AWS WorkMail for . . . kidding, no one uses the AWS productivity stuff outside of AWS
  • Some variant of the above

And an obviously common pattern specific to Microsoft is the Hybrid Cloud, which is a mix of Microsoft cloud and on-premise services.

We aren’t reimplementing the same solution across multiple cloud providers. We’re using the right cloud provider for the right job.

A consequence of this multicloud pattern for cybersecurity practitioners is that we’re often dealing with some overlapping data types (e.g., identity) across cloud providers, but then we’re also dealing with many unique, service-specific data types.

Multicloud? MultiSIEM.

Crude SIEM landscape Above graphic is illustrative, not comprehensive.

A good number of organizations already identified a need to collect data from a wide variety of systems, and so they’ve adopted one of the handful of pure-play SIEM solutions:

  • Splunk
  • QRadar
  • Elastic Security
  • LogRhythm

So, that’s one SIEM.

Then, over time, every major cloud platform provider (and, most major enterprise technology providers) established the pattern of building or acquiring a SIEM, sending all of their native data sources into it, and of course supporting just enough third-party services to make it appealing to some pure-play SIEM customers:

For many organizations, that’s at least two SIEM solutions.

If you already have a pure play SIEM + Microsoft 365 + AWS for production, you might have three SIEMs now, which makes you a true collector.

This is to say nothing of the various log aggregators, like Amazon’s CloudTrail, Security Lake, and numerous other SIEM-adjacent external security services, Google’s Cloud Logging for basic log aggregation, and other solutions like them. These aren’t SIEM per se, but are yet another consideration when it comes to data flows, and thus security investigation workflows.

The cats and dogs of infrastructure

For organizations with on-premise, remote access, or data center infrastructure, they also get to figure out what to do with the massive amount of data generated by these technologies:

  • Palo Alto’s XSIAM
  • Cisco’s Security Analytics and Logging
  • Fortinet’s FortiSIEM

At the end of the day, the unluckiest of organizations have a few or more SIEM-like solutions, depending on how liberally you count.

Don’t forget about XDR

Then, we have XDR platforms, most of which are EDR or EPP solutions that have expanded the scope of data that they can consume, and so “endpoint” is no longer a useful distinction (thus, “extended”).

And this is to say nothing of MDR and other service providers, some of which have their own platform that functions as a SIEM. In these cases, it’s common for selected features to be exposed only to the vendor’s team, while a subset of features are exposed to customers.

From an evolutionary standpoint, XDR platforms are most directly on a collision course with SIEM, which tracks with major platform providers’ roadmaps.

Security platforms and log aggregators on a collision course

Most organizations large, mature, or crazy enough to own a traditional SIEM now have at least two, perhaps more solutions that provide increasingly overlapping functionality. And they have them not because it’s their desire of strategy, but because these various points of aggregation exist and oftentimes the provider’s logs end up in them by default.

There’s a logical desire to collect defensive telemetry and other log data in the fewest number of places, to optimize not only for cost but also for our attention, as we then need to build analytics, perform investigations, and respond to threats. But there’s also a cost to getting data from one place to another, and as we push for more observability, it can quickly run counter to consolidation (e.g., moving all of your Amazon logs to Microsoft or vice versa can incur massive transit costs).

It’s hard to see how this can consolidate in the traditional sense, and the most likely path is that expansive enterprise security teams, including outsourced security operations providers, become adept at minimizing overlap amongst data flows (read: transit and storage costs) and analytics. We’ll figure out how to aggregate the bulk of a given provider’s logs in their native log aggregator, and siphon off a subset to data lakes and/or SIEM platforms based on a combination of cost factors and operational use cases.

Much like we’ve learned to live with multicloud, we’ll learn how to optimize and live with multiSIEM.

Breaches are more like plane crashes than car crashes

The Swiss Cheese Model

This was taken verbatim from LinkedIn post that I wrote in a fit of near-rage after seeing my 1000th vendor claim that human error causes most breaches. There’s more to say about this, and a more constructive way to say it, but perhaps another time.

Saying “most breaches are due to human error” is tired, ill-informed, unhelpful framing. If true, only on a technicality, and wildly misleading.

People make mistakes every day. Very, very few of the mistakes people make that are associated with security incidents are rare, unknown, or exceptional failure modes. They’re mostly things we know and expect to happen, but that we haven’t taken enough care to prevent (including not adding sufficient friction, where we can’t prevent the thing outright).

Got ransomware? Was the initial access vector email, specifically phishing? Was the root cause the user opening the email and falling for the lure? No way. Never. The root cause was the ability to run arbitrary code or software on a system, instead of using application control (mostly free, but admittedly free like a puppy, not like beer). Or maybe the root cause was lack of MFA or a lesser MFA implementation that isn’t sufficiently phishing resistant.

Breaches are more like plane crashes than car crashes: They don’t happen in an instant. A number of things have to go wrong, and a number of opportunities to avoid or lessen the impact have to be missed.

Implying that one of these things is primarily to blame mischaracterizes the problem, and shows a general lack both understanding of how breaches occur, and of basic systems thinking.

A simple framework for talking to leadership about a cybersecurity program, particularly for an inbound security leader looking to level-set:

  1. Calibrate (ideally visually) to the state of the program today: Existing level of investment, key controls and functions, etc.
  2. Highlight prevailing risks, and your best estimate of the cost associated with each, if realized.
  3. Propose changes: Investments you want to make, to address risks in #2, leading to . . .
  4. Provide a projected future state of the program: What would the state of the program look like if investments from #3 are made? How would prevailing risks change?

It’s oversimplified, and that’s intentional. You can add depth or show work in any area, if it’s attainable. But for a lot of organizations, this is still a lot of work and an acceptable level of rigor.

August 29, 2024

Where do your incidents come from?

How do you identify your highest severity incidents in the first place? Which data sources or products deliver the investigative leads? And which are critical to detection and response? These questions speak directly to the value of data sources, products, and functions.

What are the prevailing root causes? This speaks directly to the quality of incident management, particularly post-incident review.

I can’t stress enough the importance of being able to answer these questions as the leader of any operational team (cybersecurity, technology, or otherwise).

August 28, 2024

SaaS attack technique matrix Permalink

Inspired by MITRE ATT&CK, the good folks at Push Security have taken a pass at enumerating attack techniques specific to Software-as-a-Service (SaaS) applications.

ATT&CK is a great and robust framework, and I love to see it adapted to capture techniques and tactics for different types of systems.

Push Security SaaS Attacks example