Don’t Let Vendor Exuberance Distract from the Value of the MITRE ATT&CK Evaluation

Josh Zelonis
The Recovering Analyst
4 min read · Apr 23, 2021


The MITRE ATT&CK Evaluation is published as a scientific data set that lets you interpret how specific products perform, including their detection strategy and efficacy. A consequence of this is that it also opens the door for marketers to run wild with claims of having “won” the evaluation.

Two and a half years ago, MITRE published the first Enterprise ATT&CK Evaluation to what felt like a quiet reception. The market was in desperate need of a fair and transparent efficacy evaluation, and frankly, I was looking for one too. As an analyst, I was responsible for EDR Wave evaluations to help companies with vendor selection, but was frustrated by my inability to perform efficacy testing as part of my Waves.

And then I opened Pandora’s Box…

In my initial blog on the ATT&CK Evaluation, I outlined a scoring system for understanding vendor performance based on the scales I frequently used in Forrester Waves, and published accompanying source code so people could easily create their own scales and weightings. Within minutes, graphics were circulating and Twitter was abuzz. I had created a monster by giving marketers the ability to market the evaluation, which I hope has in some way contributed to the continued adoption and success of the project… in spite of everything that goes with it.

Understanding Detection Strategy and Efficacy Using the MITRE ATT&CK Evaluation

Since that initial attempt at generating a “simple score,” I’ve spent the last few years preaching that there are two critical metrics for understanding a product’s detection strategy, namely visibility and analytics (I’ve called them by different names over the years, but the spirit has always been the same):

Visibility is a high-water mark for detection as it relates to collection. It measures the telemetry a product generates, which can be hunted against and which analytics can run on to produce automated detections.

The Analytics score is a measure of the product’s automated detection capability, specifically, how well it accurately identifies “behaviors” abstracted from the raw telemetry or log data it’s been sent.
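If it helps to see the two metrics in code, here is a minimal sketch of how the counts might be tallied from evaluation results. The detection-category names follow the evaluation’s published categories, but the input shape (a list of category strings per substep) is my assumption for illustration, not MITRE’s schema.

```python
# Minimal sketch: tallying visibility and analytics from ATT&CK Evaluation
# substep results. Input structure is assumed: one list of detection-category
# strings per substep (an empty list or ["None"] means the substep was missed).

# Categories that only show the activity was recorded.
TELEMETRY_CATEGORIES = {"Telemetry"}
# Categories that reflect an automated, abstracted detection.
ANALYTIC_CATEGORIES = {"General", "Tactic", "Technique"}

def score_substeps(substeps: list[list[str]]) -> tuple[int, int, int]:
    """Return (total_substeps, visibility_count, analytics_count)."""
    total = len(substeps)
    visibility = 0
    analytics = 0
    for categories in substeps:
        cats = set(categories)
        has_analytic = bool(cats & ANALYTIC_CATEGORIES)
        has_telemetry = bool(cats & TELEMETRY_CATEGORIES)
        # Visibility is the high-water mark: any evidence the substep was seen.
        if has_analytic or has_telemetry:
            visibility += 1
        if has_analytic:
            analytics += 1
    return total, visibility, analytics
```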

This year is the first time that MITRE has released these metrics, but there’s a critical difference between the way I’ve presented them in the past and the way MITRE calculates them that I’d like to address.

Orthogonality Is Critical if You Want a Deep-Dive Understanding of Analytic Capabilities

Orthogonality is a fancy math term which essentially means that two data sets don’t have dependencies on each other. One of my primary concerns in evaluating these products is understanding not just how well they did in the evaluation, but what the vendor’s strategy for the product is. For the two metrics to be truly orthogonal, the analytics score must be the ratio of analytic detections to visibility, not the ratio of analytic detections to total substeps.

When the metrics are orthogonal, we can think about detection as the function for the slope of a line (y = mx): a product team can see exactly where their product is losing points in the scoring and specifically target improving analytics capabilities or visibility to maximize the return on their development dollars. Similarly, an IR team may focus on solutions with high visibility without as much need for analytics, because they already know “bad” has happened and are actively looking for it, while an enterprise organization may prioritize analytic detections to feed their SOC workflows.

Interestingly, because visibility and analytics aren’t orthogonal the way MITRE has calculated them, MITRE’s analytics number becomes a rough approximation of the product’s overall automated detection capability as a single metric. Rule of thumb: if the metrics are orthogonal, you need both; otherwise, just use analytics to understand automated detection.
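To make the distinction concrete, here is a small sketch that builds on the counts above and computes both versions of the analytics ratio. The numbers in the example are made up, and the function name is mine, not from any published tooling.

```python
# Minimal sketch of the two ways to compute an analytics score, reusing the
# counts from score_substeps() above. Only the orthogonal form lets you read
# detection as y = mx: (analytics per visible substep) * (visibility).

def detection_ratios(total: int, visibility: int, analytics: int) -> dict:
    visibility_ratio = visibility / total if total else 0.0
    # Orthogonal: analytics measured only against what the product could see.
    analytics_orthogonal = analytics / visibility if visibility else 0.0
    # MITRE-style: analytics measured against every substep, so it already
    # folds visibility in and approximates overall automated detection.
    analytics_overall = analytics / total if total else 0.0
    return {
        "visibility": visibility_ratio,
        "analytics_orthogonal": analytics_orthogonal,
        "analytics_overall": analytics_overall,
        # y = mx: slope (orthogonal analytics) times x (visibility ratio)
        # recovers the overall automated-detection rate.
        "reconstructed_overall": analytics_orthogonal * visibility_ratio,
    }

# Example with made-up numbers: 160 substeps, 120 visible, 90 analytic detections.
print(detection_ratios(160, 120, 90))
```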

This Year MITRE Started Shifting Left

With the evaluation this year, roughly half the vendors also participated in a separate part of the evaluation that was focused on protection capabilities. This is an important evolution in the evaluation because when you invest in these products you’re not just buying an EDR solution, but a combined EDR+EPP solution. Detection is only half the story.

Here’s The Big Picture On How To Think About These Metrics

When an attack initially occurs, you can think of it as both unmitigated and unknown. Your first opportunity to deflect it comes from your protection capabilities, which can both mitigate the attack and notify you of it. If those protection capabilities fail, your second opportunity comes as an analytic detection within your environment: you’re alerted, and the attack can now be thought of as unmitigated but known. Finally, if an attack also makes it past your automated detection capabilities, you’ve entered a period of “dwell time” in which an unmitigated, unknown attack has occurred, and your ability to threat hunt your way to the attack is what determines whether you show up as an internal or an external detection data point in the next M-Trends report. As discussed above, there is some nuance to saying that visibility as a metric is limited to threat hunting, but this is generally how you’ll experience it, and any vendor that prioritizes visibility in their messaging isn’t talking about the detection capabilities that will keep you out of that dreaded dwell time.

[Figure: How protection, detection, and finally threat hunting contribute, in that order, to stopping the attack lifecycle.]
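As a purely illustrative sketch (the rates below are hypothetical inputs, not evaluation outputs), here is how those three opportunities compose into a simple funnel, which is one way to reason about which metric matters most for your organization.

```python
# Illustrative sketch only: treating protection, analytics, and threat hunting
# as a defensive funnel. All rates are hypothetical and supplied by the caller.

def attack_funnel(protection_rate: float, analytics_rate: float,
                  hunt_success_rate: float) -> dict:
    """Estimate where attacks land: mitigated, detected, hunted, or dwelling."""
    mitigated = protection_rate
    unmitigated = 1.0 - mitigated
    # Unmitigated but known: caught by automated (analytic) detection.
    detected = unmitigated * analytics_rate
    # Unmitigated and unknown: dwell time begins; hunting over collected
    # telemetry is the last chance at an internal detection.
    dwelling = unmitigated * (1.0 - analytics_rate)
    hunted = dwelling * hunt_success_rate
    external = dwelling - hunted
    return {
        "mitigated_by_protection": mitigated,
        "detected_by_analytics": detected,
        "found_by_hunting": hunted,
        "external_detection_risk": external,
    }

# Hypothetical product: blocks 70% of attacks, detects 60% of what gets
# through, and the SOC hunts down half of what remains.
print(attack_funnel(0.70, 0.60, 0.50))
```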

The real value of this evaluation is that the winners and losers are left for you to decide. I hope this guide is helpful in illustrating how to prioritize the different metrics being referenced in the market, which in turn helps you select the best solution for your organization. Each year I’ve published source code for generating an analysis of these products, making it easier for you to customize your own analysis. You may find this year’s code base here.


Josh Zelonis is a Director of Security Strategy for Palo Alto Networks, a former Forrester analyst and cybersecurity tech founder.