Why Probability of Detection (PoD) is a Flawed Metric for Evaluating Continuous Monitoring Systems
By: Alex MacGregor
TL;DR:
PoD Origin and Use: Probability of detection curves are an accepted metric for evaluating the detection capabilities of methane measurement technologies; they express the likelihood of detecting a leak of a given rate and are developed from controlled release tests.
Controlled Releases versus Real Leaks: Controlled release tests are typically short in duration, whereas real leaks are persistent; as a result, PoD curves developed from controlled release tests do not accurately reflect real-world leak scenarios.
PoD and Continuous Monitoring: For a continuous monitoring system, the minimum detection limit (MDL) is equivalent to its 90% PoD for persistent leaks.
Industry Needs Better Metrics for Evaluating CM: Industry should develop and adopt new metrics, such as observability, for evaluating continuous monitoring systems.
Regulatory Alignment: The EPA's OOOOb rule correctly emphasizes minimum detection limit (MDL) over PoD for continuous monitors, focusing on measuring leak severity relative to background methane levels over time.
I first encountered probability of detection (PoD) curves as a metric for evaluating methane detection capabilities in 2018 through Arvind Ravikumar's paper "Good vs Good Enough." This paper assessed the effectiveness of optical gas imaging (OGI) cameras in real-world conditions, using PoD curves to quantify performance. While reading the paper, I started thinking about how PoD would apply to continuous emissions monitors and had a difficult time wrapping my head around why it is the industry standard for evaluating the detection capabilities of continuous monitoring systems.
For those who aren’t aware, a PoD curve measures a technology’s ability to detect methane releases of different rates, expressed as the probability of a successful detection for a given observation. It is developed by performing controlled methane releases at various rates. Successes and failures in detection are recorded, and a logistic regression is applied to generate the PoD curve, which characterizes the likelihood of successfully observing a leak at a given rate. Figure 1 shows an example of a probability of detection curve: the black dots on the 0 and 1 lines show missed and detected releases respectively, while the blue curve shows the probability of detecting a release at a given rate.
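To make the construction concrete, here is a minimal sketch of how a PoD curve might be fit from controlled release results. The release rates, detection outcomes, and the choice to regress against log(rate) are all illustrative assumptions on my part, not a description of any specific test program:

```python
# Minimal sketch of fitting a PoD curve from controlled-release results.
# The release rates and detection outcomes below are made up for illustration;
# real test programs use many more releases and often differ in fitting choices.
import numpy as np
from sklearn.linear_model import LogisticRegression

rates = np.array([0.1, 0.2, 0.3, 0.5, 0.5, 0.8, 1.0, 1.0, 1.5, 2.0, 2.5, 3.0])  # kg/h
detected = np.array([0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1])  # 1 = detection reported

# Logistic regression on log(rate) produces the familiar S-shaped PoD curve
model = LogisticRegression().fit(np.log(rates).reshape(-1, 1), detected)

# Evaluate the curve over a range of rates and locate the 90% PoD rate, if reached
grid = np.linspace(0.1, 5.0, 500)
pod = model.predict_proba(np.log(grid).reshape(-1, 1))[:, 1]
above_90 = np.where(pod >= 0.9)[0]
if above_90.size:
    print(f"Estimated 90% PoD: ~{grid[above_90[0]]:.2f} kg/h")
else:
    print("90% PoD not reached below 5 kg/h with this toy data")
```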
The PoD metric makes a lot of sense for characterizing the performance of intermittent methane measurement technologies. It is especially important to understand a technology’s detection capabilities if the technology is only making a single observation over a short duration on an infrequent basis. For example, consider a company operating under regulations requiring four surveys per year. If flares are unlit 3-5% of the time, and each survey occurs over a single day, the probability of observing an unlit flare during a quarterly survey is roughly 0.054%[1]. Hence, a high PoD is critical for intermittent technologies, as infrequent data collection increases the chance of missing leaks.
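As a rough reconstruction of that figure (my own reading, not necessarily the calculation behind the footnote): if you treat it as the chance that, at any given moment in the quarter, the flare is unlit and the single survey day happens to be underway, you arrive at a number of the same order:

```python
# Rough reconstruction of the ~0.054% figure (assumed interpretation, not the
# footnote's exact calculation): probability that, at a random moment in a
# quarter, the flare is unlit AND the one-day quarterly survey is underway.
unlit_fraction = 0.05        # flare unlit ~5% of the time (upper end of 3-5%)
survey_coverage = 1 / 91.25  # one survey day out of a ~91-day quarter
print(f"{unlit_fraction * survey_coverage:.4%}")  # ~0.0548%
```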
Using PoD to characterize a continuous monitoring system’s detection performance is more challenging and likely misleading for the following reason: controlled releases used to develop PoD curves are fundamentally different from real leaks. Leaks start at a point in time and persist until they are detected and fixed. Controlled releases usually have a predefined start time and duration, as it is impractical to run long-duration controlled release tests that mimic the nature of leaks. This isn’t a problem for intermittent technologies, where a single observation is made and is considered either successful or not. For continuous monitoring systems, the PoD metric doesn’t properly capture the key benefit of the technology, which is to provide temporal coverage.
Qube has participated in numerous studies characterizing the technology’s probability of detection, and one thing has remained consistent across all of these studies: our PoD has never converged to a single curve or to a single value for a 90% PoD. Figure 2 shows an example of this. The three probability of detection curves were developed as part of a third-party blind test, where methane was released from a point source at rates ranging from 0.1 to 1.38 kg/h. Each release lasted 40 minutes, and concentric rings of Qube’s devices were placed 50, 75, and 100 meters from the release point.
As you can see from the figure, we developed three probability of detection curves from the same set of releases, which is confusing since PoD is meant to be a statistical measure of a technology’s detection capabilities. It also begs the question: what is Qube’s probability of detection?
While I can’t answer this question by looking at the curves above, the question does have an answer: for a long enough leak, a continuous monitoring system’s PoD collapses to its minimum detection limit, because a persistent leak above the MDL will eventually be detected. Given that we make roughly 25 million measurements per year at an average facility, it makes sense that the PoD of any individual measurement becomes less important.
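A simple way to see this, under the simplifying assumption that individual observations are independent (which real, wind-driven observations are not), is to look at how per-observation detection probabilities compound over a persistent leak:

```python
# Sketch: cumulative probability that a persistent leak is detected at least once,
# given a per-observation PoD p and n observation windows, assuming (unrealistically)
# that observations are independent. Illustrative only.
def cumulative_pod(p_single: float, n_observations: int) -> float:
    return 1 - (1 - p_single) ** n_observations

# Even a 5% per-observation PoD approaches certainty over a day of
# minute-by-minute measurements (1,440 observations).
for n in (1, 10, 60, 1440):
    print(f"n = {n:5d}: cumulative PoD = {cumulative_pod(0.05, n):.4f}")
```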
The probability of detection curve shown above also implies that we will detect leaks of greater than 2.5 kg/h 100% of the time. I understand that many emissions are not persistent: some are short one-off events (compressor blowdowns), some are intermittent but recur (when we deploy our technology, we typically see the largest emissions reductions from these types of events), and others are persistent leaks. However, regardless of the nature of the emission, the PoD curve implies that as long as the emission rate is greater than the technology’s 100% PoD we will always measure it, and this isn’t the case. We will only measure what the system observes and what our algorithm infers, and as such the metric is misleading.
For shorter emissions, determining a CM system’s PoD is more complicated and requires an understanding of wind variability and observability. Observability of a particular leak source refers to whether there is a monitoring device downwind of that leak source at a particular moment in time. We know that not every molecule of gas above the MDL will be detected at the fenceline, but we can still infer the general start and end times and whether an event is ongoing, as long as there is periodic observability during the event. You can think of a continuous monitoring system as a panning security camera: in the same way that a security camera may miss an event when it pans away, a continuous monitoring system may miss portions of emissions when the wind changes direction. However, wind is highly variable, and the GIF in Figure 3 shows an example of this wind variability:
Observability and wind variability, i.e. whether the security camera is pointed in the right direction and how fast it pans back and forth, are what determine how well a CM system can bound and quantify an event. A smart CM system should be able to go back and correct detected emission rates across the full duration of the event, even during periods of low observability (see the Beyond Site Rate blog post). However, the longer and more frequently the plume crosses a device, the more confidence we have in its quantified rate and bounds.
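As a toy illustration of the observability idea (a simplified sketch of my own, not Qube’s actual algorithm): given a time series of wind directions and the bearings of fenceline devices relative to a leak source, you can estimate the fraction of time at least one device sits roughly downwind.

```python
import numpy as np

# Toy observability sketch (not Qube's actual method): a source is "observable"
# at a given moment if some device lies within +/- tol_deg of the downwind
# bearing from the source.
def observability_fraction(wind_dir_deg, device_bearings_deg, tol_deg=20.0):
    wind = np.asarray(wind_dir_deg)[:, None]             # bearing the wind blows toward
    devices = np.asarray(device_bearings_deg)[None, :]   # device bearings from the source
    diff = np.abs((wind - devices + 180) % 360 - 180)    # smallest angular difference
    return float(np.mean((diff <= tol_deg).any(axis=1)))

# Example: four devices around the fenceline, hourly wind directions over a week
rng = np.random.default_rng(0)
wind = rng.normal(loc=200, scale=60, size=24 * 7) % 360  # variable wind favoring ~200 degrees
devices = [0, 90, 180, 270]
print(f"Fraction of hours with a device roughly downwind: {observability_fraction(wind, devices):.0%}")
```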
Continuous emission monitoring technology has evolved significantly in the past few years, but the metrics used to evaluate it have not evolved as quickly. While probability of detection does have some utility, as an industry we need better metrics for evaluating continuous monitoring technology, specifically ones which adequately consider the temporal coverage that these systems provide. In our view, a metric which looks at observability and wind variability is needed, but given that there isn’t a generally accepted metric for these, detection time and MDL are significantly better ways of evaluating continuous monitoring systems (I’ll talk about the why in my next blog post).
As regulations continue to evolve, it’s important that regulators have an informed view of what performance metrics mean for different styles of measurement. In my opinion, the EPA got it right with the OOOOb rule for continuous monitors. The rule requires a minimum detection limit of 0.4 kg/h rather than a probability of detection. The focus is on measuring whether a leak is severe relative to the background levels of methane at a particular facility. Putting the new OOOOb rule into context with short-duration controlled releases, the 7-day action level for a non-wellhead-only facility is 21 kg/h. How important is it that we can pick out and separate a 20-minute-long trial release at 2.5 kg/h when we are quantifying the 21 kg/h over the full 7-day rolling average period? To put it in perspective, that trial release would amount to only 0.83 kg versus the 3,528 kg required for the 7-day action level.
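The arithmetic behind that comparison is straightforward:

```python
# Arithmetic behind the comparison above
trial_release_kg = 2.5 * (20 / 60)  # 2.5 kg/h for 20 minutes -> ~0.83 kg
action_level_kg = 21 * 24 * 7       # 21 kg/h sustained over a 7-day rolling average -> 3,528 kg
print(f"{trial_release_kg:.2f} kg vs {action_level_kg:,} kg")
```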
[1] Plant, Brandt, Fordice, Negron, Schwietzke, Smith, and Zavala-Araiza, “Inefficient and unlit natural gas flares both emit large quantities of methane,” Science (2022).