Optimizing Reliability: Step 3 – Focus on Root Cause Analysis by Eliminating False Positives

Author: Jon Herlocker


The previous steps in this blog series described how physics-based modeling with a digital twin can help you identify faulty sensors and reduce the number of new sensors you install. Now let’s take a look at how these same principles can be used to filter out the “noise” of alerts that aren’t helpful.

One of the key goals of any asset management team is to look for root causes of systemic problems that can impact the performance, reliability, and safe utilization of the assets they oversee. What often gets in the way of this analysis is the proliferation of “false positive” alerts, where the monitoring system thinks it has detected a problem, but it’s either not a problem that has business impact, or it’s not actually a problem at all.

Finding efficient ways to monitor your systems means, in part, rooting out these false positives so you can stay focused on the system behaviors that matter most for your purposes.

Let’s return to the cooling tower in the Step 1 example. As we discussed, your goal is to monitor power consumption and make sure it stays within a reasonable operating range. Now let’s say you need to perform this monitoring across several cooling towers, located in different buildings. The towers at each building perform similar functions using similar systems, as shown below.

Equipping the buildings with sensors is one option for gathering data and analyzing for root cause. But sensor data is typically a barrage of numbers, only a few of which might be pertinent for your purposes. Machine learning and rules-based programming of your monitoring solution can help identify statistical anomalies in these numbers, but the resulting alarms and alerts are often irrelevant to the actual business conditions you’re monitoring for—in this case, variances in power consumption.

Applying the same logic we used in the first example, and repeating it across multiple facilities, identifies these anomalies in a simple, meaningful way. Tracking the cooling tower approach (the difference between the temperature of the cooled water leaving the tower and the ambient wet-bulb temperature, which reflects both air temperature and humidity) at each tower, and using physics to compute the power consumption at each one, gives you a basis for singling out anomalous readings that are actually pertinent to your business needs. In this example, an atypical reading at building 3 indicates an issue.
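To make the comparison concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the modeling used in a production digital twin: the field names, the sample readings, the 25% tolerance, and the choice of "power per unit of heat rejected" as the comparison metric are all hypothetical. The only physics it relies on is the standard heat balance for water (heat rejected = mass flow × specific heat × temperature drop).

```python
from dataclasses import dataclass
from statistics import median

WATER_CP_KJ_PER_KG_K = 4.186  # specific heat of liquid water


@dataclass
class TowerReading:
    """One snapshot from one tower. All field names are hypothetical."""
    building: str
    water_in_c: float    # warm water entering the tower
    water_out_c: float   # cooled water leaving the tower
    wet_bulb_c: float    # ambient wet-bulb temperature
    flow_kg_s: float     # water mass flow
    power_kw: float      # measured power consumption


def approach_c(r: TowerReading) -> float:
    """Cooling tower approach: leaving water temperature minus wet-bulb."""
    return r.water_out_c - r.wet_bulb_c


def heat_rejected_kw(r: TowerReading) -> float:
    """Heat removed from the water, from the basic heat balance."""
    return r.flow_kg_s * WATER_CP_KJ_PER_KG_K * (r.water_in_c - r.water_out_c)


def specific_power(r: TowerReading) -> float:
    """Power consumed per kW of heat rejected (assumed comparison metric)."""
    heat = heat_rejected_kw(r)
    if heat <= 0:
        raise ValueError(f"{r.building}: no heat rejected, cannot normalize")
    return r.power_kw / heat


def flag_outliers(readings: list[TowerReading], tolerance: float = 0.25) -> list[str]:
    """Return buildings whose specific power deviates from the fleet median
    by more than `tolerance` (a fraction; 0.25 is an arbitrary example)."""
    values = {r.building: specific_power(r) for r in readings}
    baseline = median(values.values())
    return [b for b, v in values.items() if abs(v - baseline) / baseline > tolerance]


# Made-up sample data for four similar towers.
readings = [
    TowerReading("Building 1", 35.0, 29.5, 25.0, 50.0, 15.0),
    TowerReading("Building 2", 35.2, 29.4, 24.8, 48.0, 15.5),
    TowerReading("Building 3", 35.0, 31.5, 25.0, 50.0, 21.0),
    TowerReading("Building 4", 34.8, 29.2, 24.9, 52.0, 16.0),
]

for r in readings:
    print(f"{r.building}: approach={approach_c(r):.1f} C, "
          f"specific power={specific_power(r):.4f}")
print("Flagged:", flag_outliers(readings))
```

With this sample data, only Building 3 is flagged: its approach is wider and it draws more power for less heat rejected than its peers. Comparing against the fleet median rather than a fixed threshold is one possible design choice; it keeps a single misbehaving tower from skewing the baseline.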

Once again, this approach only works when you create a digital twin and use a physics-based approach to modeling potential error conditions in your environment. Doing so simplifies the monitoring process, filters out the noise of false positives, and gives you a more durable digital foundation for understanding and mapping the processes and priorities you care about day to day.