Inside the IBM FlashSystem Realtime Ransomware Detection

Background

As the market for cybersecurity products explodes, we’re also beginning to see innovative approaches to detecting threats built into the infrastructure. A storage array, for example, should be able to detect data corruption. After all, storage is where your data lives. The challenge, however, is that a storage array often needs more context to know whether data is good or bad. This is because a storage device simply stores data; it doesn’t try to interpret that data.

IBM accepted the challenge of building ransomware threat detection directly into its FlashSystem storage solutions. The company is taking an innovative approach that promises to detect ransomware and other corrupt data as it’s written to disk, all without needing to understand the contents of that data. IBM is building this functionality directly into its storage devices. And IBM tells us that there’s more coming soon.

News: Detecting Anomalies Using Data Behavior Analysis

Three basic approaches are used to detect ransomware: detect it on the network, detect it using file signatures on servers and PCs, or detect it by looking at the behavior of the data itself. Most threat detection solutions on the market focus on the first two. The challenge with signature-based detection is that, for it to be useful, you must know the signature of every piece of malware. So it’s an ongoing arms race.

IBM’s new anomaly detection capabilities focus on the behavior of the data itself without requiring specific knowledge of malware signatures. IBM uses a mathematical technique known as Shannon Entropy to detect highly random data, such as the encrypted data that ransomware typically writes.

IBM scans data as it arrives inside the FlashSystem, calculating the Shannon Entropy of the data as it enters the write cache. The system alerts the storage administrator through IBM Storage Insights if an anomaly is detected. After that, it’s up to the storage administrator to take corrective action. IBM Storage Insights allows the administrator to configure alert thresholds to fit any given environment.
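To make the idea concrete, here is a minimal Python sketch of a byte-level Shannon Entropy check, where entropy is H = -sum(p_i * log2(p_i)) over the frequency p_i of each byte value. This is an illustration only, not IBM’s implementation: the function names, the 4 KB block size, and the 7.5 bits-per-byte threshold are all assumptions chosen for the example.

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)          # occurrences of each byte value
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical alert threshold for this sketch; not an IBM default.
ENTROPY_THRESHOLD = 7.5

def looks_anomalous(block: bytes, threshold: float = ENTROPY_THRESHOLD) -> bool:
    """Flag a block whose entropy exceeds the configured threshold."""
    return shannon_entropy(block) > threshold

# Repetitive text has low entropy; random bytes stand in for encrypted data.
text_block = b"quarterly sales report " * 200
random_block = os.urandom(4096)

if __name__ == "__main__":
    print(f"text:   {shannon_entropy(text_block):.2f} bits/byte")
    print(f"random: {shannon_entropy(random_block):.2f} bits/byte")
```

Note that compressed data is also close to random, so a real detector would likely need more than a single fixed threshold. That may be one reason configurable alert thresholds matter.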

IBM’s Computational Storage Approach

Scanning data in realtime is computationally expensive. So to keep its ransomware detection from impacting system performance, IBM samples only 1% of data writes. That sample is large enough to detect anomalies, but it could be better.
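A small Python sketch shows why a 1% sample can still be informative. It assumes each write is sampled independently at random, which may not match how IBM actually selects writes. The names `sample_writes` and `estimate_flagged_fraction` are hypothetical, and the per-write flag stands in for any anomaly check, such as an entropy threshold.

```python
import random

def sample_writes(writes, rate=0.01, rng=None):
    """Yield roughly `rate` of the writes, each chosen independently at random."""
    rng = rng or random.Random()
    for write in writes:
        if rng.random() < rate:
            yield write

def estimate_flagged_fraction(writes, is_flagged, rate=0.01, rng=None):
    """Estimate the fraction of flagged writes by inspecting only the sample.

    Returns None if no writes happened to be sampled.
    """
    sampled = 0
    flagged = 0
    for write in sample_writes(writes, rate, rng):
        sampled += 1
        if is_flagged(write):
            flagged += 1
    return flagged / sampled if sampled else None

if __name__ == "__main__":
    # Simulated workload: True marks a write that looks encrypted.
    workload = [False] * 50_000 + [True] * 50_000
    estimate = estimate_flagged_fraction(workload, lambda w: w, rng=random.Random(1))
    print(f"estimated flagged fraction: {estimate:.3f}")
```

With 100,000 writes, a 1% sample inspects about 1,000 of them, which is enough to show a sustained shift toward high-entropy writes. A short burst touching only a few writes could still slip past, which is one way the approach “could be better.”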

IBM’s announcement of its new FlashSystem anomaly detection said that, sometime towards the end of the year, these capabilities will extend down to IBM’s “computational storage flash drives—FlashCore Modules—to bring detection as close to the data as possible, further reducing time to detection.” So this is something IBM has been thinking about for a while.

Last year when IBM introduced its latest third-generation FlashCore Module, I talked to IBM Fellow and FlashSystem CTO Andy Walls about the technology. Andy describes FlashCore as a computational storage device. This means that a FlashCore Module combines NAND flash, DRAM and MRAM for caching, and an astonishing amount of compute to deliver greater functionality than a traditional SSD could. It also takes some of the computationally heavy work, such as compression, from the storage array and performs that work on the drive itself. As a result, this is a very efficient and flexible architecture.

The computational part of this storage comes from the ARM processor cores built into a flexible and reprogrammable onboard FPGA. The primary purpose of this logic today is to manage the module’s QLC flash. Beyond making QLC enterprise-ready, the first generation of FlashCore also focused on compression, one of the most critical attributes of any enterprise storage array. Compression influences efficiency and cost, critical concerns for anyone in IT.

Andy said that while IBM started with compression and continues refining that capability, the goal is to offload and accelerate the storage applications where they make sense. It also promises to enable new and exciting capabilities we haven’t yet seen in a storage array.

Looking forward, a FlashCore Module could help with the problem of managing unstructured data, performing filtering, searching, and scanning at the media level.

In addition, Andy told me the processor could potentially be used to deliver realtime statistics about entropy changes of the data stored on the drive itself. This is exactly what IBM is hinting may come later this year.

Analysis

While IBM is the only storage provider building this level of realtime anomaly detection directly into its storage arrays, data protection and cyber-resiliency features are quickly becoming standard offerings within enterprise storage solutions. Immutable snapshots, for example, are now part of solutions from nearly every top-tier storage vendor. Scanning snapshots for anomalies, using entropy calculations similar to those used by IBM, is also becoming popular.

The challenge with the approaches taken by most storage solutions is that detection often happens after the fact. Once a snapshot is found to be corrupt, it may be too late. You have to work backward in time to find good data to restore. IBM’s approach closes that gap, alerting the user nearly instantly when an anomaly is detected. This happens long before the corrupted data is written to a snapshot. That’s precisely the kind of protection you want in your enterprise.

I continue to be impressed by the innovation from IBM’s storage group. The company continues to offer what is perhaps the most advanced and innovative storage technology in the industry. The new capability of detecting ransomware by analyzing entropy in realtime at the volume level only adds to that innovation.

Disclosure: The author is an industry analyst, and NAND Research an industry analyst firm, that engages in, or has engaged in, research, analysis, and advisory services with many technology companies, which may include those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.