NVIDIA Warns Its High-End GPUs May Be Vulnerable to Rowhammer Attacks
Slashdot reader BrianFagioli shared this report from Nerds.xyz:
NVIDIA just put out a new security notice, and if you're running one of its powerful GPUs, you might want to pay attention. Researchers from the University of Toronto have shown that Rowhammer attacks, which are already known to affect regular DRAM, can now target GDDR6 memory on NVIDIA's high-end GPUs when ECC [error correction code] is not enabled.
They pulled this off using an A6000 card, and it worked because system-level ECC was turned off. Once it was switched on, the attack no longer worked. That tells you everything you need to know. ECC matters.
Rowhammer has been around for years. It's one of those weird memory bugs where repeatedly accessing one row in RAM can cause bits to flip in a physically adjacent row. Until now, this was mostly a problem for CPU-attached system memory. But this research shows it can also be a GPU problem, and that should make data center admins and workstation users pause for a second.
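To make the "ECC matters" point concrete, here is a toy sketch of how an error-correcting code can undo a single flipped bit of the kind Rowhammer produces. The report does not say which code NVIDIA's GPUs use; the textbook Hamming(7,4) code below is only a stand-in chosen for brevity, and the function names are made up for this illustration.

```python
# Toy illustration only: Hamming(7,4) is an assumption, not NVIDIA's actual ECC scheme.

def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7, parity at 1, 2, 4)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, syndrome); a non-zero syndrome is the 1-based position that was corrected."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    original = [1, 0, 1, 1]
    stored = hamming74_encode(original)
    stored[5] ^= 1  # simulate one disturbance-induced bit flip in memory
    recovered, position = hamming74_decode(stored)
    print("recovered:", recovered, "corrected position:", position)
```

A code like this corrects only one flipped bit per codeword. If several bits flip in the same word, the decoder can return wrong data without noticing, so ECC raises the bar for an attacker but is not an absolute guarantee.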
NVIDIA is not sounding an alarm so much as reminding everyone that protections are already in place, but only if you're using the hardware properly. The company recommends enabling ECC if your GPU supports it. That includes cards in the Blackwell, Hopper, Ada, and Ampere lines, along with others used in DGX, HGX, and Jetson systems. It also includes popular workstation cards like the RTX A6000.
There's also built-in On-Die ECC in certain newer memory types like GDDR7 and HBM3. If you're lucky enough to be using a card that has it, you're automatically protected to some extent, because OD-ECC can't be turned off. It's always working in the background.
But let's be real. A lot of people skip ECC because it can impact performance or because they're running a setup that doesn't make it obvious whether ECC is on or off. If you're not sure where you stand, it's time to check. NVIDIA suggests using tools like nvidia-smi or, if you're in a managed enterprise setup, working with your system's BMC or Redfish APIs to verify settings.
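For the nvidia-smi route, a minimal sketch of such a check might look like the following. It assumes nvidia-smi is on the PATH and that the driver exposes the `ecc.mode.current` and `ecc.mode.pending` query fields. The helper names are hypothetical, and any value other than "Enabled" (such as "Disabled" or "[N/A]" on cards without ECC support) is treated as not protected.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi, in output order.
QUERY_FIELDS = ["index", "name", "ecc.mode.current", "ecc.mode.pending"]


def parse_ecc_report(csv_text):
    """Parse `nvidia-smi --format=csv,noheader` output into one dict per GPU."""
    gpus = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(record) != len(QUERY_FIELDS):
            continue  # skip blank or unexpected lines
        index, name, current, pending = (field.strip() for field in record)
        gpus.append({
            "index": index,
            "name": name,
            "ecc_enabled": current == "Enabled",
            "current": current,
            "pending": pending,  # mode that applies after the next reboot
        })
    return gpus


def query_ecc():
    """Run nvidia-smi and return the parsed ECC status of every visible GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(QUERY_FIELDS), "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_ecc_report(result.stdout)


if __name__ == "__main__":
    try:
        for gpu in query_ecc():
            state = "ECC on" if gpu["ecc_enabled"] else f"ECC NOT on ({gpu['current']})"
            print(f"GPU {gpu['index']} {gpu['name']}: {state}, pending: {gpu['pending']}")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print("could not query nvidia-smi:", exc)
```

On cards that support it, ECC can typically be switched on with `nvidia-smi -e 1` (administrator rights required), and the change takes effect after a reboot. The "pending" column is where that shows up in the meantime.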