Re: DRAM unreliable under specific access pattern

From: Mark Seaborn
Date: Sun Dec 28 2014 - 17:48:55 EST


On 24 December 2014 at 15:41, Pavel Machek <pavel@xxxxxx> wrote:
> > Try this test program: https://github.com/mseaborn/rowhammer-test
> >
> > It has reproduced bit flips on various machines.
...
> So we have a program that corrupts basically random memory on many
> machines. That is not good. That means that an unprivileged user can
> crash processes of other users.
...
> We could make DRAM refresh faster. That will incur performance
> penalty (<10%?), and is probably chipset-specific...?

Some machines already double the DRAM refresh rate in some cases.

For example, a presentation from Intel says:

"When non-pTRR compliant DIMMs are used, the E5-2600 v2 system
defaults into double refresh mode, which has longer memory
latency/DIMM access latency and can lower memory bandwidth by up to
2-4%.
...
* DDR3 DIMMs are affected by a pass gate charge migration issue (also
known as Row Hammer) that may result in a memory error.
* The Pseudo Target Row Refresh (pTRR) feature introduced on Ivy
Bridge processor families (2S/4S E5 v2, E7 v2) helps mitigate the
DDR3 pass gate issue by automatically refreshing victim rows."

-- from http://infobazy.gda.pl/2014/pliki/prezentacje/d2s2e4-Kaczmarski-Optymalna.pdf
("Thoughts on Intel Xeon E5-2600 v2 Product Family Performance
Optimisation - component selection guidelines", August 2014, Marcin
Kaczmarski)

Note that Target Row Refresh (TRR) is a DRAM feature that was added to
the recently-published LPDDR4 standard (where "LP" = "Low Power").
See http://www.jedec.org/standards-documents/results/jesd209-4
(registration is required to download the spec, but it's free). TRR
is basically a request that the CPU's memory controller can send to a
DRAM module to ask it to refresh a row's neighbours. I am not sure
how Pseudo TRR differs from TRR, though.

That presentation mentions one CPU (or CPU family), but I don't know
which other CPUs support these features (i.e. doubling the refresh
rate and/or using pTRR). Even if a CPU supports these features, it is
difficult to determine whether a machine's BIOS enables them. It is
the BIOS's responsibility to configure the CPU's memory controller at
startup.

Also, it is not clear how much doubling the DRAM refresh rate would
help prevent rowhammer-induced bit flips. Yoongu Kim et al.'s paper
shows that, for some DRAM modules, a refresh period of 32ms (instead
of the usual 64ms) is not short enough to reduce the error rate to
zero. See Figure 4 in
http://users.ece.cmu.edu/~yoonguk/papers/kim-isca14.pdf. I expect
that doubling the refresh rate is useful for reliability, but not
necessarily for security. It would prevent bit flips caused by
unintentional row hammering, where a program happens to generate a lot
of cache misses without using CLFLUSH. But it might not stop a
determined attacker from generating bit flips that could be used to
take control of a system.
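
To put rough numbers on the refresh-period argument, here is a
back-of-the-envelope sketch in C. The 55ns interval between row
activations is an assumed round figure, not a measurement from any
particular machine, and 139000 is the smallest access count that Kim
et al. report as inducing an error. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on the number of row activations that fit into one
 * refresh window.  Both arguments are in nanoseconds.  Ignores time
 * lost to the refresh commands themselves. */
static uint64_t max_activations(uint64_t refresh_period_ns,
                                uint64_t activation_interval_ns)
{
    assert(activation_interval_ns > 0);
    return refresh_period_ns / activation_interval_ns;
}

/* Returns 1 if `threshold` activations can fit into a single refresh
 * window under the assumptions above, 0 otherwise. */
static int window_allows_hammering(uint64_t refresh_period_ns,
                                   uint64_t activation_interval_ns,
                                   uint64_t threshold)
{
    return max_activations(refresh_period_ns, activation_interval_ns)
           >= threshold;
}
```

With these assumed numbers, max_activations(64000000, 55) is about 1.16
million and max_activations(32000000, 55) is about 580 thousand, both
several times larger than 139000. Under the same assumptions the
refresh period would have to drop below roughly 7.6ms before the
threshold no longer fits in one window. This is only an upper-bound
estimate; it says nothing about how a given module actually behaves.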

Cheers,
Mark