RE: [PATCH] RAS: Add a tracepoint for reporting memory controller events

From: Luck, Tony
Date: Wed May 30 2012 - 19:24:43 EST


> u32 grain; /* granularity of reported error in bytes */
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

>> dimm->grain = nr_pages << PAGE_SHIFT;

I'm not at all sure what we'll see digging into the chipset registers
like EDAC does - but we do have different granularity when reporting
via machine check banks. That's why we have this code:

	/*
	 * Mask the reported address by the reported granularity.
	 */
	if (mce_ser && (m->status & MCI_STATUS_MISCV)) {
		u8 shift = MCI_MISC_ADDR_LSB(m->misc);
		m->addr >>= shift;
		m->addr <<= shift;
	}

in mce_read_aux(). In practice right now I think that many errors will
report with cache line granularity, while a few (IIRC patrol scrub) will
report with page (4K) granularity. Linux doesn't really care - they all
have to get rounded up to page size because we can't take away just one
cache line from a process.
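
For illustration, here is a minimal user-space sketch of the point above,
not kernel code. The helper names mask_to_granularity() and addr_to_pfn()
and the EXAMPLE_PAGE_SHIFT constant are hypothetical, and 4K pages are
assumed. It shows that an address reported with cache line (64 byte)
granularity and one reported with page granularity land in the same page
frame, which is the unit that would have to be taken away.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SHIFT 12	/* assumption: 4K pages */

/*
 * Clear the low 'lsb' bits of the address, the same effect as the
 * shift-right/shift-left pair in the snippet quoted above.
 */
static uint64_t mask_to_granularity(uint64_t addr, unsigned int lsb)
{
	assert(lsb < 64);	/* shifting a 64-bit value by >= 64 is undefined */
	return (addr >> lsb) << lsb;
}

/* Page frame number containing the address. */
static uint64_t addr_to_pfn(uint64_t addr)
{
	return addr >> EXAMPLE_PAGE_SHIFT;
}
```

With these helpers, mask_to_granularity(0x12345678, 6) gives 0x12345640 and
mask_to_granularity(0x12345678, 12) gives 0x12345000; addr_to_pfn() returns
0x12345 for both.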

> @Tony: Can you ensure us that, on Intel memory controllers, the address
> mask remains constant at module's lifetime, or are there any events that
> may change it (memory hot-plug, mirror mode changes, interleaving
> reconfiguration, ...)?

I could see different controllers (or even different channels) having
different setup if you have a system with different size/speed/#ranks
DIMMs ... most systems today allow almost arbitrary mix & match, and the
BIOS will decide which interleave modes are possible based on what it
finds in the slots. Mirroring imposes more constraints, so you will
see fewer crazy options. Hot plug for Linux reduces to just the hot add
case (as we still don't have a good way to remove DIMM sized chunks of
memory) ... so I don't see any clever reconfiguration possibilities
there (when you add memory, all the existing memory had better stay
where it is, preserving contents). Perhaps the only option where things
might change radically is socket migration ... where the constraint is
only that the target of the migration have at least as much memory as the source. So
you might move from some weird configuration with mixed DIMM sizes and
thus no interleave, to a homogeneous socket with matched DIMMs and full
interleave. But from an EDAC level, this is a new controller on a new
socket ... not a changed configuration on an existing socket.

-Tony
