RE: [PATCH] RAS: Add a tracepoint for reporting memory controller events

From: Luck, Tony
Date: Thu May 31 2012 - 13:02:23 EST


>> u8 shift = MCI_MISC_ADDR_LSB(m->misc);
>> m->addr >>= shift;
>> m->addr <<= shift;
>
> That's 64 bytes max, IIRC.

No, it's a 6-bit field used as a shift ... so if it has value "6", it means
cache line granularity. Value "12" would mean 4K granularity. Architecturally
it could say "30" to mean gigabyte, or even "63" to mean "everything is gone".
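
To put numbers on that, here is a small stand-alone sketch (user-space C, not
kernel code) of how the field decodes. The macro is a local copy written to
mirror the kernel's MCI_MISC_ADDR_LSB; the helper name lsb_to_granularity is
made up for illustration.

```c
#include <stdint.h>

/* Low 6 bits of MCi_MISC: how many low-order address bits are not valid. */
#define MCI_MISC_ADDR_LSB(m)	((m) & 0x3f)

/*
 * Hypothetical helper: size in bytes of the region the reported
 * address identifies. The field maxes out at 63, so the shift is
 * always well defined for a 64-bit value.
 */
static uint64_t lsb_to_granularity(uint64_t misc)
{
	return 1ULL << MCI_MISC_ADDR_LSB(misc);
}
```

So a value of 6 decodes to 64 bytes, 12 to 4096 bytes and 30 to 1 GiB.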

>> while a few (IIRC patrol scrub) will report with page (4K)
>> granularity. Linux doesn't really care - they all have to get rounded
>> up to page size because we can't take away just one cache line from a
>> process.
>
> I'd like to see that :-)

Patrol scrub works inside the depths of the memory controller on rank/row
addresses, not on system physical addresses. When it finds a problem, a
reverse translation is needed to be able to report a system physical
address in MCi_ADDR. Getting all the bits right is apparently a hard thing
to do, so the MCI_MISC_ADDR_LSB bits are used to indicate that some low
order bits are not valid.
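
A rough sketch of what that means for a consumer of the address, again as
stand-alone C rather than the actual kernel code. The function names and the
4K PAGE_SHIFT are assumptions for illustration; the masking is the same
shift-right/shift-left pair quoted at the top of this mail.

```c
#include <stdint.h>

#define MCI_MISC_ADDR_LSB(m)	((m) & 0x3f)
#define PAGE_SHIFT		12	/* assumption: 4K pages */

/* Clear the low-order bits that MCi_MISC says are not valid. */
static uint64_t mce_valid_addr(uint64_t addr, uint64_t misc)
{
	unsigned int shift = MCI_MISC_ADDR_LSB(misc);

	return (addr >> shift) << shift;
}

/*
 * Hypothetical helper: first page frame covered by the report.
 * Cache-line and page granularity reports both land on one page.
 */
static uint64_t mce_affected_pfn(uint64_t addr, uint64_t misc)
{
	return mce_valid_addr(addr, misc) >> PAGE_SHIFT;
}
```

With a shift of 6 or 12 the same page frame comes out either way. If the
field ever said something larger than 12, more than one page would be
affected and this sketch only returns the first of them.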

-Tony