RE: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

From: Luck, Tony
Date: Fri May 30 2014 - 19:03:38 EST


>> For memory error location, I will utilize type offset to save one
>> more byte. Furthermore, I want to drop requestor_id, responder_id
>> and target_id. 1) They are very rare (I've never seen them so far)
>
> My concern is, are we sure we're never going to need them at all? Tony,
> what's your take on this?

They may seem rare because our BIOS doesn't bother to provide them.
Other BIOS writers may be more diligent.

I flip-flop on the issue of how much detail to log. For the majority
of users it is enough to just point at the DIMM. That's the thing that
they can easily replace.

But OEMs and large-scale users often want to know every tiny detail
so they can look for patterns between errors reported across a large
fleet. So I hate to drop information on the floor that might be useful
to someone later.

All of this stuff only applies to server systems - so quibbling over
a handful of *bytes* in an error record on a system that has tens,
hundreds or even thousands of *gigabytes* of memory seems
a bit pointless.

-Tony