Re: [PATCH v2 13/24] EDAC, ghes: Add support for legacy API counters
From: Robert Richter
Date: Fri Aug 30 2019 - 05:35:40 EST
On 16.08.19 11:55:59, Borislav Petkov wrote:
> On Mon, Jun 24, 2019 at 03:09:22PM +0000, Robert Richter wrote:
> > The ghes driver is not able yet to count legacy API counters in sysfs,
> > e.g.:
> >
> > /sys/devices/system/edac/mc/mc0/csrow2/ce_count
> > /sys/devices/system/edac/mc/mc0/csrow2/ch0_ce_count
> > /sys/devices/system/edac/mc/mc0/csrow2/ch1_ce_count
> >
> > Make counting csrows/channels generic so that the ghes driver can use
> > it too.
>
> What for?
The same was asked here:

 https://lore.kernel.org/patchwork/patch/1080277/

Actually it is a fix for the counters exposed by the legacy API for
the ghes driver. Counters are broken (set to zero). The ghes driver is
the only one where errors are reported using edac_raw_mc_handle_error()
instead of edac_mc_handle_error(), so the legacy counting done in
edac_mc_handle_error() never runs for it. The fix is to move the error
counting to edac_raw_mc_handle_error() where the other counters are
incremented.
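
To illustrate why the legacy counters stay at zero for a driver that
calls the raw handler directly, here is a toy model. It is only a
sketch and not the kernel code: struct toy_mci and all toy_*()
functions are made up for illustration, and the real EDAC structures
and signatures differ.

```c
/* Toy model only: all names below are hypothetical, not kernel API. */
struct toy_mci {
	unsigned int ce_mc;          /* controller-wide CE counter */
	unsigned int csrow_ce[4];    /* models csrowN/ce_count */
	unsigned int chan_ce[4][2];  /* models csrowN/chM_ce_count */
};

/* Before: the raw handler only bumps the controller-wide counter. */
void toy_raw_handle_error_old(struct toy_mci *mci)
{
	mci->ce_mc++;
}

/* Before: legacy csrow/channel counting lives in the wrapper only. */
void toy_handle_error_old(struct toy_mci *mci, int row, int chan)
{
	if (row >= 0 && row < 4) {
		mci->csrow_ce[row]++;
		if (chan >= 0 && chan < 2)
			mci->chan_ce[row][chan]++;
	}
	toy_raw_handle_error_old(mci);
}

/* After: the raw handler does all counting, so every caller gets it. */
void toy_raw_handle_error_new(struct toy_mci *mci, int row, int chan)
{
	mci->ce_mc++;
	if (row >= 0 && row < 4) {
		mci->csrow_ce[row]++;
		if (chan >= 0 && chan < 2)
			mci->chan_ce[row][chan]++;
	}
}

/* After: the wrapper just forwards to the raw handler. */
void toy_handle_error_new(struct toy_mci *mci, int row, int chan)
{
	toy_raw_handle_error_new(mci, row, chan);
}
```

A caller that only uses toy_raw_handle_error_old() sees ce_mc go up
while csrow_ce[] and chan_ce[][] stay at zero. With the *_new()
variants both entry points update the same set of counters.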
All distributions that I have checked enable the legacy API option
(CONFIG_EDAC_LEGACY_SYSFS=y) and the interface cannot be disabled for
individual drivers. As long as the counters are exposed, their values
should be correct. See all options discussed in the thread from v1.
> ghes_edac enumerates the DIMMs from SMBIOS - it doesn't need chip
> selects and ranks. Those are used when you can't count the DIMMs
> properly...
Right, but that is also true for other drivers (actually all other
drivers since DIMMs are used now). The legacy API is there to support
older tools that deal with */csrow*/ch* instead of */dimm* in sysfs.
-Robert