[PATCH v2 01/24] EDAC, mc: Fix grain_bits calculation

From: Robert Richter
Date: Mon Jun 24 2019 - 11:09:10 EST


The grain in edac is defined as "minimum granularity for an error
report, in bytes". The following calculation of the grain_bits in
edac_mc is wrong:

grain_bits = fls_long(e->grain) + 1;

Where grain_bits is defined as:

grain = 1 << grain_bits

Example:

grain = 8 # 64 bit (8 bytes)
grain_bits = fls_long(8) + 1
grain_bits = 4 + 1 = 5

grain = 1 << grain_bits
grain = 1 << 5 = 32
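
The overshoot is not specific to grain = 8. Below is a minimal user-space sketch, assuming fls_long() returns the 1-based position of the most significant set bit and 0 for an input of 0. The names fls_long_sketch(), old_grain_bits() and check_old_formula_overshoots() are hypothetical stand-ins for illustration, not kernel functions:

```c
#include <assert.h>

/*
 * Hypothetical user-space stand-in for the kernel's fls_long():
 * 1-based index of the most significant set bit, 0 if no bit is set.
 */
static unsigned int fls_long_sketch(unsigned long x)
{
	unsigned int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* The calculation being replaced: fls_long(grain) + 1 */
static unsigned int old_grain_bits(unsigned long grain)
{
	return fls_long_sketch(grain) + 1;
}

/* For a power-of-two grain, the old formula decodes to 4 * grain. */
static void check_old_formula_overshoots(void)
{
	unsigned long grain;

	for (grain = 1; grain <= 4096; grain <<= 1)
		assert((1UL << old_grain_bits(grain)) == 4 * grain);
}
```

For every power-of-two grain in the checked range, the decoded value is off by a factor of four, which matches the 8 -> 32 example above.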

Replace it with the correct calculation:

grain_bits = fls_long(e->grain - 1);

The example now gives:

grain_bits = fls_long(8 - 1)
grain_bits = fls_long(7)
grain_bits = 3

grain = 1 << 3 = 8

Note: We need to check that the hardware reports a reasonable grain
!= 0, and otherwise fall back to 1 byte granularity with a
WARN_ON_ONCE().
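
The reason for the zero check can be seen in a small model. If grain is an unsigned long (an assumption of this sketch), a grain of 0 makes grain - 1 wrap around to the maximum value, and fls_long() of that is the full word width. The following user-space sketch uses hypothetical helper names (fls_long_model(), new_grain_bits(), check_new_formula_round_trips()) and leaves out the one-time warning:

```c
#include <assert.h>

/* Hypothetical user-space model of fls_long(): 1-based MSB position, 0 for 0. */
static unsigned int fls_long_model(unsigned long x)
{
	unsigned int pos = 0;

	for (; x; x >>= 1)
		pos++;
	return pos;
}

/*
 * Corrected calculation with the zero-grain fallback. The kernel
 * patch also emits a one-time warning; this sketch only substitutes
 * a grain of 1.
 */
static unsigned int new_grain_bits(unsigned long grain)
{
	if (!grain)
		grain = 1;
	return fls_long_model(grain - 1);
}

/* Power-of-two grains round-trip exactly through 1 << grain_bits. */
static void check_new_formula_round_trips(void)
{
	unsigned long grain;

	for (grain = 1; grain <= 4096; grain <<= 1)
		assert((1UL << new_grain_bits(grain)) == grain);

	/* A zero grain falls back to 1 byte, i.e. grain_bits == 0. */
	assert(new_grain_bits(0) == 0);
}
```

A grain that is not a power of two rounds up to the next one: in this model new_grain_bits(24) is 5, which decodes to 32.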

Signed-off-by: Robert Richter <rrichter@xxxxxxxxxxx>
---
drivers/edac/edac_mc.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 64922c8fa7e3..45cac74ab833 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -1235,9 +1235,15 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 	if (p > e->location)
 		*(p - 1) = '\0';
 
-	/* Report the error via the trace interface */
-	grain_bits = fls_long(e->grain) + 1;
+	/*
+	 * We expect the hw to report a reasonable grain, fallback to
+	 * 1 byte granularity otherwise.
+	 */
+	if (WARN_ON_ONCE(!e->grain))
+		e->grain = 1;
+	grain_bits = fls_long(e->grain - 1);
 
+	/* Report the error via the trace interface */
 	if (IS_ENABLED(CONFIG_RAS))
 		trace_mc_event(type, e->msg, e->label, e->error_count,
 			       mci->mc_idx, e->top_layer, e->mid_layer,
--
2.20.1