Re: [PATCH v2 4/7] ghes_edac: avoid multiple calls to dmi_walk()

From: Kani, Toshimitsu
Date: Fri Aug 04 2017 - 17:02:45 EST


On Fri, 2017-08-04 at 06:05 +0200, Borislav Petkov wrote:
> On Thu, Aug 03, 2017 at 03:57:50PM -0600, Toshi Kani wrote:
> > ghes_edac_register() is called for each GHES platform device
> > instantiated per a GHES entry in ACPI HEST table.ÂÂdmi_walk()
> > counts the number of DIMMs on the system, and there is no need
> > to call it multiple times.
> >
> > Change ghes_edac_register() to call dmi_walk() only when
> > 'num_dimm' is uninitialized.
> >
> > Signed-off-by: Toshi Kani <toshi.kani@xxxxxxx>
> > Suggested-by: Borislav Petkov <bp@xxxxxxxxx>
> > Cc: Borislav Petkov <bp@xxxxxxxxx>
> > Cc: Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>
> > ---
> > Âdrivers/edac/ghes_edac.c |ÂÂÂÂ9 ++++++---
> > Â1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> > index 4e61a62..2e9ce9c 100644
> > --- a/drivers/edac/ghes_edac.c
> > +++ b/drivers/edac/ghes_edac.c
> > @@ -407,15 +407,18 @@
> > EXPORT_SYMBOL_GPL(ghes_edac_report_mem_error);
> > Â
> > Âint ghes_edac_register(struct ghes *ghes, struct device *dev)
> > Â{
> > - bool fake = false;
> > - int rc, num_dimm = 0;
> > Â struct mem_ctl_info *mci;
> > Â struct edac_mc_layer layers[1];
> > Â struct ghes_edac_pvt *pvt;
> > Â struct ghes_edac_dimm_fill dimm_fill;
> > + int rc;
> > +
> > + static int num_dimm;
> > + static bool fake;
> > Â
> > Â /* Get the number of DIMMs */
> > - dmi_walk(ghes_edac_count_dimms, &num_dimm);
> > + if (num_dimm == 0)
> > + dmi_walk(ghes_edac_count_dimms, &num_dimm);
>
> So the problem is that ghes_edac_register() gets called multiple
> times depending on how many GHES platform devices are on the system.
> But yet they all scan *all* DIMMs. So instead you should return if
> the DIMMs have been counted already and not register a second time.
>
> Which makes that whole mc counting kinda useless. So you could rip
> that out too.
>
> Unless I'm missing something...

GHES platform devices correspond to GHES entries, which define firmware
interfaces to report generic memory errors to the OS, such as NMI and
SCI. These devices are associated with all DIMMs, not a particular
DIMM.

Thanks,
-Toshi