RE: [PATCH 2/2] EDAC/ghes: Modularize ghes_edac driver to remove the dependency on ghes
From: Justin He
Date: Tue Aug 16 2022 - 02:58:22 EST
Hi Borislav & Toshi
Please see my questions below:
> -----Original Message-----
> From: Borislav Petkov <bp@xxxxxxxxx>
> Sent: Friday, August 12, 2022 10:46 PM
> To: Justin He <Justin.He@xxxxxxx>
> Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>; Len Brown <lenb@xxxxxxxxxx>; James
> Morse <James.Morse@xxxxxxx>; Tony Luck <tony.luck@xxxxxxxxx>; Mauro
> Carvalho Chehab <mchehab@xxxxxxxxxx>; Robert Richter <rric@xxxxxxxxxx>;
> Robert Moore <robert.moore@xxxxxxxxx>; linux-acpi@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linux-edac@xxxxxxxxxxxxxxx; devel@xxxxxxxxxx;
> Rafael J . Wysocki <rafael@xxxxxxxxxx>; Shuai Xue
> <xueshuai@xxxxxxxxxxxxxxxxx>; Jarkko Sakkinen <jarkko@xxxxxxxxxx>;
> linux-efi@xxxxxxxxxxxxxxx; nd <nd@xxxxxxx>; toshi.kani@xxxxxxx;
> stable@xxxxxxxxxx
> Subject: Re: [PATCH 2/2] EDAC/ghes: Modularize ghes_edac driver to remove
> the dependency on ghes
>
> On Thu, Aug 11, 2022 at 09:17:13AM +0000, Jia He wrote:
> > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index
> > d91ad378c00d..ed6519f3d45b 100644
> > --- a/drivers/acpi/apei/ghes.c
> > +++ b/drivers/acpi/apei/ghes.c
> > @@ -94,6 +94,8 @@
> > #define FIX_APEI_GHES_SDEI_CRITICAL __end_of_fixed_addresses
> > #endif
> >
> > +ATOMIC_NOTIFIER_HEAD(ghes_report_chain);
>
> static. You need function wrappers which call the notifier from the module.
>
> Also, why atomic? x86_mce_decoder_chain is a blocking one.
>
> Also, the whole notifier adding thing needs to be a separate patch.
>
> > @@ -1497,3 +1504,37 @@ void __init acpi_ghes_init(void)
> > else
> > pr_info(GHES_PFX "Failed to enable APEI firmware first mode.\n");
> > }
> > +
> > +/*
> > + * Known x86 systems that prefer GHES error reporting:
> > + */
> > +static struct acpi_platform_list plat_list[] = {
> > + {"HPE ", "Server ", 0, ACPI_SIG_FADT, all_versions},
> > + { } /* End */
> > +};
> > +
> > +struct list_head *ghes_get_devices(bool force) {
> > + int idx = -1;
> > +
> > + if (IS_ENABLED(CONFIG_X86)) {
> > + idx = acpi_match_platform_list(plat_list);
> > + if (idx < 0 && !force)
> > + return NULL;
> > + }
> > +
> > + return &ghes_devs;
> > +}
> > +EXPORT_SYMBOL_GPL(ghes_get_devices);
>
> And yes, as Toshi points out, the other EDAC drivers - sb_edac, skx_edac and
> amd64_edac - should call this function in their init functions so that it can get
> selected which driver to load on HPE server platforms.
>
I assume that all those edac drivers which used owner checking are
impacted, right? So the impacted list should be:
drivers/edac/pnd2_edac.c:1532: if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
drivers/edac/sb_edac.c:3513: if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
drivers/edac/amd64_edac.c:4333: if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
drivers/edac/i10nm_base.c:552: if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
drivers/edac/skx_base.c:657: if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
drivers/edac/igen6_edac.c:1275: if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
Furthermore, should I totally remove the owner check in above driver
XX_init()? Because after the new helper ghes_get_device() checking, those
drivers can't be initialized after ghes_edac is loaded. If ghes_edac is NOT
loaded, the edac_mc_owner check in edac_mc_add_mc_with_groups() can guarantee
that there is only one regular edac driver.
What do you think of it?
--
Cheers,
Justin (Jia He)