Re: [PATCH 1/1] EDAC/ghes: Fix for NULL pointer dereference in ghes_edac_register()

From: Borislav Petkov
Date: Wed Aug 26 2020 - 04:52:37 EST


On Tue, Aug 25, 2020 at 02:01:08PM +0100, Shiju Jose wrote:
> After the 'commit b9cae27728d1 ("EDAC/ghes: Scan the system once on driver init")'
> applied, following error has occurred in ghes_edac_register() when
> CONFIG_DEBUG_TEST_DRIVER_REMOVE is enabled. The null ghes_hw.dimms pointer
> in the mci_for_each_dimm() of ghes_edac_register() caused the error.
>
> The error occurs when all the previously initialized ghes instances are
> removed and then probe a new ghes instance. In this case, the ghes_refcount
> would be 0, ghes_hw.dimms and mci already freed. The ghes_hw.dimms would
> be null because ghes_scan_system() would not call enumerate_dimms() again.

Try the below instead and see if it fixes the issue for you too.

If it does, pls send it as v2 but do not add the splat to the commit
message - that's a lot of noise for something which is clear why it
happens and you explain it properly in text anyway.

Thx.

---
diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index da60c29468a7..54ebc8afc6b1 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -55,6 +55,8 @@ static DEFINE_SPINLOCK(ghes_lock);
static bool __read_mostly force_load;
module_param(force_load, bool, 0);

+static bool system_scanned;
+
/* Memory Device - Type 17 of SMBIOS spec */
struct memdev_dmi_entry {
u8 type;
@@ -225,14 +227,12 @@ static void enumerate_dimms(const struct dmi_header *dh, void *arg)

static void ghes_scan_system(void)
{
- static bool scanned;
-
- if (scanned)
+ if (system_scanned)
return;

dmi_walk(enumerate_dimms, &ghes_hw);

- scanned = true;
+ system_scanned = true;
}

void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
@@ -631,6 +631,8 @@ void ghes_edac_unregister(struct ghes *ghes)

mutex_lock(&ghes_reg_mutex);

+ system_scanned = false;
+
if (!refcount_dec_and_test(&ghes_refcount))
goto unlock;


--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette