[PATCH v2 10/13] EDAC/altera: Skip the panic notifier if kdump is loaded

From: Guilherme G. Piccoli
Date: Tue Jul 19 2022 - 15:58:33 EST


The altera_edac panic notifier performs some data collection with
regards errors detected; such code relies in the regmap layer to
perform reads/writes, so the code is abstracted and there is some
risk level to execute that, since the panic path runs in atomic
context, with interrupts/preemption and secondary CPUs disabled.

Users want the information collected in this panic notifier though,
so in order to balance the risk/benefit, let's skip the altera panic
notifier if kdump is loaded. While at it, remove a useless header
and encompass a macro inside the sole ifdef block it is used.

Cc: Dinh Nguyen <dinguyen@xxxxxxxxxx>
Cc: Petr Mladek <pmladek@xxxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>

---

V2:
- new patch, based on the discussion in [0].
[0] https://lore.kernel.org/lkml/62a63fc2-346f-f375-043a-fa21385279df@xxxxxxxxxx/

drivers/edac/altera_edac.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/edac/altera_edac.c b/drivers/edac/altera_edac.c
index e7e8e624a436..741fe5539154 100644
--- a/drivers/edac/altera_edac.c
+++ b/drivers/edac/altera_edac.c
@@ -16,7 +16,6 @@
#include <linux/kernel.h>
#include <linux/mfd/altera-sysmgr.h>
#include <linux/mfd/syscon.h>
-#include <linux/notifier.h>
#include <linux/of_address.h>
#include <linux/of_irq.h>
#include <linux/of_platform.h>
@@ -24,6 +23,7 @@
#include <linux/platform_device.h>
#include <linux/regmap.h>
#include <linux/types.h>
+#include <linux/kexec.h>
#include <linux/uaccess.h>

#include "altera_edac.h"
@@ -2063,22 +2063,30 @@ static const struct irq_domain_ops a10_eccmgr_ic_ops = {
};

/************** Stratix 10 EDAC Double Bit Error Handler ************/
-#define to_a10edac(p, m) container_of(p, struct altr_arria10_edac, m)
-
#ifdef CONFIG_64BIT
/* panic routine issues reboot on non-zero panic_timeout */
extern int panic_timeout;

+#define to_a10edac(p, m) container_of(p, struct altr_arria10_edac, m)
+
/*
* The double bit error is handled through SError which is fatal. This is
* called as a panic notifier to printout ECC error info as part of the panic.
+ *
+ * Notice that if kdump is set, we take the risk avoidance approach and
+ * skip the notifier, given that users are expected to have access to a
+ * full vmcore.
*/
static int s10_edac_dberr_handler(struct notifier_block *this,
unsigned long event, void *ptr)
{
- struct altr_arria10_edac *edac = to_a10edac(this, panic_notifier);
+ struct altr_arria10_edac *edac;
int err_addr, dberror;

+ if (kexec_crash_loaded())
+ return NOTIFY_DONE;
+
+ edac = to_a10edac(this, panic_notifier);
regmap_read(edac->ecc_mgr_map, S10_SYSMGR_ECC_INTSTAT_DERR_OFST,
&dberror);
regmap_write(edac->ecc_mgr_map, S10_SYSMGR_UE_VAL_OFST, dberror);
--
2.37.1