Re: [PATCH] EDAC/altera: Warm Reset option for Stratix10 peripheral DBE

From: Thor Thayer
Date: Tue Jun 04 2019 - 17:54:03 EST


Hi Sudeep,

On 6/4/19 12:38 PM, Sudeep Holla wrote:
On Tue, Jun 04, 2019 at 06:23:15PM +0100, James Morse wrote:
Hi Thor,

(CC: +Mark, Lorenzo and Sudeep for PSCI.
How should SYSTEM_RESET2 be used for a vendor-specific reset?


Initially it was intended to be used by passing the command line argument
"reboot=w" or "reboot=warm", as specified in the kernel documentation [1].

However, it was enhanced and enabled specifically for panic by
commit b287a25a7148 ("panic/reboot: allow specifying reboot_mode for panic only").

IIUC you can now pass "reboot=panic_warm" to set reboot_mode to WARM
only when there's a panic. SYSTEM_RESET2 gets called whenever reboot_mode
is set to WARM/SOFT.
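
For illustration only, the selection described above can be sketched as a
small standalone C function. This is not the kernel's code: the enum and
function names are made up for the example, and the "reset2_supported" flag
stands in for whatever feature probe the PSCI driver performs. The two
function IDs are the SMC32 values from the PSCI specification; the SMC64
variant of SYSTEM_RESET2 uses a different ID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's reboot modes (names only). */
enum reboot_mode_sketch { MODE_COLD, MODE_WARM, MODE_HARD, MODE_SOFT, MODE_GPIO };

/* SMC32 function IDs as defined by the PSCI specification. */
#define PSCI_SYSTEM_RESET   0x84000009u
#define PSCI_SYSTEM_RESET2  0x84000012u

/*
 * Pick the PSCI reset call: SYSTEM_RESET2 for WARM/SOFT when the firmware
 * advertises it, plain SYSTEM_RESET otherwise.
 */
uint32_t pick_psci_reset_fn(enum reboot_mode_sketch mode, bool reset2_supported)
{
	if ((mode == MODE_WARM || mode == MODE_SOFT) && reset2_supported)
		return PSCI_SYSTEM_RESET2;

	return PSCI_SYSTEM_RESET;
}
```

With "reboot=panic_warm" the mode would only become WARM on the panic path,
so a normal reboot would still fall through to SYSTEM_RESET in this sketch.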

Thanks. I missed that SYSTEM_RESET2 had been implemented.

The original patch is:
lore.kernel.org/r/1559594269-10077-1-git-send-email-thor.thayer@xxxxxxxxxxxxxxx
)

On 03/06/2019 21:37, thor.thayer@xxxxxxxxxxxxxxx wrote:
From: Thor Thayer <thor.thayer@xxxxxxxxxxxxxxx>

The Stratix10 peripheral FIFO memories can recover from double
bit errors with a warm reset instead of a cold reset.
Add the option of a warm reset for peripheral (USB, Ethernet)
memories.

CPU memories such as SDRAM and OCRAM require a cold reset for
DBEs.
Filter on whether the error is a SDRAM/OCRAM or a peripheral
FIFO memory to determine which reset to use when the warm
reset option is configured.

... so you want to make different SMC calls on each CPU after panic()?


diff --git a/drivers/edac/altera_edac.c b/drivers/edac/altera_edac.c
index 8816f74a22b4..179601f14b48 100644
--- a/drivers/edac/altera_edac.c
+++ b/drivers/edac/altera_edac.c
@@ -2036,6 +2036,19 @@ static const struct irq_domain_ops a10_eccmgr_ic_ops = {
/* panic routine issues reboot on non-zero panic_timeout */
extern int panic_timeout;

+#ifdef CONFIG_EDAC_ALTERA_ARM64_WARM_RESET
+/* EL3 SMC call to setup CPUs for warm reset */
+void panic_smp_self_stop(void)
+{
+ struct arm_smccc_res result;
+
+ __cpu_disable();
+ cpu_relax();
+ arm_smccc_smc(INTEL_SIP_SMC_ECC_DBE, S10_WARM_RESET_WFI_FLAG,
+ S10_WARM_RESET_WFI_FLAG, 0, 0, 0, 0, 0, &result);

Please use SYSTEM_RESET2, or let us know why it can't be used so that we can
understand the requirement better. There are options to use vendor extensions
with the SYSTEM_RESET2 PSCI command if you really have to. However, mainline
supports only the architectural warm reset.

I need to decide between warm reset and cold reset based on the peripheral
type, but maybe that decision can be made by firmware, as James pointed out.
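
Roughly, the policy from the patch description looks like the sketch below,
wherever it ends up living (kernel or firmware). The enum values and the
function name are hypothetical and only illustrate the filter: peripheral
FIFO memories (USB, Ethernet) may take a warm reset when that option is
configured, while SDRAM and OCRAM always take a cold reset.

```c
#include <stdbool.h>

/* Hypothetical classification of where the double bit error was reported. */
enum dbe_source { DBE_SDRAM, DBE_OCRAM, DBE_USB_FIFO, DBE_ETH_FIFO };

enum reset_kind { RESET_COLD, RESET_WARM };

/*
 * Choose the reset type for a DBE. Anything that is not a known peripheral
 * FIFO falls back to a cold reset, as does everything when the warm reset
 * option is not configured.
 */
enum reset_kind reset_for_dbe(enum dbe_source src, bool warm_option_enabled)
{
	if (!warm_option_enabled)
		return RESET_COLD;

	switch (src) {
	case DBE_USB_FIFO:
	case DBE_ETH_FIFO:
		return RESET_WARM;
	case DBE_SDRAM:
	case DBE_OCRAM:
	default:
		return RESET_COLD;
	}
}
```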

Thanks for the links and the comments!

Thor
--
Regards,
Sudeep

[1] Documentation/admin-guide/kernel-parameters.txt