Re: kexec on rk3399

From: Robin Murphy
Date: Wed Aug 14 2019 - 09:12:31 EST


On 14/08/2019 13:53, Vicente Bergas wrote:
On Monday, July 22, 2019 4:31:27 PM CEST, Vicente Bergas wrote:
Hi, I have been running Linux on rk3399 booted with kexec fine until 5.2.
From 5.2 onwards, there are memory corruption issues as reported here:
http://lkml.iu.edu/hypermail/linux/kernel/1906.2/07211.html
kexec has been identified as the principal reason for the issues.

It turns out that kexec has never worked reliably on this platform,
I was just lucky until recently.

Please, can you provide some directions on how to debug the issue?

Thank you all for your suggestions on where the issue could be.

It seems that it was the USB driver.
Now using v5.2.8 booted with kexec from v5.2.8 with a workaround and
so far so good. It is being tested on the Sapphire board.

The workaround is:
--- a/drivers/usb/dwc3/dwc3-of-simple.c
+++ b/drivers/usb/dwc3/dwc3-of-simple.c
@@ -133,6 +133,13 @@
    return 0;
}

+static void dwc3_of_simple_shutdown(struct platform_device *pdev)
+{
+    struct dwc3_of_simple *simple = platform_get_drvdata(pdev);
+
+    reset_control_assert(simple->resets);
+}
+
static int __maybe_unused dwc3_of_simple_runtime_suspend(struct device *dev)
{
    struct dwc3_of_simple    *simple = dev_get_drvdata(dev);
@@ -190,6 +197,7 @@
static struct platform_driver dwc3_of_simple_driver = {
    .probe        = dwc3_of_simple_probe,
    .remove        = dwc3_of_simple_remove,
+    .shutdown    = dwc3_of_simple_shutdown,
    .driver        = {
        .name    = "dwc3-of-simple",
        .of_match_table = of_dwc3_simple_match,

If this patch is OK after review, I can resubmit it as a pull request.
Should a similar change be applied to drivers/usb/dwc3/core.c ?

This particular change looks like it's implicitly specific to RK3399, which wouldn't be ideal. Presumably if the core dwc3 driver implemented shutdown correctly (echoing parts of dwc3_remove(), I guess) then the glue layers shouldn't need anything special anyway.
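
To illustrate the point about where the shutdown handling should live, here is a minimal userspace sketch, not kernel code. It models a driver core that calls each bound driver's optional shutdown hook before handing over to a new kernel. All model_* names are hypothetical and exist only for this example; the real dwc3 shutdown path would presumably need more than clearing a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device whose controller can do DMA. */
struct model_device {
    const char *name;
    bool dma_active;    /* true while the controller may still write to memory */
};

/* Hypothetical stand-in for a platform driver; the shutdown hook is optional. */
struct model_driver {
    const char *name;
    void (*shutdown)(struct model_device *dev);
};

/* A driver bound to a device. */
struct model_binding {
    const struct model_driver *drv;
    struct model_device *dev;
};

/* The core driver quiesces the controller itself ... */
static void model_core_shutdown(struct model_device *dev)
{
    dev->dma_active = false;
}

static const struct model_driver model_core_driver = {
    .name = "model-core",
    .shutdown = model_core_shutdown,
};

/* ... so the SoC-specific glue layer needs no shutdown hook of its own. */
static const struct model_driver model_glue_driver = {
    .name = "model-glue",
    .shutdown = NULL,
};

/* Models the pre-kexec step: call every shutdown hook that is present. */
static void model_shutdown_all(struct model_binding *bindings, size_t n)
{
    assert(bindings != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (bindings[i].drv->shutdown)
            bindings[i].drv->shutdown(bindings[i].dev);
    }
}
```

In this model, the glue driver stays generic: any platform using the same core driver gets a quiesced controller before the next kernel starts, without a per-SoC workaround.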

Robin.