Re: Bug report: kernel paniced when system hibernates
From: Atish Patra
Date: Thu May 25 2023 - 13:40:14 EST
On Thu, May 25, 2023 at 7:21 AM Conor Dooley <conor.dooley@xxxxxxxxxxxxx> wrote:
>
> On Thu, May 25, 2023 at 07:29:46PM +0530, Anup Patel wrote:
> > On Thu, May 25, 2023 at 7:26 PM Conor Dooley <conor.dooley@xxxxxxxxxxxxx> wrote:
> > >
> > > On Thu, May 25, 2023 at 07:13:11PM +0530, Anup Patel wrote:
> > > > On Thu, May 25, 2023 at 7:08 PM Conor Dooley <conor.dooley@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, May 25, 2023 at 06:51:28PM +0530, Anup Patel wrote:
> > > > >
> > > > > > > We should only rely on this node name for known bad versions of opensbi
> > > > > > > IMO. Going forward, if something needs to be reserved for firmware, the
> > > > > > > firmware should make sure that it is reserved by using the property for
> > > > > > > that purpose :)
> > > > >
> > > > > > There is no issue with OpenSBI since it does the right thing by marking
> > > > > > memory as reserved in the DT. This real issue is with the kernel handling
> > > > > > of reserved memory for hibernate.
> > > > >
> > > > > I don't think we are talking about the same thing here. I meant the
> > > > > no-map property which OpenSBI does not set.
> > > >
> > > > Yes, we are talking about the same thing. It's not just OpenSBI not
> > > > setting no-map property in reserved memory node because other
> > > > SBI implementations would be doing the same thing (i.e. not setting
> > > > no-map property)
> > >
> > > Other SBI implementations doing the same thing doesn't make it any more
> > > correct though, right?
> >
> > Like multiple folks suggested, we need DT binding for distinguishing
> > firmware reserved memory from other reserved memory.
>
> And I have agreed with multiple times!
>
> > Until that
> > happens we should either mark hibernate support as experimental
> > or revert it.
>
> That works for me. How about the below?
>
Instead of disabling hibernate support why not revert the patch
3335068 ("riscv: Use PUD/P4D/PGD pages for the linear mapping")
which doesn't add any "measured" value at this point.
However, keeping the hibernation feature on and disabling linear
mapping will get more
testing on hibernation. While disabling hibernation and keeping the above patch
which doesn't have any value at all.
We don't have a regression at this point. So either approach will work though.
If we choose to go this route, some thoughts about the commit message.
> -- >8 --
> From 1d4381290a1600eff9b29b8ace6be73955d9726c Mon Sep 17 00:00:00 2001
> From: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
> Date: Thu, 25 May 2023 15:09:08 +0100
> Subject: [PATCH] RISC-V: mark hibernation as broken
>
> Hibernation support depends on firmware marking its reserved
> regions as not mappable by Linux. As things stand, the de-facto SBI
either not mappable or no save/restore capable (as We still have not
concluded which way we want to go in)
> implementation (OpenSBI) does not do this, and other implementations may
> not do so either, resulting in kernel panics during hibernation ([1],
> [2]).
>
we should probably add more context in the commit message.
How about adding something along these lines:
As things stand, the latest version of de-facto SBI
implementation(OpenSBI) doesn't
do this any more to allow 1G huge page mappings by kernel. Other SBI
implementations are probably
doing the same. Until the commit 3335068 ("riscv: Use PUD/P4D/PGD
pages for the linear mapping"),
the first 2MB region of DRAM (where the typically firmware resides)
was not mappable by kernel. However,
enabling that mapping resulted in the kernel panics during hibernation
([1], [2]) as the hibernation process
tries to save/restore any mapped region even though it is marked as reserved.
> Disable support for hibernation until such time that an SBI
> implementation independent way to communicate what regions are reserved
> has been agreed upon.
>
Anybody who wants to test the hibernation feature must revert the
above mentioned patch along with turning on
the config.
> Reported-by: Song Shuai <suagrfillet@xxxxxxxxx>
> Link: https://lore.kernel.org/all/CAAYs2=gQvkhTeioMmqRDVGjdtNF_vhB+vm_1dHJxPNi75YDQ_Q@xxxxxxxxxxxxxx/ [1]
> Reported-by: JeeHeng Sia <jeeheng.sia@xxxxxxxxxxxxxxxx>
> Link: https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/ITXwaKfA6z8
> Signed-off-by: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
> ---
> arch/riscv/Kconfig | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 13f058490608..b2495192f35a 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -801,7 +801,7 @@ menu "Power management options"
> source "kernel/power/Kconfig"
>
> config ARCH_HIBERNATION_POSSIBLE
> - def_bool y
> + def_bool n
>
> config ARCH_HIBERNATION_HEADER
> def_bool HIBERNATION
> --
> 2.39.2
>
--
Regards,
Atish