Re: [PATCH v1] PM: sleep: core: Synchronize runtime PM status of parents and children
From: Rafael J. Wysocki
Date: Fri Feb 07 2025 - 10:43:34 EST
On Fri, Feb 7, 2025 at 3:45 PM Johan Hovold <johan@xxxxxxxxxx> wrote:
>
> On Fri, Feb 07, 2025 at 02:50:29PM +0100, Johan Hovold wrote:
>
> > Yeah, I hit something like this yesterday as well and did confirm that
> > reverting this commit makes the problem go away. Haven't had time to dig
> > much further.
> >
> > [ 110.522368] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>
> > [ 110.855238] Call trace:
> > [ 110.857861] simple_pm_bus_runtime_suspend+0x14/0x48 (P)
> > [ 110.863425] pm_generic_runtime_suspend+0x2c/0x44
> > [ 110.868362] pm_runtime_force_suspend+0x54/0x100
> > [ 110.873217] dpm_run_callback+0xb4/0x228
> > [ 110.877347] device_suspend_noirq+0x70/0x2a8
> > [ 110.881844] dpm_noirq_suspend_devices+0xe0/0x230
> > [ 110.886778] dpm_suspend_noirq+0x24/0x98
> > [ 110.890904] suspend_devices_and_enter+0x368/0x678
> > [ 110.895941] pm_suspend+0x1b4/0x348
> > [ 110.899627] state_store+0x8c/0xfc
> > [ 110.903228] kobj_attr_store+0x18/0x2c
> > [ 110.907195] sysfs_kf_write+0x4c/0x78
> > [ 110.911074] kernfs_fop_write_iter+0x120/0x1b4
> > [ 110.915735] vfs_write+0x2ac/0x358
> > [ 110.919352] ksys_write+0x68/0xfc
> > [ 110.922873] __arm64_sys_write+0x1c/0x28
> > [ 110.927002] invoke_syscall+0x48/0x110
> > [ 110.930969] el0_svc_common.constprop.0+0x40/0xe0
> > [ 110.935907] do_el0_svc+0x1c/0x28
> > [ 110.939427] el0_svc+0x48/0x114
> > [ 110.942769] el0t_64_sync_handler+0xc8/0xcc
> > [ 110.947180] el0t_64_sync+0x198/0x19c
> > [ 110.951059] Code: a9be7bfd 910003fd a90153f3 f9403c00 (f9400014)
> > [ 110.957428] ---[ end trace 0000000000000000 ]---
>
> Ok, so the driver data is never set and runtime PM is never enabled for
> this simple bus device, which uses pm_runtime_force_suspend() for system
> sleep.

This is kind of confusing. Why use pm_runtime_force_suspend() if
runtime PM is never enabled and cannot really work?

> This used to work as the runtime PM state was left at 'suspended', which
> makes pm_runtime_force_suspend() return early, but now we can end up
> with a call to the driver runtime PM ops that dereference the NULL
> driver data.

Thanks for the info!

pm_runtime_force_suspend() is a known weak point, but I had assumed
that it wouldn't be involved in dependency chains starting at devices
with DPM_FLAG_SMART_SUSPEND set.

Well, more work ...
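
For illustration only, the failure mode described above can be modelled
in plain userspace C. This is a rough sketch, not the kernel code: every
toy_* name is hypothetical, the status field stands in for the runtime PM
status, and the callback reports TOY_WOULD_OOPS at the point where a real
callback would dereference the NULL driver data.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model; all toy_* names are hypothetical, not kernel APIs. */
enum toy_rpm_status { TOY_RPM_ACTIVE, TOY_RPM_SUSPENDED };
enum toy_result { TOY_SKIPPED, TOY_SUSPENDED_OK, TOY_WOULD_OOPS };

struct toy_bus {
	int num_clks;
};

struct toy_device {
	enum toy_rpm_status status;
	struct toy_bus *drvdata;	/* NULL if probe never set driver data */
};

/* Stands in for a runtime-suspend callback that reads its driver data. */
enum toy_result toy_bus_runtime_suspend(struct toy_device *dev)
{
	struct toy_bus *bus = dev->drvdata;

	if (bus == NULL)
		return TOY_WOULD_OOPS;	/* a real callback would crash here */

	(void)bus->num_clks;		/* e.g. disable this many clocks */
	return TOY_SUSPENDED_OK;
}

/* Stands in for the forced suspend done during system sleep. */
enum toy_result toy_force_suspend(struct toy_device *dev)
{
	enum toy_result ret;

	assert(dev != NULL);

	/* Early return: this is what hid the missing driver data before. */
	if (dev->status == TOY_RPM_SUSPENDED)
		return TOY_SKIPPED;

	ret = toy_bus_runtime_suspend(dev);
	if (ret == TOY_SUSPENDED_OK)
		dev->status = TOY_RPM_SUSPENDED;
	return ret;
}
```

In this model, a device left at TOY_RPM_SUSPENDED never reaches the
callback, so a NULL drvdata goes unnoticed. Once something marks the same
device TOY_RPM_ACTIVE, the callback runs and hits the NULL pointer.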