Re: [PATCH v2 6/7] drm/panfrost: Fix PM usage_count mishandling
From: Adrián Larumbe
Date: Tue Jun 16 2026 - 15:48:55 EST
On 05.06.2026 11:48, Steven Price wrote:
> On 04/06/2026 18:35, Adrián Larumbe wrote:
> > During device probe(), failure to do a PM get() will leave the usage_count
> > set to 0, which is the value assigned at device creation time. That means
> > when the autosuspend delay expires, runtime suspend callback won't be
> > invoked, so the device will remain powered on forever.
> >
> > On top of that, failure to call PM put() during device unplug means
> > Panfrost device's PM usage_count increases monotonically for every new
> > module reload.
> >
> > The combined outcome of both of the above was that devfreq OPP transition
> > notifications would be printed all the time, even when no jobs are being
> > submitted. This quickly fills the kernel ring buffer with junk.
> >
> > Even direr than that was the fact MMU interrupts are only enabled when
> > the device is reset, so after device probe() the very first job targeting
> > the tiler heap BO would always time out, because the driver's PM runtime
> > resume callback would not be invoked.
> >
> > Signed-off-by: Adrián Larumbe <adrian.larumbe@xxxxxxxxxxxxx>
> > Fixes: 635430797d3f ("drm/panfrost: Rework runtime PM initialization")
> > Fixes: 876b15d2c88d ("drm/panfrost: Fix module unload")
> > ---
> > drivers/gpu/drm/panfrost/panfrost_drv.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > index 2d4b6aa95c66..545fbf2c8d0c 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > @@ -989,6 +989,7 @@ static int panfrost_probe(struct platform_device *pdev)
> > pm_runtime_set_active(pfdev->base.dev);
> > pm_runtime_mark_last_busy(pfdev->base.dev);
> > pm_runtime_enable(pfdev->base.dev);
> > + pm_runtime_get_noresume(pfdev->base.dev);
> > pm_runtime_set_autosuspend_delay(pfdev->base.dev, 50); /* ~3 frames */
> > pm_runtime_use_autosuspend(pfdev->base.dev);
> >
> > @@ -1000,10 +1001,12 @@ static int panfrost_probe(struct platform_device *pdev)
> > if (err < 0)
> > goto err_out1;
> >
> > + pm_runtime_put_autosuspend(pfdev->base.dev);
> >
> > return 0;
> >
> > err_out1:
> > + pm_runtime_put_noidle(pfdev->base.dev);
> > pm_runtime_disable(pfdev->base.dev);
> > panfrost_device_fini(pfdev);
>
> Sashiko is concerned that dropping the usage count before
> pm_runtime_disable() could cause things to turn off too early, and I
> have to agree it sounds like it could be a problem:
>
> Sashiko wrote:
> > Does dropping the usage count before pm_runtime_disable() create a race
> > condition where the suspend callback can run and disable clocks before
> > hardware shutdown?
> > Because the usage count is dropped early, a concurrent PM event could trigger
> > the suspend callback, disabling clocks. Then, panfrost_device_fini() calls
> > panfrost_gpu_fini() which writes to MMIO registers. Could writing to
> > unclocked registers on ARM SoCs cause fatal bus errors or panics?
I think this could be an issue if the device were already registered and someone
could drive a PM resume followed by a suspend through an ioctl. But this is an
error path and the device was never made available, so I can't see how that
could happen.

The one case I can think of is if the Panfrost device had child devices: when
one of them did a put autosuspend, the refcnt change would be propagated back to
Panfrost and could then trigger the scenario described by Sashiko.
However, I've just realised it's fine to call pm_runtime_put_noidle() even after
pm_runtime_disable(). It seems the latter only prevents any further suspends or
resumes of the PM device, while we stay in control of the refcnt, so moving
pm_runtime_put_noidle() to right after panfrost_device_fini() should be fine.
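To make the ordering argument concrete, here is a toy userspace model of the
proposed error path. It is not kernel code: every model_* name is made up for
illustration, and it only mirrors the two checks I'm relying on (a suspend
attempt is refused while runtime PM is disabled or the usage count is non-zero,
and put_noidle only touches the count).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the runtime PM state that matters here (hypothetical). */
struct model_dev {
	int usage_count;   /* models the PM usage count */
	int disable_depth; /* 0 means runtime PM is enabled */
	bool suspended;    /* true once the suspend callback has "run" */
};

static void model_get_noresume(struct model_dev *d)
{
	d->usage_count++;
}

/* Drops the count without attempting a suspend; never goes below 0. */
static void model_put_noidle(struct model_dev *d)
{
	if (d->usage_count > 0)
		d->usage_count--;
}

static void model_disable(struct model_dev *d)
{
	d->disable_depth++;
}

/* What a racing autosuspend attempt would be allowed to do. */
static int model_try_suspend(struct model_dev *d)
{
	if (d->disable_depth > 0)
		return -EACCES; /* runtime PM disabled: callbacks are blocked */
	if (d->usage_count > 0)
		return -EAGAIN; /* device still in use */
	d->suspended = true;
	return 0;
}

/* Teardown writes to registers, so the device must not be suspended. */
static void model_device_fini(struct model_dev *d)
{
	assert(!d->suspended);
}

/* Proposed err_out1 ordering: disable, fini, and only then put. */
static int model_probe_error_path(struct model_dev *d)
{
	int ret;

	model_get_noresume(d);      /* reference taken early in probe */
	model_disable(d);
	ret = model_try_suspend(d); /* a concurrent suspend attempt */
	model_device_fini(d);
	model_put_noidle(d);        /* count is balanced after teardown */
	return ret;
}
```

In this model the suspend attempt is refused before teardown runs and the count
still ends up balanced, which is the behaviour I'd expect from the real
sequence; I still need to confirm it on hardware.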
> Sashiko also suggests we might have some other (partially pre-existing)
> issues here.
>
> https://sashiko.dev/#/patchset/20260604-claude-fixes-v2-0-57c6bd4c1655%40collabora.com
I'll look into all the pre-existing issues and write fixes for the next revision
of the patch series.
> Thanks,
> Steve
>
> > pm_runtime_set_suspended(pfdev->base.dev);
> > @@ -1018,8 +1021,9 @@ static void panfrost_remove(struct platform_device *pdev)
> > drm_dev_unregister(&pfdev->base);
> >
> > pm_runtime_get_sync(pfdev->base.dev);
> > - pm_runtime_disable(pfdev->base.dev);
> > panfrost_device_fini(pfdev);
> > + pm_runtime_put_noidle(pfdev->base.dev);
> > + pm_runtime_disable(pfdev->base.dev);
> > pm_runtime_set_suspended(pfdev->base.dev);
> > }
> >
> >
Adrian Larumbe