Re: [PATCH 2/5] thermal/core: Reset cooling state during cooling device unregistration

From: Rafael J. Wysocki
Date: Tue Mar 28 2023 - 13:55:15 EST


On Tue, Mar 28, 2023 at 4:46 AM Zhang, Rui <rui.zhang@xxxxxxxxx> wrote:
>
> On Mon, 2023-03-27 at 17:13 +0200, Rafael J. Wysocki wrote:
> > On Mon, Mar 27, 2023 at 4:50 PM Zhang, Rui <rui.zhang@xxxxxxxxx> wrote:
> > > On Fri, 2023-03-24 at 14:19 +0100, Rafael J. Wysocki wrote:
> > > > On Fri, Mar 24, 2023 at 8:08 AM Zhang Rui <rui.zhang@xxxxxxxxx> wrote:
> > > > > When unregistering a cooling device, it is possible that the cooling
> > > > > device has been activated. And once the cooling device is unregistered,
> > > > > no one will deactivate it anymore.
> > > > >
> > > > > Reset cooling state during cooling device unregistration.
> > > > >
> > > > > Signed-off-by: Zhang Rui <rui.zhang@xxxxxxxxx>
> > > > > ---
> > > > > In theory, the problem that this patch fixes can be triggered on a
> > > > > platform with ACPI Active cooling, by
> > > > > 1. overheating the system to trigger ACPI active cooling
> > > > > 2. unloading the ACPI fan driver
> > > > > 3. checking if the fan is still spinning
> > > > > But I don't have such a system, so I didn't trigger the problem and
> > > > > only did a build & boot test.
> > > >
> > > > So I'm not sure if this change is actually safe.
> > > >
> > > > In the example above, the system will still need the fan to spin after
> > > > the ACPI fan driver is unloaded in order to cool down, won't it?
> > >
> > > Then we can argue that the ACPI fan driver should not be unloaded in
> > > this case.
> >
> > I don't think that whether or not the driver is expected to be
> > unloaded at a given time has any bearing on how it should behave when
> > actually unloaded.
> >
> > Leaving the cooling device in its current state is "safe" from the
> > thermal control perspective, but it may affect the general user
> > experience (which may include performance too) going forward, so there
> > is a tradeoff.
>
> Right.
> If we don't have a third choice, then the question is simple.
> "thermal safety" vs. "user experience"?
>
> I'd vote for "thermal safety" and drop this patch series.

Works for me.

> > What do the other cooling device drivers do in general when they get
> > removed?
>
> No cooling device driver has extra handling after cdev unregistration.

However, the question regarding what to do when the driver of a
cooling device in use is being removed is a valid one.

One possible approach that comes to mind could be to defer the driver
removal until the overheat condition goes away, but anyway it would be
better to do that in the core IMV.
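
To make the deferral idea a bit more concrete, here is a rough userspace
sketch of the semantics only. It is not thermal core code: struct toy_cdev
and toy_cdev_try_unregister() are hypothetical names made up for this
illustration, and the assumption that a state of 0 means "not cooling" is
mine. The point is just that removal is refused while the device is still
in use, so the caller has to retry once the overheat condition is gone.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct toy_cdev {
	unsigned long cur_state;	/* assumed: 0 means not cooling */
	bool registered;
};

/*
 * Deferred removal: refuse to unregister while the device is still
 * actively cooling, so that the caller has to retry later.
 */
static int toy_cdev_try_unregister(struct toy_cdev *cdev)
{
	if (!cdev->registered)
		return -ENODEV;

	if (cdev->cur_state > 0)
		return -EBUSY;	/* still in use, keep it registered */

	cdev->registered = false;
	return 0;
}
```

With this, a fan at state 2 stays registered and the call returns -EBUSY;
once its state drops back to 0, the same call succeeds. Whether a real
implementation would fail the removal, block it, or queue it for later is
left open here.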