Re: power-off delay/hang due to commit 6d25be57 (mainline)

From: Rafael J. Wysocki
Date: Thu Dec 31 2020 - 15:47:45 EST


On Wednesday, December 2, 2020 8:13:38 PM CET Rafael J. Wysocki wrote:
> On Wed, Dec 2, 2020 at 7:31 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> >
> > On Wed, Dec 2, 2020 at 7:03 PM Sebastian Andrzej Siewior
> > <bigeasy@xxxxxxxxxxxxx> wrote:
> > >
> > > On 2020-10-26 18:20:59 [+0100], To Rafael J. Wysocki wrote:
> > > > > > > > Done as Bug 208877.
> > > > > > Rafael, do you have any suggestions?
> > > > >
> > > > > I've lost track of this sorry.
> > > > >
> > > > > I have ideas, let me get back to this next week.
> > > >
> > > > :)
> > >
> > > Rafael, any update? If you outline an idea or so then I may be able to
> > > form a patch out of it. Otherwise I have no idea how to fix this - other
> > > than telling the driver to not poll in smaller intervals than
> > > 30secs.
> >
> > The idea, roughly speaking, is to limit the number of outstanding work
> > items in the queue (basically, if there's a notification occurring
> > before the previous one can be handled, there is no need to queue up
> > another work item for it).
>
> That's easier said than done, though, because of the way the work item
> queue-up is hooked up into the ACPICA code.

So scratch this; it wouldn't work in general anyway, AFAICS.

ATM, I'm tempted to do something like the patch below (with the rationale
that it shouldn't be necessary to read the temperature right after updating
the trip points if polling is in use, because the next update through polling
will cause it to be read anyway and it will trigger trip point actions as
needed).
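
For illustration only, here is a rough user-space model of the control flow
the patch is aiming for. The struct, the enum and the function names are made
up for this sketch and are not the kernel's; only the polling_frequency test
mirrors what the patch does, and the counters merely stand in for the real
calls.

```c
#include <assert.h>

/* Hypothetical stand-ins for the ACPI thermal notification codes. */
enum model_event {
	MODEL_NOTIFY_TEMPERATURE = 0x80,
	MODEL_NOTIFY_THRESHOLDS  = 0x81,
	MODEL_NOTIFY_DEVICES     = 0x82,
};

/* Hypothetical stand-in for struct acpi_thermal; counters replace real work. */
struct zone_model {
	unsigned int polling_frequency;	/* 0 means polling is not in use */
	unsigned int checks;		/* immediate temperature reads */
	unsigned int trip_updates;	/* trip point refreshes */
	unsigned int netlink_events;	/* notifications sent to user space */
};

/* Models the notify handler as it would look with the patch applied. */
void model_notify(struct zone_model *tz, unsigned int event)
{
	switch (event) {
	case MODEL_NOTIFY_TEMPERATURE:
		tz->checks++;		/* always read the temperature */
		return;
	case MODEL_NOTIFY_THRESHOLDS:
	case MODEL_NOTIFY_DEVICES:
		tz->trip_updates++;	/* refresh the trip points */
		break;
	default:
		return;			/* unsupported event, nothing to do */
	}

	/* Skip the immediate read if polling will do it soon anyway. */
	if (!tz->polling_frequency)
		tz->checks++;

	tz->netlink_events++;
}

/* Small self-check covering the polled and the non-polled case. */
void model_selfcheck(void)
{
	struct zone_model polled = { .polling_frequency = 10 };
	struct zone_model unpolled = { .polling_frequency = 0 };

	model_notify(&polled, MODEL_NOTIFY_THRESHOLDS);
	assert(polled.checks == 0 && polled.netlink_events == 1);

	model_notify(&unpolled, MODEL_NOTIFY_THRESHOLDS);
	assert(unpolled.checks == 1 && unpolled.netlink_events == 1);
}
```

With polling enabled, a trip point notification in this model only refreshes
the trip points and notifies user space; the temperature read is left to the
next polling pass.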

Stephen, can you give it a go, please?

---
 drivers/acpi/thermal.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

Index: linux-pm/drivers/acpi/thermal.c
===================================================================
--- linux-pm.orig/drivers/acpi/thermal.c
+++ linux-pm/drivers/acpi/thermal.c
@@ -911,24 +911,25 @@ static void acpi_thermal_notify(struct a
 	switch (event) {
 	case ACPI_THERMAL_NOTIFY_TEMPERATURE:
 		acpi_thermal_check(tz);
-		break;
+		return;
 	case ACPI_THERMAL_NOTIFY_THRESHOLDS:
 		acpi_thermal_trips_update(tz, ACPI_TRIPS_REFRESH_THRESHOLDS);
-		acpi_thermal_check(tz);
-		acpi_bus_generate_netlink_event(device->pnp.device_class,
-						  dev_name(&device->dev), event, 0);
 		break;
 	case ACPI_THERMAL_NOTIFY_DEVICES:
 		acpi_thermal_trips_update(tz, ACPI_TRIPS_REFRESH_DEVICES);
-		acpi_thermal_check(tz);
-		acpi_bus_generate_netlink_event(device->pnp.device_class,
-						  dev_name(&device->dev), event, 0);
 		break;
 	default:
 		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				  "Unsupported event [0x%x]\n", event));
-		break;
+		return;
 	}
+
+	/* Trigger an update of the thermal zone unless polling is in use. */
+	if (!tz->polling_frequency)
+		acpi_thermal_check(tz);
+
+	acpi_bus_generate_netlink_event(device->pnp.device_class,
+					dev_name(&device->dev), event, 0);
 }

/*