Re: Pressing the power button causes the device to freeze completely (schedutil involved)

From: Rafael J. Wysocki

Date: Wed Apr 29 2026 - 14:27:43 EST


On Tue, Apr 28, 2026 at 11:05 PM Evgeny Sagatov
<evgeny.sagatov@xxxxxxxxx> wrote:
>
> The PC also froze with this patch when I pressed the power button.

Which means that the issue is not simply a matter of the lack of
synchronization between different I/O space accesses, so most likely
the problem lies deeper.

Let me summarize what we've learned so far (and expand the CC list somewhat).

For those who have not seen the previous discussion, it is at:

https://lore.kernel.org/lkml/CAGAxtY2SEkx7OgMgM5ypA8qsBN0h6pcs111VjnhD-5ZGq7Je6Q@xxxxxxxxxxxxxx/#r

1. The issue is basically that the platform locks up completely when
the power button is pressed

This is reproducible 100% of the time.

The power button processing flow is that, if the power button event is
enabled in the ACPI PM1_ENABLE register, pressing the button causes
the corresponding status bit in the ACPI PM1_STATUS register to be
set, which in turn causes an ACPI interrupt (SCI) to trigger (both
PM1_STATUS and PM1_ENABLE registers are accessible through the I/O
address space).

The SCI processing involves reading both the power button status and
enable bits and clearing the former if it is set (otherwise an
interrupt storm would start).

We don't actually know which of the steps above specifically causes
the platform to lock up, but that doesn't matter too much because all
of them are necessary.

Clearing the power button enable bit in PM1_ENABLE prevents the issue
from occurring (but then the power button obviously doesn't work).
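
For reference, the flow described above can be modeled in user space
roughly as follows.  This is only a sketch and not the kernel code: the
struct and the helper names are made up for illustration, and the
registers are plain variables rather than I/O ports.  The only details
taken from the ACPI specification are that the power button uses bit 8
in both PM1_STATUS and PM1_ENABLE and that status bits are cleared by
writing 1 to them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Power button bit, same position in PM1_STATUS and PM1_ENABLE. */
#define PM1_PWRBTN_BIT (1u << 8)

/* Hypothetical stand-ins for the two I/O-space registers. */
struct pm1_regs {
    uint16_t status; /* PM1_STATUS: bits are write-1-to-clear */
    uint16_t enable; /* PM1_ENABLE: plain read/write */
};

/* Models a write to PM1_STATUS: every 1 written clears that bit. */
static void pm1_status_write(struct pm1_regs *r, uint16_t val)
{
    r->status &= (uint16_t)~val;
}

/* Models the hardware side: a button press sets the status bit. */
static void press_power_button(struct pm1_regs *r)
{
    r->status |= PM1_PWRBTN_BIT;
}

/* The SCI stays asserted while any enabled status bit is set. */
static bool sci_asserted(const struct pm1_regs *r)
{
    return (r->status & r->enable) != 0;
}

/* Returns true if a power button event was found and acknowledged. */
static bool sci_handle_power_button(struct pm1_regs *r)
{
    uint16_t status = r->status; /* read PM1_STATUS */
    uint16_t enable = r->enable; /* read PM1_ENABLE */

    if (!(status & enable & PM1_PWRBTN_BIT))
        return false;

    /* Acknowledge the event, or the SCI would fire again right away. */
    pm1_status_write(r, PM1_PWRBTN_BIT);
    return true;
}
```

In this model, clearing the enable bit means that sci_asserted() never
becomes true for a button press, which corresponds to the workaround
mentioned above.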

2. The issue only occurs if the schedutil cpufreq governor is used

If either the "powersave" or "ondemand" governor is used instead, the
issue doesn't appear.

Also, switching over to a different cpufreq governor (on all CPUs)
before pressing the power button makes the issue go away.

3. The issue didn't occur at all before commit e37617c8e53a
("sched/fair: Fix frequency selection for non-invariant case")

Reverting the merge that introduced commit e37617c8e53a into the
mainline makes the issue go away.

4. The issue occurs regardless of how schedutil invokes the cpufreq
driver ("fast switch" vs sugov_deferred_update())

5. The cpufreq driver in question is acpi-cpufreq and it uses a
control register located in the I/O address space (the same control
register is used for all CPUs)

6. If acpi-cpufreq is not allowed to write into the control register
at all, the issue doesn't occur

7. Synchronizing all of the I/O space accesses in the ACPI-related
code (via a spinlock), including acpi-cpufreq, doesn't prevent the
issue from occurring
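
To illustrate what item 7 means by synchronizing the I/O space
accesses, here is a minimal user-space sketch.  It is not the patch
that was tested: the I/O space is modeled as an array, the lock is a
simple C11 atomic_flag spinlock, and locked_inb()/locked_outb() are
hypothetical names.  The point is only that every port read and write
goes through one shared lock, so no two accesses can overlap.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of the I/O address space: 64K byte-wide ports. */
static uint8_t io_space[65536];

/* One lock shared by every I/O-space accessor. */
static atomic_flag io_lock = ATOMIC_FLAG_INIT;

static void io_lock_acquire(void)
{
    /* Spin until the flag was clear before this test-and-set. */
    while (atomic_flag_test_and_set_explicit(&io_lock,
                                             memory_order_acquire))
        ;
}

static void io_lock_release(void)
{
    atomic_flag_clear_explicit(&io_lock, memory_order_release);
}

/* Port read serialized against all other accesses. */
static uint8_t locked_inb(uint16_t port)
{
    uint8_t val;

    io_lock_acquire();
    val = io_space[port];
    io_lock_release();
    return val;
}

/* Port write serialized against all other accesses. */
static void locked_outb(uint8_t val, uint16_t port)
{
    io_lock_acquire();
    io_space[port] = val;
    io_lock_release();
}
```

Since serializing the accesses like this did not help, interleaving of
I/O space accesses by itself is unlikely to be the trigger.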

So overall, it looks like the control register access pattern in
acpi-cpufreq that results from schedutil's use of it after commit
e37617c8e53a somehow puts the platform into a state in which a power
button event causes it to lock up at the hardware level.

While there are a couple more things to check, I'm afraid that
there may not be a viable way to address this issue other than
replacing the schedutil governor with ondemand on this platform.
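
For anyone who wants to apply that replacement by hand in the
meantime, a governor can be selected at run time by writing its name
to the scaling_governor attribute of each cpufreq policy in sysfs.
Below is a rough sketch of doing that from C; set_governor() and
set_governor_all() are hypothetical helper names, and error handling
is kept to a minimum.

```c
#include <glob.h>
#include <stdio.h>

/* Writes a governor name into one scaling_governor file; 0 on success. */
static int set_governor(const char *path, const char *governor)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;

    ok = fputs(governor, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}

/*
 * Applies the governor to every file matching the glob pattern.
 * Returns the number of files updated, or -1 if nothing matched.
 */
static int set_governor_all(const char *pattern, const char *governor)
{
    glob_t g = { 0 };
    int updated = 0;

    if (glob(pattern, 0, NULL, &g) != 0) {
        globfree(&g);
        return -1;
    }

    for (size_t i = 0; i < g.gl_pathc; i++)
        if (set_governor(g.gl_pathv[i], governor) == 0)
            updated++;

    globfree(&g);
    return updated;
}
```

Called as root with the pattern
"/sys/devices/system/cpu/cpufreq/policy*/scaling_governor" and the
governor name "ondemand", this should switch all policies over,
provided that the ondemand governor is available in the running kernel.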