[PATCH v4 0/2] Avoid spurious freezer wakeups

From: Elliot Berman
Date: Fri Sep 08 2023 - 18:51:01 EST


After commit f5d39b020809 ("freezer,sched: Rewrite core freezer logic"),
tasks that transition directly from TASK_FREEZABLE to TASK_FROZEN are
always woken up on the thaw path. Prior to that commit, tasks could ask
freezer to consider them "frozen enough" via freezer_do_not_count(). The
commit replaced freezer_do_not_count() with a TASK_FREEZABLE state which
allows freezer to immediately mark the task as TASK_FROZEN without
waking up the task. This is efficient for the suspend path, but on the
thaw path, the task is always woken up even if the task didn't need to
wake up and goes back to its TASK_(UN)INTERRUPTIBLE state. Although
these tasks are capable of handling of the wakeup, we can observe a
power/perf impact from the extra wakeup.

We observed on Android many tasks wait in the TASK_FREEZABLE state
(particularly due to many of them being binder clients). We observed
nearly 4x the number of tasks and a corresponding linear increase in
latency and power consumption when thawing the system. The latency
increased from ~15ms to ~50ms.

Avoid the spurious wakeups by saving the state of TASK_FREEZABLE tasks.
If the task was running before entering TASK_FROZEN state
(__refrigerator()) or if the task received a wake up for the saved
state, then the task is woken on thaw. saved_state from PREEMPT_RT locks
can be re-used because freezer would not stomp on the rtlock wait flow:
TASK_RTLOCK_WAIT isn't considered freezable.

For testing purposes, I use these commands can help see how many tasks were
woken during thawing:

1. Setup:
mkdir /sys/kernel/tracing/instances/freezer
cd /sys/kernel/tracing/instances/freezer
echo 0 > tracing_on ; echo > trace
echo power:suspend_resume > set_event
echo 'enable_event:sched:sched_wakeup if action == "thaw_processes" && start == 1' > events/power/suspend_resume/trigger
echo 'traceoff if action == "thaw_processes" && start == 0' > events/power/suspend_resume/trigger
echo 1 > tracing_on

2. Let kernel go to suspend

3. After kernel's back up:
cat /sys/kernel/tracing/instances/freezer/trace | grep sched_wakeup | grep -o "pid=[0-9]*" | sort -u | wc -l

Signed-off-by: Elliot Berman <quic_eberman@xxxxxxxxxxx>
---
Changes in v4:
- Split into 2 commits
- Link to v3: https://lore.kernel.org/r/20230908-avoid-spurious-freezer-wakeups-v3-1-d49821fda04d@xxxxxxxxxxx

Changes in v3:
- Remove #ifdeferry for saved_state in kernel/sched/core.c
- Link to v2: https://lore.kernel.org/r/20230830-avoid-spurious-freezer-wakeups-v2-1-8877245cdbdc@xxxxxxxxxxx

Changes in v2:
- Rebase to v6.5 which includes commit 1c06918788e8 ("sched: Consider task_struct::saved_state in wait_task_inactive()")
This allows us to remove the special jobctl handling on thaw.
- Link to v1: https://lore.kernel.org/r/20230828-avoid-spurious-freezer-wakeups-v1-1-8be8cf761472@xxxxxxxxxxx

---
Elliot Berman (2):
sched/core: Remove ifdeffery for saved_state
freezer,sched: Use saved_state to reduce some spurious wakeups

include/linux/sched.h | 2 --
kernel/freezer.c | 41 +++++++++++++++++++----------------------
kernel/sched/core.c | 40 +++++++++++++++++-----------------------
3 files changed, 36 insertions(+), 47 deletions(-)
---
base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
change-id: 20230817-avoid-spurious-freezer-wakeups-9f8619680b3a

Best regards,
--
Elliot Berman <quic_eberman@xxxxxxxxxxx>