Re: [PATCH 1/1] [RFC] wakeup_reason: Add infrastructure to log and report why the system resumed from suspend.

From: Lee Jones
Date: Fri Jan 28 2022 - 04:32:15 EST


Just a little surface review.

I'll let the big people (SMEs) do the technical bit.

On Fri, 28 Jan 2022, Zichar Zhang wrote:

> When optimizing for power usage, it's useful to be able to track and
> log each time why the system woke up from suspend. This is useful as
> it helps us understand why we might be seeing more wakeups then we
> expect. For a while now, Android has carried the "wakeup_reasons"
> patch which provides this to allow developers to optimize battery life
> on devices.
>
> This patch tries to provide a simplified version of the
> Android wakeup_reasons functionality. It tracks both software
> wakeup_sources as well as IRQS that brought the device out of suspend,
> and exposes these values as a string via /sys/power/last_wakeup_reason.
>
> This differs slightly from the Android patch, in that it doesn't track
> the suspend-abort reason logging, the misconfigured or unmapped IRQS,
> the threaded IRQS, and time spent in suspend/resume. But it provides
> an initial simple step to add a useful portion of the logic.
>
> Here is the requirements from https://lkml.org/lkml/2022/1/10/1134 for
> "wakeup_reason" with line head "[Y]" and "[N]" to notify if this commit
> meet the requirements or not. And if it is not, the detail reason is
> right behind it after "--".
>
> * wakeup IRQs, including:
> [Y]* multiple IRQs if more than one is pending during resume flow
> [N]* unmapped HW IRQs
> [N]* misconfigured IRQs
> [N]* threaded IRQs (not just the parent chip's IRQ)
> -- If we do these things, the additional codes should be added in
> interrupt subsystem or some interrupt controller drivers. These
> codes are no relationship with the interrupt subsystem or the
> interrupt controllers. And we can't distinguish these IRQs from
> "non-suspend" environment, even the "Android patch" can't do that.
> So it will make the things complicated.
> -- If these IRQs do hanpen, the code in this commit will catch
> them as "unknown wakeup reason" and suggest user to check the
> kernel log for more information.
> -- I think We should cooperate with interrupt subsystem to log
> these "abnormal" IRQs. And it could be several additional
> commits to accomplish this work together then.
>
> * non-IRQ wakeups, including:
> [Y]* wakeups caused by an IRQ that was consumed by lower-level SW
> [Y]* wakeups from SOC architecture that don't manifest as IRQs
>
> * abort reasons, including:
> [N]* wakeup_source activity
> [N]* failure to freeze userspace
> [N]* failure to suspend devices
> [N]* failed syscore_suspend callback
> -- These codes are "intrusive" to kernel (suspend/resume).
> -- These "errors" are already printed in kernel log, if we add
> these log to "wakeup_reason" either, it will cause double
> "log string" in section ".data", which just waste of the memory.
> -- As mentioned before, if these "abort action" happened, you
> can catch string "unknown wakeup reason", and check the kernel
> log then.
>
> * durations from the most recent cycle, including:
> [N]* time spent doing suspend/resume work
> [N]* time spent in suspend
> -- Just separate these from "wakeup reason". It should be done
> in another commit.
>
> This patch can be tested in Android using the following change to AOSP:
> https://android-review.googlesource.com/c/platform/system/hardware/interfaces/+/1958666
>
> And there is a stress test code for the interfaces in kernel:
> https://android-review.googlesource.com/c/kernel/common/+/1958189
>
> Any feedback or thoughts would be welcome!
>
> Signed-off-by: Zichar Zhang <zichar.zhang@xxxxxxxxxx>

You should probably tip your hat to the original authors at one
point.

> Change-Id: Id70f3cbec15f24ea999d7f643e9589be9c047f2b

No Androidness please.

> ---
> Documentation/ABI/testing/sysfs-power | 9 ++
> drivers/base/power/wakeup.c | 6 +
> include/linux/wakeup_reason.h | 35 +++++
> kernel/power/Makefile | 1 +
> kernel/power/main.c | 11 ++
> kernel/power/wakeup_reason.c | 183 ++++++++++++++++++++++++++
> 6 files changed, 245 insertions(+)
> create mode 100644 include/linux/wakeup_reason.h
> create mode 100644 kernel/power/wakeup_reason.c
>
> diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
> index 90ec4987074b..e79d7a49a24e 100644
> --- a/Documentation/ABI/testing/sysfs-power
> +++ b/Documentation/ABI/testing/sysfs-power
> @@ -182,6 +182,15 @@ Description:
> to a sleep state if any wakeup events are reported after the
> write has returned.
>
> +What: /sys/power/last_wakeup_reason
> +Date: Jan 2022
> +Contact: Zichar Zhang <zichar.zhang@xxxxxxxxxx>

Are you pledging to be the official contact for the foreseeable
future? Is this okay with the original authors?

> +Description:
> + The /sys/power/last_wakeup_reason file shows the last reason
> + causing system "wake up" from suspend state. It could be an
> + interrupt signal, a kernel "wakeup_source" or just some other
> + reason logged into this file, and shows as a string.
> +
> What: /sys/power/reserved_size
> Date: May 2011
> Contact: Rafael J. Wysocki <rjw@xxxxxxxxxxxxx>
> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> index 99bda0da23a8..a91ee1b8c0df 100644
> --- a/drivers/base/power/wakeup.c
> +++ b/drivers/base/power/wakeup.c
> @@ -15,6 +15,7 @@
> #include <linux/seq_file.h>
> #include <linux/debugfs.h>
> #include <linux/pm_wakeirq.h>
> +#include <linux/wakeup_reason.h>
> #include <trace/events/power.h>
>
> #include "power.h"
> @@ -924,6 +925,7 @@ bool pm_wakeup_pending(void)
>
> if (ret) {
> pm_pr_dbg("Wakeup pending, aborting suspend\n");
> + log_ws_wakeup_reason();
> pm_print_active_wakeup_sources();
> }
>
> @@ -947,11 +949,15 @@ void pm_wakeup_clear(bool reset)
> pm_wakeup_irq = 0;
> if (reset)
> atomic_set(&pm_abort_suspend, 0);
> +
> + clear_wakeup_reason();
> }
>
> void pm_system_irq_wakeup(unsigned int irq_number)
> {
> if (pm_wakeup_irq == 0) {
> + log_irq_wakeup_reason(irq_number);
> +
> pm_wakeup_irq = irq_number;
> pm_system_wakeup();
> }
> diff --git a/include/linux/wakeup_reason.h b/include/linux/wakeup_reason.h
> new file mode 100644
> index 000000000000..8d6e7a0e0bad
> --- /dev/null
> +++ b/include/linux/wakeup_reason.h
> @@ -0,0 +1,35 @@
> +/*
> + * include/linux/wakeup_reason.h

These have a tendency to go out of date.

> + * Logs the reason which caused the kernel to resume
> + * from the suspend mode.
> + *
> + * Copyright (C) 2021 Linaro, Inc.

You can't just take Copyright{ed,written} code wholesale.

You need to keep the original Google one.

> + * This software is licensed under the terms of the GNU General Public
> + * License version 2, as published by the Free Software Foundation, and
> + * may be copied, distributed, and modified under those terms.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.

Replace with SPDX.

> + */

[...]

> +++ b/kernel/power/wakeup_reason.c
> @@ -0,0 +1,183 @@
> +/*
> + * driver/base/power/wakeup_reason.c
> + *
> + * Logs the reasons which caused the kernel to resume from
> + * the suspend mode.
> + *
> + * Copyright (C) 2021 Linaro, Inc.
> + * This software is licensed under the terms of the GNU General Public
> + * License version 2, as published by the Free Software Foundation, and
> + * may be copied, distributed, and modified under those terms.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + */

As above.

> +#include <linux/kernel.h>
> +#include <linux/irq.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/init.h>
> +#include <linux/spinlock.h>
> +#include <linux/notifier.h>
> +#include <linux/suspend.h>
> +#include <linux/wakeup_reason.h>

Alphabetical.

[...]

> +ssize_t log_ws_wakeup_reason(void)
> +{
> + struct wakeup_source *ws, *last_active_ws = NULL;
> + int idx, max, len = 0;
> + bool active = false;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&wakeup_reason_lock, flags);
> +
> + if (!capture_reasons) {
> + goto out;
> + }

Over-bracketing.

[...]

> +EXPORT_SYMBOL(log_ws_wakeup_reason);

wakeup_reason_* might be better/clearer way to namespace.

> +ssize_t log_irq_wakeup_reason(unsigned int irq_number)
> +{
> + int len = 0;

Smaller data types (int, char, bool) usually go at the bottom below
larger ones (struct *).

> + struct irq_desc *desc;
> + const char *name = "null";
> + unsigned long flags;
> +
> + desc = irq_to_desc(irq_number);
> + if (desc == NULL)

if (!desc)

[...]

> +ssize_t last_wakeup_reason_get(char *buf, ssize_t max)
> +{
> + ssize_t len, size = 0;
> + unsigned long flags;
> +
> + if (!buf) {
> + return 0;
> + }

Over-bracketing. I won't keep reporting on this.

--
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog