Re: [PATCH v2] sched/debug: Use sched_debug_lock to serialize use of cgroup_path[] only

From: Daniel Thompson
Date: Tue Mar 30 2021 - 06:43:35 EST


On Mon, Mar 29, 2021 at 03:32:35PM -0400, Waiman Long wrote:
> The handling of sysrq keys should normally be done in an user context
> except when MAGIC_SYSRQ_SERIAL is set and the magic sequence is typed
> in a serial console.

This seems to be a poor summary of the typical calling context for
handle_sysrq() except in the trivial case of using
/proc/sysrq-trigger.

For example on my system then the backtrace when I do sysrq-h on a USB
keyboard shows us running from a softirq handler and with interrupts
locked. Note also that the interrupt lock is present even on systems that
handle keyboard input from a kthread due to the interrupt lock in
report_input_key().


> Currently in print_cpu() of kernel/sched/debug.c, sched_debug_lock is taken
> with interrupt disabled for the whole duration of the calls to print_*_stats()
> and print_rq() which could last for the quite some time if the information dump
> happens on the serial console.
>
> If the system has many cpus and the sched_debug_lock is somehow busy
> (e.g. parallel sysrq-t), the system may hit a hard lockup panic, like

<snip>

> The purpose of sched_debug_lock is to serialize the use of the global
> cgroup_path[] buffer in print_cpu(). The rests of the printk() calls
> don't need serialization from sched_debug_lock.
>
> Calling printk() with interrupt disabled can still be
> problematic. Allocating a stack buffer of PATH_MAX bytes is not
> feasible. So a compromised solution is used where a small stack buffer
> is allocated for pathname. If the actual pathname is short enough, it
> is copied to the stack buffer with sched_debug_lock release afterward
> before printk(). Otherwise, the global group_path[] buffer will be
> used with sched_debug_lock held until after printk().

Does this actually fix the problem in any circumstance except when the
sysrq is triggered using /proc/sysrq-trigger?


Daniel.
>