Re: [PATCH v3] sched/debug: Use sched_debug_lock to serialize use of cgroup_path[] only

From: Waiman Long
Date: Fri Apr 02 2021 - 23:09:28 EST


On 4/2/21 4:40 PM, Steven Rostedt wrote:
On Thu, 1 Apr 2021 14:10:30 -0400
Waiman Long <longman@xxxxxxxxxx> wrote:

The handling of a sysrq key can be activated by echoing the key to
/proc/sysrq-trigger or via the magic key sequence typed into a terminal
that is connected to the system in some way (serial, USB or other means).
In the former case, the handling is done in a user context. In the
latter case, it is likely to be in an interrupt context.

There should be no more than one instance of sysrq key processing via
a terminal, but multiple instances of /proc/sysrq-trigger are possible.

Currently in print_cpu() of kernel/sched/debug.c, sched_debug_lock is
taken with interrupts disabled for the whole duration of the calls to
print_*_stats() and print_rq(), which could last for quite some time
if the information dump happens on the serial console.

If the system has many cpus and the sched_debug_lock is somehow busy
(e.g. parallel sysrq-t), the system may hit a hard lockup panic
depending on the actual serial console implementation of the
system. For instance,

Wouldn't placing strategically located "touch_nmi_watchdog()"s around fix
this?

-- Steve

The main problem with sched_debug_lock is that, under certain
circumstances, a lock waiter may spend a long time (on the order of
seconds) waiting to acquire the lock. We can't insert
touch_nmi_watchdog() calls while the cpu is spinning on the spinlock,
so touching the watchdog around the print calls does not help the
waiters.
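
To illustrate the distinction, here is a rough userspace model in C11. It
is not kernel code and not what the patch does: every model_* name is
hypothetical, the atomic flag merely stands in for sched_debug_lock, and
the counter stands in for touch_nmi_watchdog(). It only shows that a plain
spinning waiter never regains control until the lock is free, whereas a
trylock loop gets a chance to do other work between attempts.

```c
/* Userspace model only: all model_* names are hypothetical, not kernel APIs. */
#include <stdatomic.h>
#include <stdbool.h>

/* Stands in for sched_debug_lock. */
static atomic_flag model_lock = ATOMIC_FLAG_INIT;

/* Counts how often the waiter managed to "touch the watchdog". */
static atomic_ulong watchdog_touches;

/* Stand-in for touch_nmi_watchdog(): just records that it was called. */
void model_touch_watchdog(void)
{
	atomic_fetch_add(&watchdog_touches, 1);
}

/* Try once to take the lock; returns true on success. */
bool model_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&model_lock,
						  memory_order_acquire);
}

/* Release the lock. */
void model_unlock(void)
{
	atomic_flag_clear_explicit(&model_lock, memory_order_release);
}

/*
 * Plain spinning waiter: the caller is stuck inside this loop until the
 * holder releases the lock, so the caller has no point at which it could
 * touch the watchdog.
 */
void model_lock_spin(void)
{
	while (!model_trylock())
		;
}

/*
 * Bounded trylock waiter: control returns to this loop body after every
 * failed attempt, so the watchdog stand-in can be touched in between.
 * Returns true if the lock was acquired within max_tries attempts.
 */
bool model_lock_bounded(unsigned int max_tries)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		if (model_trylock())
			return true;
		model_touch_watchdog();
	}
	return false;
}
```

With the lock already held, model_lock_bounded(5) gives up after five
attempts and has touched the watchdog stand-in five times, while
model_lock_spin() would simply never return until the holder calls
model_unlock().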

Cheers,
Longman