Re: [PATCH] locking/hung_task: Defer showing held locks

From: Tetsuo Handa
Date: Tue Dec 20 2016 - 08:34:58 EST


Vegard Nossum wrote:
> On 13 December 2016 at 15:45, Tetsuo Handa
> <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> > When I was running my testcase which may block hundreds of threads
> > on fs locks, I got lockup due to output from debug_show_all_locks()
> > added by commit b2d4c2edb2e4f89a ("locking/hung_task: Show all locks").
> >
> > I think we don't need to call debug_show_all_locks() on each blocked
> > thread. Let's defer calling debug_show_all_locks() till before panic()
> > or leaving for_each_process_thread() loop.
>
> First of all, sorry for not answering earlier.

No problem.

>
> I'm not sure I fully understand the problem, you say the "output from
> debug_show_all_locks()" caused a lockup, but was the problem simply
> that the amount of output caused it to stall for a long time?

Linux 4.9 added warn_alloc(), which calls printk() periodically when a memory
allocation has been stalling for too long, in order to tell the administrator
that something might be wrong with memory allocation. However, printk() does
not return until all pending data has been sent to the console, calling
cond_resched() while it waits. printk() therefore keeps waiting for as long as
somebody else calls printk() while it is in cond_resched(). This is
problematic in an OOM situation.

The OOM killer calls printk() with oom_lock held. As a result, a printk()
called from the OOM killer could be unable to return forever, because
warn_alloc() keeps calling printk() periodically for as long as the OOM killer
holds oom_lock.

It turned out that khungtaskd is another source of periodic printk() calls
when threads are blocked on fs locks waiting for memory allocation.
debug_show_all_locks() generates far more output than warn_alloc() if it is
called for each thread blocked on fs locks waiting for memory allocation.
Therefore, we should avoid calling debug_show_all_locks() for each blocked
thread.
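
To illustrate the idea, here is a minimal user-space sketch of the deferral,
not the actual patch. struct task, show_all_locks() and check_hung_tasks()
are hypothetical stand-ins for the kernel's task list,
debug_show_all_locks() and the khungtaskd scan loop. Only the "after leaving
the loop" case is shown; the "before panic()" case is omitted.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a task; not the kernel's struct task_struct. */
struct task {
	const char *name;
	bool hung;
};

/* Counts how often the expensive lock dump was requested. */
static int show_all_locks_calls;

/* Stand-in for debug_show_all_locks(): expensive, so it should run once. */
static void show_all_locks(void)
{
	show_all_locks_calls++;
	puts("Showing all locks held in the system:");
}

/*
 * Scan all tasks and report each hung one, but only remember that a lock
 * dump is wanted. The dump itself runs once, after the loop has finished.
 * Returns the number of hung tasks found.
 */
static int check_hung_tasks(const struct task *tasks, int n)
{
	bool show_locks = false;
	int hung = 0;

	for (int i = 0; i < n; i++) {
		if (!tasks[i].hung)
			continue;
		printf("INFO: task %s blocked\n", tasks[i].name);
		show_locks = true;	/* defer instead of dumping here */
		hung++;
	}

	if (show_locks)
		show_all_locks();

	return hung;
}
```

With this shape, hundreds of blocked threads cause hundreds of short per-task
messages but only one lock dump per scan.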

The full story starts at http://lkml.kernel.org/r/1481020439-5867-1-git-send-email-penguin-kernel@xxxxxxxxxxxxxxxxxxx , but
I would appreciate it if you could join the discussion at http://lkml.kernel.org/r/1478416501-10104-1-git-send-email-penguin-kernel@xxxxxxxxxxxxxxxxxxx .

>
> Could we instead
>
> 1) move the debug_show_all_locks() into the if
> (sysctl_hung_task_panic) bit unconditionally
>
> 2) call something (touch_nmi_watchdog()?) inside debug_show_all_locks()
>
> 3) in another way make debug_show_all_locks() more robust so it doesn't "lockup"
>
> ?

Yes, that might be an improvement, but it is not needed for this patch.