Re: [PATCH] kernel/hung_task.c: Monitor killed tasks.

From: Tetsuo Handa
Date: Thu May 16 2019 - 04:21:59 EST


On 2019/05/15 19:55, Petr Mladek wrote:
>> + if (!stamp) {
>> + stamp = jiffies;
>> + if (!stamp)
>> + stamp++;
>> + t->killed_time = stamp;
>> + return;
>> + }
>
> I might be too dumb but the above code looks pretty tricky to me.
> It would deserve a comment. Or better, I would remove
> trick to handle overflow. If it happens, we would just
> lose one check period.

We can use

static inline unsigned long jiffies_nonzero(void)
{
const unsigned long stamp = jiffies;

return stamp ? stamp : -1;
}

or even shortcut "jiffies | 1" because difference by one jiffie
is an measurement error for multiple HZ of timeout.

>
> Alternative solution would be to set the timestamp in
> complete_signal(). Then we would know that the timestamp
> is always valid when a fatal signal is pending.

Yes. But I guess that since signal might be delivered just before
setting PF_FROZEN and the thread might be kept frozen for longer
than timeout, we will need to reset the timestamp just before
clearing PF_FROZEN.



>> + if (time_is_after_jiffies(stamp + timeout * HZ))
>> + return;
>> + trace_sched_process_hang(t);
>> + if (sysctl_hung_task_panic) {
>> + console_verbose();
>> + hung_task_call_panic = true;
>
> IMHO, the delayed task exit is much less fatal than sleeping
> in an uninterruptible state.
>
> Anyway, the check is much less reliable. In case of hung_task,
> it is enough when the task gets scheduled. In the new check,
> the task has to do some amount of work until the signal
> gets handled and do_exit() is called.
>
> The panic should either get enabled separately or we should
> never panic in this case.

OK, we should not share existing sysctl settings.

But in the context of syzbot's testing where there are only 2 CPUs
in the target VM (which means that only small number of threads and
not so much memory) and threads get SIGKILL after 5 seconds from fork(),
being unable to reach do_exit() within 10 seconds is likely a sign of
something went wrong. For example, 6 out of 7 trials of a reproducer for
https://syzkaller.appspot.com/bug?id=835a0b9e75b14b55112661cbc61ca8b8f0edf767
resulted in "no output from test machine" rather than "task hung".
This patch is revealing that such killed threads are failing to reach
do_exit() because they are trapped at unkillable retry loop due to a
race bug.

Therefore, I would like to try this patch in linux-next.git for feasibility
testing whether this patch helps finding more bugs and reproducers for such
bugs, by bringing "unable to terminate threads" reports out of "no output from
test machine" reports. We can add sysctl settings before sending to linux.git.

Any questions?