Re: [PATCH] kernel/hung_task.c: allow to set period separately from timeout

From: Dmitry Vyukov
Date: Mon Jun 11 2018 - 07:16:45 EST

On Sat, Jun 9, 2018 at 9:00 AM, Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> On 2018/06/09 6:58, Andrew Morton wrote:
>> On Fri, 8 Jun 2018 15:30:43 +0200 Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>> Currently the task hung checking period is equal to the timeout;
>>> as a result, a hung task is detected anywhere between timeout and 2*timeout.
>>> This is fine for most interactive environments, but this hurts automated
>>> testing setups (syzbot). In an automated setup we need to strictly order
>>> CPU lockup < RCU stall < workqueue lockup < task hung < silent loss,
>>> so that RCU stall is not detected as task hung and task hung is not
>>> detected as silent machine loss. The large variance in task hung
>>> detection timeout requires setting silent machine loss timeout to
>>> a very large value (e.g. if task hung is 3 mins, then silent loss
>>> needs to be set to ~7 mins). The additional 3 minutes significantly
>>> reduce testing efficiency because usually we crash the kernel within
>>> a minute, and this can add hours to the bug localization process as it
>>> needs to do dozens of tests.
>>> Allow setting checking period separately from timeout.
>>> This allows setting the timeout to, say, 3 minutes, but the period to 10 secs.
>>> The period is controlled via a new hung_task_period_secs sysctl,
>>> similar to the existing hung_task_timeout_secs sysctl.
>>> The default value of 0 results in the current behavior.
>> I'm rather struggling to understand the difference between "period" and
>> "timeout". We would benefit from a clear description of what these two
>> things do. An appropriate place for this description is
>> Documentation/sysctl/kernel.txt, which this patch forgot to update.
> My understanding is that "period" is "how frequently we should check"
> and "timeout" is "how long a thread remained uninterruptible". Maybe
> hung_task_check_interval_secs would be better than hung_task_period_secs.

Hi Tetsuo, Andrew,

I've just mailed v2:

Changes since v1:
- add entry to Documentation/sysctl/kernel.txt
- rename hung_task_period_secs sysctl to hung_task_check_interval_secs

Hopefully it's now clearer what the difference is and what each one does.
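
As a rough illustration of the arithmetic (a user-space sketch, not the
patch code): assuming the checker wakes up every "interval" seconds and
reports a task once it has been blocked for at least "timeout" seconds,
the worst-case detection latency is timeout + interval. The helper name
worst_case_detect_secs is hypothetical, and interval == 0 is taken to
mean "use timeout", matching the default described above.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not kernel code: upper bound in seconds between a
 * task becoming blocked and the periodic checker reporting it.
 * interval == 0 means "check once per timeout" (the old behavior).
 */
static unsigned long worst_case_detect_secs(unsigned long timeout,
					    unsigned long interval)
{
	if (interval == 0)
		interval = timeout;
	/*
	 * A task that blocked just after a check can sit just under
	 * "timeout" at one check and is only caught at the next one.
	 */
	return timeout + interval;
}
```

With timeout = 180 this gives 360 for the default interval (the
"2*timeout" case) and 190 with a 10 second interval.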

> timeout = 60 and period = 1 would allow a hung task to be reported as soon
> as it remained uninterruptible for 60 seconds. That makes it easier for me to
> narrow down relevant kernel messages and syzbot program.
> Well, showing exact slept time, along with all threads which slept more
> than some threshold (e.g. timeout / 2), might be helpful.

Do you mean that if we report any task, we then scan all tasks a second
time and additionally report tasks that have been blocked for between
timeout/2 and timeout?

Should we do this when hung_task_show_lock is set? Or only when
sysctl_hung_task_panic is set? Or under some other condition?
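
To make the question concrete, here is a minimal user-space sketch of one
reading of the two-pass idea; it is not kernel code. struct task_sample
and collect_reports are hypothetical names, and the timeout/2 threshold
is just the example value mentioned above.

```c
#include <stddef.h>

/* Hypothetical stand-in for a task and how long it has been blocked. */
struct task_sample {
	const char *comm;
	unsigned long blocked_secs;
};

/*
 * Pass 1: check whether any task has been blocked for >= timeout.
 * Pass 2: only if so, collect every task blocked for >= timeout / 2.
 * Returns the number of pointers stored in out (at most out_cap).
 */
static size_t collect_reports(const struct task_sample *tasks, size_t n,
			      unsigned long timeout,
			      const struct task_sample **out, size_t out_cap)
{
	size_t i, count = 0;
	int any_hung = 0;

	for (i = 0; i < n; i++) {
		if (tasks[i].blocked_secs >= timeout) {
			any_hung = 1;
			break;
		}
	}
	if (!any_hung)
		return 0;

	for (i = 0; i < n && count < out_cap; i++) {
		if (tasks[i].blocked_secs >= timeout / 2)
			out[count++] = &tasks[i];
	}
	return count;
}
```

The second pass runs only after a real hang is found, so tasks that are
merely slow are never reported on their own.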