Re: [PATCH] kernel/hung_task.c: Break RCU locks based on jiffies.

From: Paul E. McKenney
Date: Fri Dec 14 2018 - 10:58:19 EST


On Sat, Dec 15, 2018 at 12:17:38AM +0900, Tetsuo Handa wrote:
> check_hung_uninterruptible_tasks() currently calls rcu_lock_break()
> once every 1024 threads. But check_hung_task() is very slow when it
> ends up calling printk(), and very fast otherwise. If printk() is
> called for many threads within one batch of 1024, the RCU grace
> period might be extended enough to trigger RCU stall warnings.
> Therefore, calling rcu_lock_break() at a fixed interval measured in
> jiffies is safer.
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>

Acked-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxx>

> ---
> kernel/hung_task.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index cb8e3e8..444b8b5 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -34,7 +34,7 @@
>   * is disabled during the critical section. It also controls the size of
>   * the RCU grace period. So it needs to be upper-bound.
>   */
> -#define HUNG_TASK_BATCHING 1024
> +#define HUNG_TASK_LOCK_BREAK (HZ / 10)
>  
>  /*
>   * Zero means infinite timeout - no checking done:
> @@ -173,7 +173,7 @@ static bool rcu_lock_break(struct task_struct *g, struct task_struct *t)
>  static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  {
>  	int max_count = sysctl_hung_task_check_count;
> -	int batch_count = HUNG_TASK_BATCHING;
> +	unsigned long last_break = jiffies;
>  	struct task_struct *g, *t;
>  
>  	/*
> @@ -188,10 +188,10 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  	for_each_process_thread(g, t) {
>  		if (!max_count--)
>  			goto unlock;
> -		if (!--batch_count) {
> -			batch_count = HUNG_TASK_BATCHING;
> +		if (time_after(jiffies, last_break + HUNG_TASK_LOCK_BREAK)) {
>  			if (!rcu_lock_break(g, t))
>  				goto unlock;
> +			last_break = jiffies;
>  		}
>  		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
>  		if (t->state == TASK_UNINTERRUPTIBLE)
> --
> 1.8.3.1
>
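
For anyone wanting to experiment with the idea outside the kernel, the
time-based lock break in the patch can be modelled roughly as below.
This is only a userspace sketch, not the kernel code: SKETCH_HZ,
SKETCH_LOCK_BREAK, sketch_time_after() and sketch_count_breaks() are
hypothetical names, the tick counter is simulated rather than read from
jiffies, and the per-item cost array is an assumption standing in for
"check_hung_task() was slow because it called printk()". The comparison
follows the usual wraparound-safe pattern of the kernel's time_after().

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's HZ; 250 is only an example. */
#define SKETCH_HZ 250UL
/* Break interval in ticks, mirroring (HZ / 10) from the patch. */
#define SKETCH_LOCK_BREAK (SKETCH_HZ / 10)

/*
 * Wraparound-safe "a is after b", modelled on the kernel's time_after().
 * The unsigned subtraction wraps, and the signed cast turns "a is ahead
 * of b by less than half the counter range" into a negative value.
 */
static bool sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Walk nr_items items, where handling item i advances the simulated
 * tick counter by cost[i]. Returns how many times the (imaginary) lock
 * would have been dropped and retaken.
 */
static unsigned int sketch_count_breaks(const unsigned long *cost,
					unsigned int nr_items,
					unsigned long start)
{
	unsigned long now = start;	/* simulated jiffies */
	unsigned long last_break = now;
	unsigned int breaks = 0;
	unsigned int i;

	for (i = 0; i < nr_items; i++) {
		/* Break only once the interval has elapsed, however many items that took. */
		if (sketch_time_after(now, last_break + SKETCH_LOCK_BREAK)) {
			breaks++;	/* where rcu_lock_break() would go */
			last_break = now;
		}
		now += cost[i];		/* the per-item work */
	}
	return breaks;
}
```

With this shape, a run of cheap items never triggers a break, while a
run of expensive items triggers one as soon as the interval has passed,
which is the property the fixed count of 1024 could not provide.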