RE: rcu_preempt caused oom

From: He, Bo
Date: Mon Dec 17 2018 - 22:13:08 EST


I checked with Jun.
The scenario looks more like this:
@@@ rcu_start_this_gp @@@ starts after ___swait_event but before schedule:
rcu_gp_kthread--> swait_event_idle_exclusive--> __swait_event_idle--> ___swait_event--------->schedule
@@@ rcu_gp_kthread_wake skips the wakeup when it is called in rcu_gp_kthread context

rcu_gp_kthread then goes to sleep and is never woken up.

Jun's patch can work around it. What do you think?
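
The sequence above can be replayed in a toy, single-threaded model. This is only a sketch under stated assumptions: `struct gp_model`, `enum model_state`, and `replay_sequence()` are hypothetical names invented for illustration, not kernel APIs, and the model ignores locking and the real scheduler entirely. It shows only the ordering: the kthread marks itself idle and sees no request, a grace period is then requested before schedule(), and the outcome depends on whether the wakeup was skipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the task state of the grace-period kthread. */
enum model_state { MODEL_RUNNING, MODEL_IDLE };

struct gp_model {
	enum model_state kthread_state; /* modeled state of rcu_gp_kthread */
	bool gp_requested;              /* stands in for a nonzero rsp->gp_flags */
};

/*
 * Replay the three steps of the scenario.  Returns true if the modeled
 * kthread is left sleeping although a grace period has been requested.
 */
static bool replay_sequence(bool wakeup_skipped)
{
	struct gp_model m = { MODEL_RUNNING, false };
	bool cond_seen;

	/* Step 1: ___swait_event marks the task idle and checks the condition. */
	m.kthread_state = MODEL_IDLE;
	cond_seen = m.gp_requested;

	/* Step 2: the request arrives before schedule(); a wakeup may follow. */
	m.gp_requested = true;
	if (!wakeup_skipped)
		m.kthread_state = MODEL_RUNNING;

	/* Invariant of this model: only a wakeup makes the task runnable again. */
	assert(wakeup_skipped ? m.kthread_state == MODEL_IDLE
			      : m.kthread_state == MODEL_RUNNING);

	/* Step 3: schedule() puts the task to sleep if it is still idle. */
	return !cond_seen && m.gp_requested && m.kthread_state == MODEL_IDLE;
}
```

In this model, replay_sequence(true) reports the stuck state and replay_sequence(false) does not, which matches the observation that the kthread sleeps with a request pending only when the wakeup is skipped.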


-----Original Message-----
From: Zhang, Jun
Sent: Tuesday, December 18, 2018 10:47 AM
To: He, Bo <bo.he@xxxxxxxxx>; paulmck@xxxxxxxxxxxxx
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; josh@xxxxxxxxxxxxxxxx; mathieu.desnoyers@xxxxxxxxxxxx; jiangshanlai@xxxxxxxxx; Xiao, Jin <jin.xiao@xxxxxxxxx>; Zhang, Yanmin <yanmin.zhang@xxxxxxxxx>; Bai, Jie A <jie.a.bai@xxxxxxxxx>; Sun, Yi J <yi.j.sun@xxxxxxxxx>; Chang, Junxiao <junxiao.chang@xxxxxxxxx>; Mei, Paul <paul.mei@xxxxxxxxx>
Subject: RE: rcu_preempt caused oom

Hello, Paul

In softirq context, when current is rcu_preempt-10, rcu_gp_kthread_wake does not wake up rcu_preempt.
Maybe the following patch could fix it. Please help review.

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0b760c1..98f5b40 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1697,7 +1697,7 @@ static bool rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
  */
 static void rcu_gp_kthread_wake(struct rcu_state *rsp)
 {
-	if (current == rsp->gp_kthread ||
+	if (((current == rsp->gp_kthread) && !in_softirq()) ||
 	    !READ_ONCE(rsp->gp_flags) ||
 	    !rsp->gp_kthread)
 		return;
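
To see which cases the one-line change affects, the early-return condition before and after the patch can be written as two pure predicates. This is a user-space sketch only: `struct wake_ctx` and the function names are hypothetical, and each field merely stands in for the kernel expression named in its comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the inputs to the early-return test. */
struct wake_ctx {
	bool current_is_gp_kthread; /* current == rsp->gp_kthread */
	bool in_softirq;            /* in_softirq() */
	bool gp_flags_set;          /* READ_ONCE(rsp->gp_flags) is nonzero */
	bool gp_kthread_exists;     /* rsp->gp_kthread is non-NULL */
};

/* Early-return condition before the patch. */
static bool skips_wake_before(const struct wake_ctx *c)
{
	return c->current_is_gp_kthread ||
	       !c->gp_flags_set || !c->gp_kthread_exists;
}

/* Early-return condition with the proposed patch applied. */
static bool skips_wake_after(const struct wake_ctx *c)
{
	return (c->current_is_gp_kthread && !c->in_softirq) ||
	       !c->gp_flags_set || !c->gp_kthread_exists;
}

/* Count the input combinations for which the patch changes the decision. */
static int count_behaviour_changes(void)
{
	int changed = 0;

	for (unsigned int bits = 0; bits < 16; bits++) {
		struct wake_ctx c = {
			.current_is_gp_kthread = bits & 1,
			.in_softirq = bits & 2,
			.gp_flags_set = bits & 4,
			.gp_kthread_exists = bits & 8,
		};

		/* The patch only removes skips; it never adds one. */
		assert(!skips_wake_after(&c) || skips_wake_before(&c));
		if (skips_wake_before(&c) != skips_wake_after(&c))
			changed++;
	}
	return changed;
}
```

Under this model the two predicates differ for exactly one of the sixteen combinations: current is the grace-period kthread, the code runs in softirq context, gp_flags is set, and the kthread exists. That is the case reported above.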

[44932.311439, 0][ rcu_preempt] rcu_preempt-10 [001] .n.. 44929.401037: rcu_grace_period: rcu_preempt 19063548 reqwait
......
[44932.311517, 0][ rcu_preempt] rcu_preempt-10 [001] d.s2 44929.402234: rcu_future_grace_period: rcu_preempt 19063548 19063552 0 0 3 Startleaf
[44932.311536, 0][ rcu_preempt] rcu_preempt-10 [001] d.s2 44929.402237: rcu_future_grace_period: rcu_preempt 19063548 19063552 0 0 3 Startedroot


-----Original Message-----
From: He, Bo
Sent: Tuesday, December 18, 2018 07:16
To: paulmck@xxxxxxxxxxxxx
Cc: Zhang, Jun <jun.zhang@xxxxxxxxx>; Steven Rostedt <rostedt@xxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; josh@xxxxxxxxxxxxxxxx; mathieu.desnoyers@xxxxxxxxxxxx; jiangshanlai@xxxxxxxxx; Xiao, Jin <jin.xiao@xxxxxxxxx>; Zhang, Yanmin <yanmin.zhang@xxxxxxxxx>; Bai, Jie A <jie.a.bai@xxxxxxxxx>; Sun, Yi J <yi.j.sun@xxxxxxxxx>; Chang, Junxiao <junxiao.chang@xxxxxxxxx>; Mei, Paul <paul.mei@xxxxxxxxx>
Subject: RE: rcu_preempt caused oom

Thanks for your comments. With the change, the issue now triggers the panic at if (ret == 1). The logs are enclosed.

-----Original Message-----
From: Paul E. McKenney <paulmck@xxxxxxxxxxxxx>
Sent: Monday, December 17, 2018 12:26 PM
To: He, Bo <bo.he@xxxxxxxxx>
Cc: Zhang, Jun <jun.zhang@xxxxxxxxx>; Steven Rostedt <rostedt@xxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; josh@xxxxxxxxxxxxxxxx; mathieu.desnoyers@xxxxxxxxxxxx; jiangshanlai@xxxxxxxxx; Xiao, Jin <jin.xiao@xxxxxxxxx>; Zhang, Yanmin <yanmin.zhang@xxxxxxxxx>; Bai, Jie A <jie.a.bai@xxxxxxxxx>; Sun, Yi J <yi.j.sun@xxxxxxxxx>; Chang, Junxiao <junxiao.chang@xxxxxxxxx>; Mei, Paul <paul.mei@xxxxxxxxx>
Subject: Re: rcu_preempt caused oom

On Mon, Dec 17, 2018 at 03:15:42AM +0000, He, Bo wrote:
> To double-check that the issue does not reproduce after 90 hours, we applied only the enclosed patch to the build that reproduces easily; the issue did not reproduce after 63 hours over the whole weekend on 16 boards.
> So the current conclusion is that the debug patch has an extreme effect on the RCU issue.

This is not a surprise. (Please see the end of this email for a replacement patch that won't suppress the bug.)

To see why this is not a surprise, let's take a closer look at your patch, in light of the comment header for wait_event_idle_timeout_exclusive():

* Returns:
* 0 if the @condition evaluated to %false after the @timeout elapsed,
* 1 if the @condition evaluated to %true after the @timeout elapsed,
* or the remaining jiffies (at least 1) if the @condition evaluated
* to %true before the @timeout elapsed.
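
As a rough illustration of that contract, the three documented outcomes can be modeled by a small helper. This is a sketch under stated assumptions: `model_wait_result()` is a hypothetical function, not the kernel macro, and it takes the state at the moment the wait ends (the final value of the condition and the jiffies left) as plain arguments.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the documented return value of a timed wait.
 * 'condition' is the final value of the wait condition and
 * 'remaining' is the number of jiffies left when the wait ended.
 */
static long model_wait_result(bool condition, long remaining)
{
	/* A wait only ends once the condition holds or the timeout is used up. */
	assert(remaining >= 0);
	assert(condition || remaining == 0);

	if (condition && remaining == 0)
		return 1;         /* condition true, but only after the timeout */
	return remaining;         /* 0 on a plain timeout, else jiffies left */
}
```

Note that a result of 1 covers both "true after the timeout elapsed" and "true with exactly one jiffy left", since the remaining-jiffies case is documented as at least 1.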

The situation we are seeing is that the RCU_GP_FLAG_INIT is set, but the rcu_preempt task does not wake up. This would correspond to the second case above, that is, a return value of 1. Looking now at your patch, with comments interspersed below:

------------------------------------------------------------------------