Re: INFO: rcu_sched detected stalls on CPUs/tasks with `kswapd` and `mem_cgroup_shrink_node`

From: Donald Buczek
Date: Wed Nov 30 2016 - 06:44:06 EST


On 11/30/16 12:09, Michal Hocko wrote:
> [CCing Paul]
>
> On Wed 30-11-16 11:28:34, Donald Buczek wrote:
> [...]
>> shrink_active_list gets and releases the spinlock and calls cond_resched().
>> This should give other tasks a chance to run. Just as an experiment, I'm
>> trying
>>
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1921,7 +1921,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
>>  	spin_unlock_irq(&pgdat->lru_lock);
>>
>>  	while (!list_empty(&l_hold)) {
>> -		cond_resched();
>> +		cond_resched_rcu_qs();
>>  		page = lru_to_page(&l_hold);
>>  		list_del(&page->lru);
>>
>> and didn't hit a rcu_sched warning for >21 hours uptime now. We'll see.
>
> This is really interesting! Is it possible that the RCU stall detector
> is somehow confused?

Wait... 21 hours is not yet a test result.

>> Is preemption disabled for another reason?
>
> I do not think so. I will have to double check the code but this is a
> standard sleepable context. Just wondering what is the PREEMPT
> configuration here?

buczek@null:~$ zcat /proc/config.gz |grep PREE
CONFIG_PREEMPT_NOTIFIERS=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
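
For anyone who wants to compare against their own machine, a narrower
filter that prints only the selected preemption model could look like the
sketch below. The function name preempt_model is made up for illustration,
and the sketch assumes the model is one of the three options listed above
(CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_VOLUNTARY, CONFIG_PREEMPT).

```shell
# Hypothetical helper: read kernel config text on stdin and print only the
# line that selects the preemption model. Assumes the three model options
# shown in the output above; helper options such as
# CONFIG_PREEMPT_NOTIFIERS are deliberately not matched.
preempt_model() {
    grep -E '^CONFIG_PREEMPT(_NONE|_VOLUNTARY)?=y$'
}

# Usage on a running system (assumes /proc/config.gz is available):
#   zcat /proc/config.gz | preempt_model
```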

Thanks
Donald

--
Donald Buczek
buczek@xxxxxxxxxxxxx
Tel: +49 30 8413 1433