Re: INFO: rcu_sched detected stalls on CPUs/tasks with `kswapd` and `mem_cgroup_shrink_node`

From: Michal Hocko
Date: Wed Nov 30 2016 - 06:10:19 EST


[CCing Paul]

On Wed 30-11-16 11:28:34, Donald Buczek wrote:
[...]
> shrink_active_list gets and releases the spinlock and calls cond_resched().
> This should give other tasks a chance to run. Just as an experiment,
> I'm trying
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1921,7 +1921,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	spin_unlock_irq(&pgdat->lru_lock);
>
>  	while (!list_empty(&l_hold)) {
> -		cond_resched();
> +		cond_resched_rcu_qs();
>  		page = lru_to_page(&l_hold);
>  		list_del(&page->lru);
>
> and didn't hit a rcu_sched warning for >21 hours uptime now. We'll see.

This is really interesting! Is it possible that the RCU stall detector
is somehow confused?
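
One possible reading of the experiment, offered only as a sketch: on a
kernel without full preemption, cond_resched() switches tasks only when
a reschedule is pending, so a long loop on an otherwise idle CPU may
never pass through a context switch that RCU-sched can count as a
quiescent state. cond_resched_rcu_qs() reports one even when no switch
happens. The userspace toy model below illustrates that difference. All
toy_* names are hypothetical and are not kernel APIs, and the model
assumes that no other task ever wants the CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU state. Hypothetical model, not kernel code. */
struct toy_cpu {
	bool need_resched;      /* another task wants this CPU */
	unsigned long qs_count; /* quiescent states reported to the toy RCU */
};

/* Models cond_resched(): switch only if a reschedule is pending. */
static bool toy_cond_resched(struct toy_cpu *cpu)
{
	if (!cpu->need_resched)
		return false;
	cpu->need_resched = false;
	cpu->qs_count++; /* a context switch counts as a quiescent state */
	return true;
}

/* Models cond_resched_rcu_qs(): report a quiescent state even if
 * no context switch took place. */
static void toy_cond_resched_rcu_qs(struct toy_cpu *cpu)
{
	if (!toy_cond_resched(cpu))
		cpu->qs_count++;
}

/* Run `iters` loop iterations with no competing task (assumption)
 * and return how many quiescent states the toy RCU observed. */
static unsigned long toy_scan(unsigned long iters, bool use_rcu_qs)
{
	struct toy_cpu cpu = { .need_resched = false, .qs_count = 0 };

	for (unsigned long i = 0; i < iters; i++) {
		if (use_rcu_qs)
			toy_cond_resched_rcu_qs(&cpu);
		else
			toy_cond_resched(&cpu);
	}
	/* At most one quiescent state is reported per iteration. */
	assert(cpu.qs_count <= iters);
	return cpu.qs_count;
}
```

In this model toy_scan(1000, false) returns 0 and toy_scan(1000, true)
returns 1000. If the real loop behaves similarly, the stall detector
would not be confused at all: it would simply never see a quiescent
state from that CPU while the loop runs.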

> Is preemption disabled for another reason?

I do not think so. I will have to double-check the code, but this is a
standard sleepable context. What is the PREEMPT configuration here?
--
Michal Hocko
SUSE Labs