Re: [PATCH] [PATCH] mm: disable preemption before swapcache_free

From: Andrew Morton
Date: Wed Jul 25 2018 - 17:16:47 EST

On Wed, 25 Jul 2018 14:37:58 +0800 "zhaowuyun@xxxxxxxxxxxx" <zhaowuyun@xxxxxxxxxxxx> wrote:

> From: zhaowuyun <zhaowuyun@xxxxxxxxxxxx>
> The issue is that there are two processes, A and B. A is kworker/u16:8
> at normal priority, B is AudioTrack at RT priority, and they are on the
> same CPU 3.
> Task A is preempted by task B in the moment
> after __delete_from_swap_cache(page) and before swapcache_free(swap).
> Task B runs __read_swap_cache_async in the do {} while loop. It
> will never find the page in swapper_space because the page was removed
> by task A, and it will never succeed in swapcache_prepare because
> that call fails with EEXIST for the entry.
> Task B is then stuck in the loop forever because it is an RT task
> and no one can preempt it.
> So we need to disable preemption until swapcache_free has executed.

Yes, right, sorry, I must have merged cbab0e4eec299 in my sleep.
cond_resched() is a no-op in the presence of realtime policy threads,
and using it in this fashion to attempt to yield to a different thread
is broken.
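
To make the failure mode concrete, here is a small single-threaded
userspace model of the retry loop described above. It is only a sketch:
the struct and every model_* name are hypothetical stand-ins, not the
kernel functions, and the loop is bounded by max_spins so that the model
terminates where the real loop would not.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one swap entry; not a kernel structure. */
struct swap_model {
	bool page_in_cache;   /* page is still present in the swap cache */
	bool entry_has_cache; /* entry is still marked as having a cached page */
};

/* Models the cache lookup: succeeds only while the page is present. */
static bool model_lookup(const struct swap_model *m)
{
	return m->page_in_cache;
}

/* Models swapcache_prepare(): fails with -EEXIST while the mark is set. */
static int model_swapcache_prepare(struct swap_model *m)
{
	if (m->entry_has_cache)
		return -EEXIST;
	m->entry_has_cache = true;
	return 0;
}

/* Task A, step 1: the page leaves the cache, the mark stays. */
static void model_delete_from_swap_cache(struct swap_model *m)
{
	m->page_in_cache = false;
}

/* Task A, step 2: the mark is cleared. */
static void model_swapcache_free(struct swap_model *m)
{
	m->entry_has_cache = false;
}

/*
 * Task B's retry loop. Returns the iteration on which it made progress,
 * or -1 if it was still stuck after max_spins iterations.
 */
static int model_reader_loop(struct swap_model *m, int max_spins)
{
	for (int i = 1; i <= max_spins; i++) {
		if (model_lookup(m))
			return i;
		if (model_swapcache_prepare(m) != -EEXIST)
			return i;
		/* The real loop calls cond_resched() here, which does not
		 * let task A run while B is an RT task on the same CPU. */
	}
	return -1;
}

/* Walks through the window between task A's two steps. */
static int model_demo(void)
{
	struct swap_model m = { .page_in_cache = true, .entry_has_cache = true };

	model_delete_from_swap_cache(&m);           /* A is preempted here */
	assert(model_reader_loop(&m, 1000) == -1);  /* B can only spin */

	model_swapcache_free(&m);                   /* A finally runs */
	assert(model_reader_loop(&m, 1000) == 1);   /* B proceeds at once */
	return 0;
}
```

In the model, task B makes no progress however many times it spins
until task A's second step has run, which is exactly the step an RT
task B never allows to happen on that CPU.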

Disabling preemption on the other side of the race should fix things,
but it's using a bandaid to plug the leakage from the earlier bandaid.
The proper way to coordinate threads is to use a sleeping lock, such
as a mutex, or some other wait/wakeup mechanism.
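
As a rough illustration of the wait/wakeup idea, here is a userspace
analogue using a pthread mutex and condition variable. This is a sketch
under stated assumptions, not a proposed kernel patch: struct
entry_state, deleter_thread(), wait_for_entry() and wait_wakeup_demo()
are all hypothetical names, and the kernel would use its own primitives
rather than pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state of one swap entry. */
struct entry_state {
	pthread_mutex_t lock;
	pthread_cond_t  released; /* signalled when teardown has finished */
	bool            busy;     /* true while the deleter owns the entry */
};

/* Deleter side: finish the teardown, then wake any waiter. */
static void *deleter_thread(void *arg)
{
	struct entry_state *s = arg;

	pthread_mutex_lock(&s->lock);
	s->busy = false;
	pthread_cond_broadcast(&s->released);
	pthread_mutex_unlock(&s->lock);
	return NULL;
}

/*
 * Reader side: sleep until the entry is released instead of spinning.
 * A sleeping waiter gives up the CPU, so the deleter can run even if
 * the waiter has a higher scheduling priority.
 */
static void wait_for_entry(struct entry_state *s)
{
	pthread_mutex_lock(&s->lock);
	while (s->busy)
		pthread_cond_wait(&s->released, &s->lock);
	s->busy = true; /* the reader now claims the entry */
	pthread_mutex_unlock(&s->lock);
}

/* Runs one deleter against one waiter; returns 0 on success. */
static int wait_wakeup_demo(void)
{
	struct entry_state s = {
		.lock     = PTHREAD_MUTEX_INITIALIZER,
		.released = PTHREAD_COND_INITIALIZER,
		.busy     = true,
	};
	pthread_t t;

	if (pthread_create(&t, NULL, deleter_thread, &s) != 0)
		return -1;
	wait_for_entry(&s);
	pthread_join(t, NULL);
	assert(s.busy); /* claimed by the reader after the wakeup */
	return 0;
}
```

Build with -pthread and call wait_wakeup_demo() from main(). The point
is only that the waiter blocks rather than retrying, so progress does
not depend on the scheduler choosing to run the other thread.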

And once that's done, we can hopefully eliminate the do loop from
__read_swap_cache_async(). That loop also services ENOMEM from
radix_tree_insert(), but __add_to_swap_cache() appears to handle that
OK, and we shouldn't just loop around retrying the insert. The
radix_tree_preload() should ensure that radix_tree_insert() never fails
anyway, unless we're calling __read_swap_cache_async() with screwy
gfp_flags from somewhere.