[PATCH 1/2] shmem: drop lru_add_drain_all from shmem_wait_for_pins

From: Michal Hocko
Date: Thu Nov 02 2017 - 05:37:00 EST


From: Michal Hocko <mhocko@xxxxxxxx>

syzkaller has reported the following lockdep splat
======================================================
WARNING: possible circular locking dependency detected
4.13.0-next-20170911+ #19 Not tainted
------------------------------------------------------
syz-executor5/6914 is trying to acquire lock:
(cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff818c1b3e>] get_online_cpus include/linux/cpu.h:126 [inline]
(cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff818c1b3e>] lru_add_drain_all+0xe/0x20 mm/swap.c:729

but task is already holding lock:
(&sb->s_type->i_mutex_key#9){++++}, at: [<ffffffff818fbef7>] inode_lock include/linux/fs.h:712 [inline]
(&sb->s_type->i_mutex_key#9){++++}, at: [<ffffffff818fbef7>] shmem_add_seals+0x197/0x1060 mm/shmem.c:2768

more details [1] and dependencies explained [2]. The problem seems to be
the usage of lru_add_drain_all from shmem_wait_for_pins. While the lock
dependency is subtle as hell and we might want to make lru_add_drain_all
less dependent on the hotplug locks the usage of lru_add_drain_all seems
dubious here. The whole function cares only about radix tree tags, page
count and page mapcount. None of those are touched from the draining
context. So it doesn't make much sense to drain pcp caches. Moreover
this looks like a wrong thing to do because it basically induces
unpredictable latency to the call because draining is not for free
(especially on larger machines with many cpus).

Let's simply drop the call to lru_add_drain_all to address both issues.

[1] http://lkml.kernel.org/r/089e0825eec8955c1f055c83d476@xxxxxxxxxx
[2] http://lkml.kernel.org/r/http://lkml.kernel.org/r/20171030151009.ip4k7nwan7muouca@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Cc: David Herrmann <dh.herrmann@xxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
---
mm/shmem.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index d6947d21f66c..e784f311d4ed 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2668,9 +2668,7 @@ static int shmem_wait_for_pins(struct address_space *mapping)
if (!radix_tree_tagged(&mapping->page_tree, SHMEM_TAG_PINNED))
break;

- if (!scan)
- lru_add_drain_all();
- else if (schedule_timeout_killable((HZ << scan) / 200))
+ if (scan && schedule_timeout_killable((HZ << scan) / 200))
scan = LAST_SCAN;

start = 0;
--
2.14.2