[PATCH 2/4] fscache: Fix race in fscache_op_complete() due to split atomic_sub & read

From: David Howells
Date: Wed Oct 17 2018 - 10:23:43 EST


From: kiran.modukuri <kiran.modukuri@xxxxxxxxx>

The code in fscache_retrieval_complete is using atomic_sub followed by an
atomic_read:

atomic_sub(n_pages, &op->n_pages);
if (atomic_read(&op->n_pages) <= 0)
fscache_op_complete(&op->op, true);

This causes two threads doing a decrement of n_pages to race with each
other seeing the op->refcount 0 at same time - and they end up calling
fscache_op_complete() in both the threads leading to an assertion failure.

Fix this by using atomic_sub_return() instead of two calls.

The oops looks something like:

FS-Cache: Assertion failed
FS-Cache: 0 > 0 is false
...
kernel BUG at /usr/src/linux-4.4.0/fs/fscache/operation.c:449!
...
Workqueue: fscache_operation fscache_op_work_func [fscache]
...
RIP: 0010:[<ffffffffc037eacd>] fscache_op_complete+0x10d/0x180 [fscache]
...
Call Trace:
[<ffffffffc1464cf9>] cachefiles_read_copier+0x3a9/0x410 [cachefiles]
[<ffffffffc037e272>] fscache_op_work_func+0x22/0x50 [fscache]
[<ffffffff81096da0>] process_one_work+0x150/0x3f0
[<ffffffff8109751a>] worker_thread+0x11a/0x470
[<ffffffff81808e59>] ? __schedule+0x359/0x980
[<ffffffff81097400>] ? rescuer_thread+0x310/0x310
[<ffffffff8109cdd6>] kthread+0xd6/0xf0
[<ffffffff8109cd00>] ? kthread_park+0x60/0x60
[<ffffffff8180d0cf>] ret_from_fork+0x3f/0x70
[<ffffffff8109cd00>] ? kthread_park+0x60/0x60

This seen this in 4.4.x kernels and the same bug affects fscache in latest
upstreams kernels.

Fixes: 1bb4b7f98f36 ("FS-Cache: The retrieval remaining-pages counter needs to be atomic_t")
Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@xxxxxxxxx>
Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
---

include/linux/fscache-cache.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 34cf0fdd7dc7..bf98ed803af2 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -196,11 +196,11 @@ static inline void fscache_enqueue_retrieval(struct fscache_retrieval *op)
static inline void fscache_retrieval_complete(struct fscache_retrieval *op,
int n_pages)
{
- atomic_sub(n_pages, &op->n_pages);
- if (atomic_read(&op->n_pages) <= 0)
+ if (atomic_sub_return(n_pages, &op->n_pages) <= 0)
fscache_op_complete(&op->op, false);
}

+
/**
* fscache_put_retrieval - Drop a reference to a retrieval operation
* @op: The retrieval operation affected