[PATCH] [51/275] NFS: Fix "kernel BUG at fs/aio.c:554!"

From: Andi Kleen
Date: Wed Mar 30 2011 - 18:01:48 EST


2.6.35-longterm review patch. If anyone has any objections, please let me know.

------------------
From: Chuck Lever <chuck.lever@xxxxxxxxxx>

commit 839f7ad6932d95f4d5ae7267b95c574714ff3d5b upstream.

Nick Piggin reports:

> I'm getting use after frees in aio code in NFS
>
> [ 2703.396766] Call Trace:
> [ 2703.396858] [<ffffffff8100b057>] ? native_sched_clock+0x27/0x80
> [ 2703.396959] [<ffffffff8108509e>] ? put_lock_stats+0xe/0x40
> [ 2703.397058] [<ffffffff81088348>] ? lock_release_holdtime+0xa8/0x140
> [ 2703.397159] [<ffffffff8108a2a5>] lock_acquire+0x95/0x1b0
> [ 2703.397260] [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> [ 2703.397361] [<ffffffff81039701>] ? get_parent_ip+0x11/0x50
> [ 2703.397464] [<ffffffff81612a31>] _raw_spin_lock_irq+0x41/0x80
> [ 2703.397564] [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> [ 2703.397662] [<ffffffff811627db>] aio_put_req+0x2b/0x60
> [ 2703.397761] [<ffffffff811647fe>] do_io_submit+0x2be/0x7c0
> [ 2703.397895] [<ffffffff81164d0b>] sys_io_submit+0xb/0x10
> [ 2703.397995] [<ffffffff8100307b>] system_call_fastpath+0x16/0x1b
>
> Adding some tracing, it is due to nfs completing the request then
> returning something other than -EIOCBQUEUED, so aio.c
> also completes the request.

To address this, prevent the NFS direct I/O engine from completing
async iocbs when the forward path returns an error without starting
any I/O.

This fix appears to survive ^C during both "xfstest no. 208" and "fsx
-Z."

It's likely this bug has existed for a very long while, as we are seeing
very similar symptoms in OEL 5. Copying stable.

Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>
Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>

---
fs/nfs/direct.c | 34 ++++++++++++++++++++--------------
1 file changed, 20 insertions(+), 14 deletions(-)

Index: linux-2.6.35.y/fs/nfs/direct.c
===================================================================
--- linux-2.6.35.y.orig/fs/nfs/direct.c 2011-03-29 22:51:49.201464226 -0700
+++ linux-2.6.35.y/fs/nfs/direct.c 2011-03-29 23:02:59.231319840 -0700
@@ -402,15 +402,18 @@
pos += vec->iov_len;
}

+ /*
+ * If no bytes were started, return the error, and let the
+ * generic layer handle the completion.
+ */
+ if (requested_bytes == 0) {
+ nfs_direct_req_release(dreq);
+ return result < 0 ? result : -EIO;
+ }
+
if (put_dreq(dreq))
nfs_direct_complete(dreq);
-
- if (requested_bytes != 0)
- return 0;
-
- if (result < 0)
- return result;
- return -EIO;
+ return 0;
}

static ssize_t nfs_direct_read(struct kiocb *iocb, const struct iovec *iov,
@@ -830,15 +833,18 @@
pos += vec->iov_len;
}

+ /*
+ * If no bytes were started, return the error, and let the
+ * generic layer handle the completion.
+ */
+ if (requested_bytes == 0) {
+ nfs_direct_req_release(dreq);
+ return result < 0 ? result : -EIO;
+ }
+
if (put_dreq(dreq))
nfs_direct_write_complete(dreq, dreq->inode);
-
- if (requested_bytes != 0)
- return 0;
-
- if (result < 0)
- return result;
- return -EIO;
+ return 0;
}

static ssize_t nfs_direct_write(struct kiocb *iocb, const struct iovec *iov,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/