[PATCH 25/25] generic_file_buffered_write(): deadlock on vectored write

From: Chris Wright
Date: Tue Jun 27 2006 - 16:20:01 EST


-stable review patch. If anyone has any objections, please let us know.
------------------

From: Vladimir V. Saveliev <vs@xxxxxxxxxxx>

generic_file_buffered_write() prefaults in user pages in order to avoid
deadlock on copying from the same page as write goes to.

However, it looks like there is a problem when write is vectored:
fault_in_pages_readable brings in current segment or its part (maxlen).
OTOH, filemap_copy_from_user_iovec is called to copy number of bytes
(bytes) which may exceed current segment, so filemap_copy_from_user_iovec
switches to the next segment which is not brought in yet. Pagefault is
generated. That causes the deadlock if pagefault is for the same page
write goes to: page being written is locked and not uptodate, pagefault
will deadlock trying to lock locked page.

[akpm@xxxxxxxx: somewhat rewritten]
Cc: Neil Brown <neilb@xxxxxxx>
Cc: Martin Schwidefsky <schwidefsky@xxxxxxxxxx>
Cc: <stable@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
Signed-off-by: Chris Wright <chrisw@xxxxxxxxxxxx>
---

mm/filemap.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)

--- linux-2.6.17.1.orig/mm/filemap.c
+++ linux-2.6.17.1/mm/filemap.c
@@ -2004,14 +2004,21 @@ generic_file_buffered_write(struct kiocb
do {
unsigned long index;
unsigned long offset;
- unsigned long maxlen;
size_t copied;

offset = (pos & (PAGE_CACHE_SIZE -1)); /* Within page */
index = pos >> PAGE_CACHE_SHIFT;
bytes = PAGE_CACHE_SIZE - offset;
- if (bytes > count)
- bytes = count;
+
+ /* Limit the size of the copy to the caller's write size */
+ bytes = min(bytes, count);
+
+ /*
+ * Limit the size of the copy to that of the current segment,
+ * because fault_in_pages_readable() doesn't know how to walk
+ * segments.
+ */
+ bytes = min(bytes, cur_iov->iov_len - iov_base);

/*
* Bring in the user page that we will copy from _first_.
@@ -2019,10 +2026,7 @@ generic_file_buffered_write(struct kiocb
* same page as we're writing to, without it being marked
* up-to-date.
*/
- maxlen = cur_iov->iov_len - iov_base;
- if (maxlen > bytes)
- maxlen = bytes;
- fault_in_pages_readable(buf, maxlen);
+ fault_in_pages_readable(buf, bytes);

page = __grab_cache_page(mapping,index,&cached_page,&lru_pvec);
if (!page) {

--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/