Re: [patch 4/19] stack space reduction (remove MAX_BUF_PER_PAGE)

From: Andrew Morton (akpm@zip.com.au)
Date: Mon Jun 17 2002 - 04:35:37 EST


Andrew Morton wrote:
>
> ..
> + do {
> + if (buffer_async_read(bh))
> + submit_bh(READ, bh);
> + } while ((bh = bh->b_this_page) != head);

That's a bug. We cannot touch bh->b_this_page after submitting
the buffer because the I/O could complete synchronously and the
page can come unlocked and the buffers can be stripped by the VM.
The buffer at `bh' could be dead memory.

This can happen on SMP with PIO-mode IDE disks.

We cannot fix this with the usual

        do {
                next = bh->b_this_page;
                if (something)
                        submit_bh(bh);
        } while ((bh = next) != head);

approach because the final buffer on the page could be unlocked,
clean and uptodate.

Pinning one of the buffers while we walk the ring will keep
try_to_free_buffers() away. Here is an incremental patch.

--- 2.5.22/fs/ntfs/aops.c~ntfs-race-fix Mon Jun 17 01:50:32 2002
+++ 2.5.22-akpm/fs/ntfs/aops.c Mon Jun 17 02:24:00 2002
@@ -220,10 +220,12 @@ handle_zblock:
                 } while ((bh = bh->b_this_page) != head);
 
                 /* Finally, start i/o on the buffers. */
+ get_bh(head); /* Pin the buffers while we walk the ring */
                 do {
                         if (buffer_async_read(bh))
                                 submit_bh(READ, bh);
                 } while ((bh = bh->b_this_page) != head);
+ put_bh(head);
                 return 0;
         }
         /* No i/o was scheduled on any of the buffers. */
@@ -510,10 +512,12 @@ handle_zblock:
                 } while ((bh = bh->b_this_page) != head);
 
                 /* Finally, start i/o on the buffers. */
+ get_bh(head); /* Pin the buffers while we walk the ring */
                 do {
                         if (buffer_async_read(bh))
                                 submit_bh(READ, bh);
                 } while ((bh = bh->b_this_page) != head);
+ put_bh(head);
                 return 0;
         }
         /* No i/o was scheduled on any of the buffers. */
@@ -774,10 +778,12 @@ handle_zblock:
                 } while ((bh = bh->b_this_page) != head);
 
                 /* Finally, start i/o on the buffers. */
+ get_bh(head); /* Pin the buffers while we walk the ring */
                 do {
                         if (buffer_async_read(bh))
                                 submit_bh(READ, bh);
                 } while ((bh = bh->b_this_page) != head);
+ put_bh(head);
                 return 0;
         }
         /* No i/o was scheduled on any of the buffers. */
--- 2.5.22/fs/buffer.c~ntfs-race-fix Mon Jun 17 01:55:55 2002
+++ 2.5.22-akpm/fs/buffer.c Mon Jun 17 02:24:33 2002
@@ -2024,7 +2024,12 @@ int block_read_full_page(struct page *pa
          * Stage 3: start the IO. Check for uptodateness
          * inside the buffer lock in case another process reading
          * the underlying blockdev brought it uptodate (the sct fix).
+ *
+ * Bump the refcount of one buffer while walking the ring so
+ * that the VM cannot release the buffers while we're looking at
+ * ->b_this_page.
          */
+ get_bh(head);
         do {
                 if (buffer_async_read(bh)) {
                         if (buffer_uptodate(bh))
@@ -2033,6 +2038,7 @@ int block_read_full_page(struct page *pa
                                 submit_bh(READ, bh);
                 }
         } while ((bh = bh->b_this_page) != head);
+ put_bh(head);
         return 0;
 }
 

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Jun 23 2002 - 22:00:13 EST