Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

From: Linus Torvalds
Date: Wed Nov 28 2012 - 18:14:03 EST


On Wed, Nov 28, 2012 at 2:52 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>> For example, __block_write_full_page and __block_write_begin do
>> if (!page_has_buffers(page)) { create_empty_buffers... }
>> and then they do
>> WARN_ON(bh->b_size != blocksize)
>> err = get_block(inode, block, bh, 1)
>
> Right. And none of this is new.

.. which, btw, is not to say that *other* things aren't new. They are.

The change to actually change the block device buffer size before then
calling "sync_bdev()" is definitely a real change, and as mentioned, I
have not tested the patch in any way. If any block device driver were
to actually compare the IO size they get against the bdev->block_size
thing, they'd see very different behavior (ie they'd see the new block
size as they are asked to write out the old blocks with the old block
size).
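
To make the ordering change concrete, here is a minimal userspace sketch,
not kernel code. Every name in it (model_bdev, model_buffer,
flush_then_resize, resize_then_flush) is hypothetical and made up for
illustration; it only models a driver that compares each IO size against
the device's current block size, which is the case described above.

```c
#include <assert.h>

/* Hypothetical stand-in for a block device; block_size models bdev->block_size. */
struct model_bdev {
	unsigned int block_size;
};

/* Hypothetical stand-in for a dirty buffer; size models bh->b_size. */
struct model_buffer {
	unsigned int size;
};

/* What an (assumed) picky driver would check on each write. */
static int io_matches_block_size(const struct model_bdev *bdev,
				 const struct model_buffer *bh)
{
	return bh->size == bdev->block_size;
}

/* Write out every dirty buffer and count the ones the driver would flag. */
static int flush_dirty(const struct model_bdev *bdev,
		       const struct model_buffer *dirty, int n)
{
	int mismatches = 0;

	for (int i = 0; i < n; i++)
		if (!io_matches_block_size(bdev, &dirty[i]))
			mismatches++;
	return mismatches;
}

/* Old ordering: sync first, then switch to the new block size. */
static int flush_then_resize(struct model_bdev *bdev,
			     const struct model_buffer *dirty, int n,
			     unsigned int new_size)
{
	assert(bdev && dirty);
	int mismatches = flush_dirty(bdev, dirty, n);

	bdev->block_size = new_size;
	return mismatches;
}

/* New ordering: switch the block size first, then sync the old buffers. */
static int resize_then_flush(struct model_bdev *bdev,
			     const struct model_buffer *dirty, int n,
			     unsigned int new_size)
{
	assert(bdev && dirty);
	bdev->block_size = new_size;
	return flush_dirty(bdev, dirty, n);
}
```

In this model, flushing buffers created at the old size flags nothing
under the old ordering, while under the new ordering every one of them
is flagged, because the driver already sees the new block size.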

So it does change semantics, no question about that. I don't think any
block device does it, though.

A bigger issue is for things that emulate what blkdev.c does, but
don't do the locking. I see code in md/bitmap.c that seems a bit
suspicious, for example. That said, it's not *new* breakage, and the
"lock at mmap/read/write() time" approach doesn't fix it either (since
the mapping will be different for the underlying MD device). So I do
think that we should take a look at all the users of
"alloc_page_buffers()" and "create_empty_buffers()" to see what *they*
do to protect the block-size, but I think that's an independent issue
from the raw device access case in fs/block_dev.c..

I guess I have to actually test my patch. I don't have very
interesting test-cases, though.

Linus