Re: [PATCH] SMP race in ext2 - metadata corruption.

From: Linus Torvalds (torvalds@transmeta.com)
Date: Thu Apr 26 2001 - 13:49:14 EST


On Thu, Apr 26, 2001 at 11:45:47AM -0400, Alexander Viro wrote:
>
> Ext2 does getblk+wait_on_buffer for new metadata blocks before
> filling them with zeroes. While that is enough for single-processor,
> on SMP we have the following race:
>
> getblk gives us unlocked, non-uptodate bh
> wait_on_buffer() does nothing
> read from device locks it and starts IO

I see the race, but I don't see how you can actually trigger it.

Exactly _who_ does the "read from device" part? Somebody doing a
"fsck" while the filesystem is mounted read-write and actively written
to? Yeah, you'd get disk corruption that way, but you'll get it regardless
of this bug.

There's nothing else that should be using that block at that stage. And if
there were, that would be a bug in itself, as far as I can tell. We've
just allocated it, and we're the only and exclusive owners of that block
on the disk. Anybody else who touches it is seriously broken.

Now, I don't disagree with your patch (it's just obviously cleaner to lock
it properly), but I don't think this is a real bug. I suspect that even
the wait-on-buffer is not strictly necessary: it's probably there to make
sure old write-backs have completed, but that doesn't really matter
either.

We used to have "breada()" do physical read-ahead that could have
triggered this, but we've long since gotten rid of that.

Or am I overlooking something?

                        Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Apr 30 2001 - 21:00:16 EST