Re: 2.0, 2.2, 2.4, 2.5: fsync buffer race

From: Mikulas Patocka (mikulas@artax.karlin.mff.cuni.cz)
Date: Wed Feb 05 2003 - 10:13:19 EST


> Hi!
>
> > > there's a race condition in filesystem
> > >
> > > let's have two inodes that are placed in the same buffer.
> > >
> > > call fsync on inode 1
> > > it goes down to ext2_update_inode [update == 1]
> > > it calls ll_rw_block at the end
> > > ll_rw_block starts to write buffer
> > > ext2_update_inode waits on buffer
> > >
> > > while the buffer is writing, another process calls fsync on inode 2
> > > it goes again to ext2_update_inode
> > > it calls ll_rw_block
> > > ll_rw_block sees buffer locked and exits immediately
> > > ext2_update_inode waits for buffer
> > > the first write finishes, ext2_update_inode exits and changes made by
> > > second process to inode 2 ARE NOT WRITTEN TO DISK.
> > >
> >
> > hmm, yes. This is a general weakness in the ll_rw_block() interface. It is
> > not suitable for data-integrity writeouts, as you've pointed out.
> >
> > A suitable fix would be to create a new
> >
> > void wait_and_rw_block(...)
> > {
> > wait_on_buffer(bh);
> > ll_rw_block(...);
> > }
> >
> > and go use that in all the appropriate places.
> >
> > I shall make that change for 2.5, thanks.
>
> Should this be fixed at least in 2.4, too? It seems pretty serious for
> mail servers (etc)...
> Pavel

It should, but it is risky. The problem is that every use of
ll_rw_block has this bug, not only the one in ext2 fsync. The cleanest
fix would be to make ll_rw_block itself wait until the buffer becomes
unlocked, but no one knows whether that could produce weird behavior
elsewhere.
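
To make the lost write concrete, here is a minimal userspace sketch of
the sequence described above. It is only a model, not kernel code: the
struct, the model_* functions and run_scenario are names invented for
illustration, the "disk" is an array, and I/O completion is simulated
inside the wait function. It assumes the write carries a snapshot of the
buffer taken when the write was submitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace model only: none of these names are real kernel APIs. */
#define NR_INODES 2

struct model_buffer {
    int mem[NR_INODES];      /* in-core inode versions sharing one block */
    int inflight[NR_INODES]; /* snapshot taken when the write was submitted */
    int disk[NR_INODES];     /* what has actually reached the disk */
    bool locked;             /* a write is in progress */
    bool dirty;              /* in-core data differs from submitted data */
};

/* Models the old ll_rw_block(WRITE): gives up if the buffer is locked. */
static void model_ll_rw_block(struct model_buffer *b)
{
    if (b->locked)
        return;              /* the early exit that loses the write */
    if (!b->dirty)
        return;              /* nothing to write */
    b->locked = true;
    b->dirty = false;
    memcpy(b->inflight, b->mem, sizeof b->inflight);
}

/* Models I/O completion: the snapshot lands on disk, buffer unlocks. */
static void model_io_complete(struct model_buffer *b)
{
    assert(b->locked);
    memcpy(b->disk, b->inflight, sizeof b->disk);
    b->locked = false;
}

/* Models wait_on_buffer(): returns once any write in flight is done. */
static void model_wait_on_buffer(struct model_buffer *b)
{
    if (b->locked)
        model_io_complete(b);
}

static void model_update_inode(struct model_buffer *b, int ino, int version)
{
    b->mem[ino] = version;
    b->dirty = true;
}

/* Returns the on-disk version of inode 2 after both fsyncs returned. */
static int run_scenario(bool wait_first)
{
    struct model_buffer b = {0};

    /* Process 1: fsync on inode 1 starts the write. */
    model_update_inode(&b, 0, 1);
    model_ll_rw_block(&b);

    /* Process 2: fsync on inode 2 while that write is in flight. */
    model_update_inode(&b, 1, 1);
    if (wait_first)
        model_wait_on_buffer(&b); /* the proposed wait-then-write order */
    model_ll_rw_block(&b);        /* skipped if the buffer is still locked */

    /* Both callers wait on the buffer and return to userspace. */
    model_wait_on_buffer(&b);
    return b.disk[1];
}
```

In this model run_scenario(false) returns 0: inode 2 is still at its old
version on disk although its fsync has returned. run_scenario(true)
returns 1, because waiting first lets the second write be submitted.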

Even Linus didn't know what he was doing; see this comment around the
buggy part in 2.2, 2.0 and earlier kernels:

ll_rw_blk.c:
        /* Uhhuh.. Nasty dead-lock possible here.. */
        if (buffer_locked(bh))
                return;
        /* Maybe the above fixes it, and maybe it doesn't boot. Life is
           interesting */
        lock_buffer(bh);

Mikulas
