Re: [PATCH 17/19] fs: buffer: Do not use unnecessary atomic operations when discarding buffers

From: Mel Gorman
Date: Tue May 13 2014 - 10:46:53 EST


On Tue, May 13, 2014 at 04:01:27PM +0200, Peter Zijlstra wrote:
> On Tue, May 13, 2014 at 01:50:07PM +0100, Mel Gorman wrote:
> > > Anyway, nothing wrong with this patch, however, you could, if you really
> > > wanted to push things, also include BH_Lock in that clear :-)
> >
> > That's a bold strategy Cotton.
>
> :-)
>
> > Untested patch on top
> >
> > ---8<---
> > diff --git a/fs/buffer.c b/fs/buffer.c
> > index e80012d..42fcb6d 100644
> > --- a/fs/buffer.c
> > +++ b/fs/buffer.c
> > @@ -1490,6 +1490,8 @@ static void discard_buffer(struct buffer_head * bh)
> > lock_buffer(bh);
> > clear_buffer_dirty(bh);
> > bh->b_bdev = NULL;
> > +
> > + smp_mb__before_clear_bit();
>
> Not needed.
>
> > b_state = bh->b_state;
> > for (;;) {
> > b_state_old = cmpxchg(&bh->b_state, b_state, (b_state & ~BUFFER_FLAGS_DISCARD));
> > @@ -1497,7 +1499,13 @@ static void discard_buffer(struct buffer_head * bh)
> > break;
> > b_state = b_state_old;
> > }
> > - unlock_buffer(bh);
> > +
> > + /*
> > + * BUFFER_FLAGS_DISCARD include BH_lock so it has been cleared so the
> > + * wake_up_bit is the last part of a unlock_buffer
> > + */
> > + smp_mb__after_clear_bit();
>
> Similarly superfluous.
>
> > + wake_up_bit(&bh->b_state, BH_Lock);
> > }
>
> The thing is that cmpxchg() guarantees full barrier semantics before and
> after the op, and since the loop guarantees at least one cmpxchg() call
> its all good.
>

Of course, thanks for pointing that out. I was only thinking of it in
terms of it being a clear_bit operation which was dumb.

> Now just to confuse everyone, you could have written the loop like:
>
> b_state = bh->b_state;
> for (;;) {
> b_state_new = b_state & ~BUFFER_FLAGS_DISCARD;
> if (b_state == b_state_new)
> break;
> b_state = cmpxchg(&bh->b_state, b_state, b_state_new);
> }
>
> Which is 'similar' but doesn't guarantee that cmpxchg() gets called.
> If you expect the initial value to match the new state, the above form
> is slightly faster, but the lack of barrier guarantees can still spoil
> the fun.

I do not really expect the initial value to match the new state. At the
very least I would expect BH_mapped to be routinely cleared during this
operation so I doubt it's worth the effort trying to deal with
conditional buffers.

--
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/