Re: [PATCH][RFC]: mutex: adaptive spin

From: Chris Mason
Date: Tue Jan 06 2009 - 11:21:30 EST


On Tue, 2009-01-06 at 07:55 -0800, Linus Torvalds wrote:
>
> On Tue, 6 Jan 2009, Ingo Molnar wrote:
> >
> > The thing i like most about Peter's patch (compared to most other adaptive
> > spinning approaches i've seen, which all sucked as they included various
> > ugly heuristics complicating the whole thing) is that it solves the "how
> > long should we spin" question elegantly: we spin until the owner runs on a
> > CPU.
>
> The other way around, you mean: we spin until the owner is no longer
> holding a cpu.
>
> I agree that it's better than the normal "spin for some random time"
> model, but I can't say I like the "return 0" cases where it just retries
> the whole loop if the semaphore was gotten by somebody else instead.
> Sounds like an easyish live-lock to me.
>
> I also still strongly suspect that whatever lock actually needs this,
> should be seriously re-thought.
>
> But apart from the "return 0" craziness I at least dont' _hate_ this
> patch. Do we have numbers? Do we know which locks this matters on?

This discussion was kicked off by an unconditional spin (512 tries)
against mutex_trylock in the btrfs tree locking code.

Btrfs is using mutexes to protect the btree blocks, and btree searching
often hits hot nodes that are always in cache. For these nodes, the
spinning is much faster, but btrfs also needs to be able to sleep with
the locks held so it can read from the disk and do other complex
operations.

For btrfs, dbench 50 performance doubles with the unconditional spin,
mostly because that workload is almost all in ram.

For 50 procs creating 4k files in parallel, the spin is 30-50% faster.
This workload is a mixture of disk bound and CPU bound.

Yes, there is definitely some low hanging fruit to tune the btrfs btree
searches and locking. But, I think the adaptive model is a good fit for
on disk btrees in general.

-chris


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/