Re: [PATCH 1/2] brw_mutex: big read-write mutex

From: Peter Zijlstra
Date: Fri Oct 19 2012 - 13:41:28 EST

On Fri, 2012-10-19 at 11:32 -0400, Mikulas Patocka wrote:

> So if you can do an alternative implementation without RCU, show it.

Uhm... no, that's not how it works. You just don't push through crap like
this and then demand that someone else do it better.

But using preempt_{disable,enable} and synchronize_sched() would
be better (for PREEMPT_RCU), although it wouldn't fix anything.
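That preempt-based variant can be sketched roughly like so. This is a
minimal sketch of the suggestion, not the posted patch: the `brw_mutex`
type and `read_count` field are assumed names, and synchronize_sched()
is the kernel API of that era.

```c
/* Read side: a preempt-off region instead of an RCU read-side one.
 * The per-cpu increment is cheap and uses no LOCKed instruction. */
static inline void brw_read_lock(struct brw_mutex *brw)
{
	preempt_disable();
	__this_cpu_inc(*brw->read_count);	/* assumed per-cpu counter */
	preempt_enable();
}

/* Write side: after flagging itself, the writer waits for every CPU
 * to leave its preempt-off region, so all increments are visible. */
static void brw_write_lock_slow(struct brw_mutex *brw)
{
	/* ... set the writer-pending flag ... */
	synchronize_sched();
	/* now the per-cpu counters can be summed safely */
}
```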

> The
> goal is - there should be no LOCK instructions on the read path and as
> few barriers as possible.

Fine goal, although somewhat arch-specific. Also note that there's a
relation between atomics and memory barriers: one isn't necessarily
worse than the other, since both require synchronization of sorts.

> > So did you consider keeping the inc/dec on the same per-cpu variable?
> > Yes this adds a potential remote access to dec and requires you to use
> > atomics, but I would not be surprised if the inc/dec were mostly on the
> > same cpu most of the times -- which might be plenty fast for what you
> > want.
> Yes, I tried this approach - it involves doing LOCK instruction on read
> lock, remembering the cpu and doing another LOCK instruction on read
> unlock (which will hopefully be on the same CPU, so no cacheline bouncing
> happens in the common case). It was slower than the approach without any
> LOCK instructions (43.3 seconds for the implementation with
> per-cpu LOCKed access, 42.7 seconds for this implementation without atomic
> instruction; the benchmark involved doing 512-byte direct-io reads and
> writes on a ramdisk with 8 processes on an 8-core machine).

So why is that a problem? Surely that's already tons better than what
you've currently got. Also, uncontended LOCK is something all x86 vendors
keep optimizing; they'll have to if they want to keep adding CPUs.
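The scheme quoted above (LOCKed inc on the local counter, remember the
cpu, LOCKed dec on the remembered counter) can be sketched in userspace
with C11 atomics. All names here are hypothetical, and an array indexed
by a caller-supplied "cpu" stands in for real per-cpu data:

```c
#include <stdatomic.h>

#define NR_CPUS 8

/* one counter per "cpu"; zero-initialized */
static atomic_int reader_count[NR_CPUS];

/* LOCKed inc on the local counter; return the index so the unlock
 * can hit the same (hopefully still cache-hot) counter */
static int percpu_read_lock(int cpu)
{
	atomic_fetch_add(&reader_count[cpu], 1);
	return cpu;
}

/* LOCKed dec, possibly remote if the task migrated in between */
static void percpu_read_unlock(int cpu)
{
	atomic_fetch_sub(&reader_count[cpu], 1);
}

/* writer side: no readers remain once the counters sum to zero */
static int readers_active(void)
{
	int i, sum = 0;

	for (i = 0; i < NR_CPUS; i++)
		sum += atomic_load(&reader_count[i]);
	return sum;
}
```

In the common (no-migration) case both atomics hit the same cacheline,
which is why the cost stays close to the lockless variant.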

> > If you've got coherent per-cpu counts, you can better do the
> > waitqueue/wake condition for write_down.
> synchronize_rcu() is way slower than msleep(1) - so I don't see a reason
> why it should be complicated to avoid msleep(1).

It's not about speed; a polling write side is just fscking ugly. Also, if
you're already polling, that *_sync() is bloody pointless.
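The non-polling write side being asked for is, in kernel terms, a
wait_event()/wake_up() pair. A userspace analogue with a condition
variable (all names hypothetical; note this toy version takes a lock on
the read path, so it only illustrates the wake-up structure, not the
lockless fast path):

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  readers_gone = PTHREAD_COND_INITIALIZER;
static int active_readers;

static void read_lock(void)
{
	pthread_mutex_lock(&lock);
	active_readers++;
	pthread_mutex_unlock(&lock);
}

/* the last reader out wakes the writer -- no msleep() polling */
static void read_unlock(void)
{
	pthread_mutex_lock(&lock);
	if (--active_readers == 0)
		pthread_cond_signal(&readers_gone);
	pthread_mutex_unlock(&lock);
}

/* writer blocks until the reader count drops to zero */
static void write_lock_wait(void)
{
	pthread_mutex_lock(&lock);
	while (active_readers > 0)
		pthread_cond_wait(&readers_gone, &lock);
	pthread_mutex_unlock(&lock);
}
```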

> > It might also make sense to do away with the mutex, there's no point in
> > serializing the wakeups in the p->locked case of down_read.
> The advantage of a mutex is that it is already protected against
> starvation. If I replace the mutex with a wait queue and retry, there is
> no starvation protection.

Which starvation? Writer-writer ordering? What stops you from adding a
list there yourself? Also, writers had better be rare for this thing, so
who gives a crap?
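The "list" here is just FIFO queueing of writers. A minimal userspace
sketch of the idea using a ticket scheme (names hypothetical) shows why
queueing by arrival order rules out writer-writer starvation:

```c
#include <stdatomic.h>

/* FIFO writer queueing: take a ticket on arrival, proceed when the
 * "now serving" number reaches it.  Strict arrival order means no
 * writer can overtake, hence no starvation among writers. */
static atomic_uint next_ticket;
static atomic_uint now_serving;

static unsigned writer_enqueue(void)
{
	return atomic_fetch_add(&next_ticket, 1);
}

static int writer_may_proceed(unsigned ticket)
{
	return atomic_load(&now_serving) == ticket;
}

static void writer_release(void)
{
	atomic_fetch_add(&now_serving, 1);
}
```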

> > Furthermore,
> > p->locked seems a complete duplicate of the mutex state, so removing the
> > mutex also removes that duplication.
> We could replace if (p->locked) with if (mutex_is_locked(p->mtx))

Quite so..

You're also still lacking lockdep annotations...