Re: [PATCH 1/2] brw_mutex: big read-write mutex

From: Peter Zijlstra
Date: Fri Oct 19 2012 - 08:38:23 EST

On Thu, 2012-10-18 at 15:28 -0400, Mikulas Patocka wrote:
> On Thu, 18 Oct 2012, Oleg Nesterov wrote:
> > Ooooh. And I just noticed include/linux/percpu-rwsem.h which does
> > something similar. Certainly it was not in my tree when I started
> > this patch... percpu_down_write() doesn't allow multiple writers,
> > but the main problem it uses msleep(1). It should not, I think.
> synchronize_rcu() can sleep for hundred milliseconds, so msleep(1) is not
> a big problem.

That code is beyond ugly though.. it should really not have been merged.

There's absolutely no reason for it to use RCU except to make it more
complicated. And as Oleg pointed out that msleep() is very ill

The very worst part of it seems to be that nobody who's usually involved
with locking primitives was ever consulted (Linus, PaulMck, Oleg, Ingo,
tglx, dhowells and me). It doesn't even have lockdep annotations :/

So the only reason you appear to use RCU is because you don't actually
have a sane way to wait for count==0. And I'm contesting rcu_sync() is
sane here -- for the very simple reason that you still need a while (count)
loop right after it.

So it appears you want a totally reader-biased, sleepable rw-lock like

So did you consider keeping the inc/dec on the same per-cpu variable?
Yes this adds a potential remote access to dec and requires you to use
atomics, but I would not be surprised if the inc/dec were mostly on the
same cpu most of the time -- which might be plenty fast for what you

If you've got coherent per-cpu counts, you can do a much better
waitqueue/wake condition for down_write.
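
To make the above concrete, here is a rough userspace sketch of what I mean,
not kernel code and not a proposed patch. It assumes pthreads and C11 atomics
as stand-ins for per-cpu variables and waitqueues; NSLOTS, struct brw_sketch
and all brw_*() names are made up for illustration. The point is only that
when the decrement hits the same counter as the increment, every counter
stays >= 0, so the writer can sleep on "sum == 0" instead of polling.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 4	/* hypothetical stand-in for the per-cpu slots */

struct brw_sketch {
	atomic_long count[NSLOTS];	/* readers that entered via each slot */
	atomic_bool writer;		/* a writer is pending or active */
	pthread_mutex_t lock;		/* only protects sleeping and waking */
	pthread_cond_t cond;		/* stand-in for a waitqueue */
};

static void brw_init(struct brw_sketch *s)
{
	for (int i = 0; i < NSLOTS; i++)
		atomic_init(&s->count[i], 0);
	atomic_init(&s->writer, false);
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->cond, NULL);
}

/* Sum of all slots; meaningful because no slot can ever go negative. */
static long brw_readers(struct brw_sketch *s)
{
	long sum = 0;

	for (int i = 0; i < NSLOTS; i++)
		sum += atomic_load(&s->count[i]);
	return sum;
}

/* Decrement the SAME slot that was incremented, possibly from another cpu. */
static void brw_up_read(struct brw_sketch *s, int slot)
{
	long old = atomic_fetch_sub(&s->count[slot], 1);

	assert(old > 0);
	if (atomic_load(&s->writer)) {	/* slow path: a writer may be waiting */
		pthread_mutex_lock(&s->lock);
		pthread_cond_broadcast(&s->cond);
		pthread_mutex_unlock(&s->lock);
	}
}

/* slot must be in 0..NSLOTS-1; it is returned so the caller can pass it
 * to brw_up_read() later. */
static int brw_down_read(struct brw_sketch *s, int slot)
{
	for (;;) {
		atomic_fetch_add(&s->count[slot], 1);
		if (!atomic_load(&s->writer))
			return slot;	/* fast path: no writer around */
		brw_up_read(s, slot);	/* back out, then sleep until it is gone */
		pthread_mutex_lock(&s->lock);
		while (atomic_load(&s->writer))
			pthread_cond_wait(&s->cond, &s->lock);
		pthread_mutex_unlock(&s->lock);
	}
}

static void brw_down_write(struct brw_sketch *s)
{
	pthread_mutex_lock(&s->lock);
	while (atomic_load(&s->writer))		/* one writer at a time */
		pthread_cond_wait(&s->cond, &s->lock);
	atomic_store(&s->writer, true);		/* stop new readers first */
	while (brw_readers(s) != 0)		/* then sleep until count == 0 */
		pthread_cond_wait(&s->cond, &s->lock);
	pthread_mutex_unlock(&s->lock);
}

static void brw_up_write(struct brw_sketch *s)
{
	pthread_mutex_lock(&s->lock);
	atomic_store(&s->writer, false);
	pthread_cond_broadcast(&s->cond);	/* wake readers and writers */
	pthread_mutex_unlock(&s->lock);
}
```

A reader would do slot = brw_down_read(&s, my_slot); ...;
brw_up_read(&s, slot); and a writer brackets its section with
brw_down_write()/brw_up_write(). The sketch leans on the default
sequentially consistent atomics for the reader/writer handshake; a real
implementation would want explicit barriers and a cheaper fast path.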

It might also make sense to do away with the mutex, there's no point in
serializing the wakeups in the p->locked case of down_read. Furthermore,
p->locked seems a complete duplicate of the mutex state, so removing the
mutex also removes that duplication.

Also, that CONFIG_x86 thing.. *shudder*...