From: Oleg Nesterov
Date: Tue Oct 23 2012 - 15:22:11 EST

On 10/23, Oleg Nesterov wrote:
> Not really the comment, but the question...

Damn. And another question.

Mikulas, I am sorry for this (almost) off-topic noise. Let me repeat,
just in case, that I am not arguing against your patches.

So write_lock/write_unlock needs to call synchronize_sched() 3 times.
I am wondering if it makes any sense to try to make it a bit heavier
but faster.

What if we change the reader to use local_irq_disable/enable around
this_cpu_inc/dec (instead of rcu read lock)? I have to admit, I have
no idea how much slower cli/sti is than preempt_disable/enable.

Then the writer can use

	static void mb_ipi(void *arg)
	{
		smp_mb(); /* unneeded ? */
	}

	static void force_mb_on_each_cpu(void)
	{
		smp_call_function(mb_ipi, NULL, 1);
	}

to a) synchronise with local_irq_disable and b) insert the necessary mb's.
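
To make point a) concrete, here is a toy single-threaded userspace model,
not kernel code and not taken from any patch. All names in it (NR_CPUS,
struct model_cpu, reader_enter() and so on) are made up for illustration,
and the "locked" flag is my assumption about what the reader fast path
checks before it bumps the counter. The only thing it tries to show is
that an IPI cannot land between the "irq disable" and the "irq enable",
so once every IPI has been handled the writer knows each CPU either saw
the flag or has already done its increment.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

struct model_cpu {
	long counter;		/* models the per-CPU reader count */
	bool irqs_off;		/* models the local_irq_disable() state */
	bool ipi_pending;	/* IPI sent by the writer, not yet handled */
	bool saw_locked;	/* value of 'locked' sampled by the reader */
};

static struct model_cpu cpus[NR_CPUS];
static bool locked;		/* assumed writer flag, set in writer_begin() */

/* Reader, step 1: disable "interrupts" and sample the writer flag. */
static void reader_enter(int cpu)
{
	cpus[cpu].irqs_off = true;
	cpus[cpu].saw_locked = locked;
}

/*
 * Reader, step 2: take the fast path if no writer was seen, then
 * re-enable "interrupts". A pending IPI is handled only at this point.
 * Returns true if the fast path was taken.
 */
static bool reader_exit(int cpu)
{
	bool fast = !cpus[cpu].saw_locked;

	assert(cpus[cpu].irqs_off);	/* must follow reader_enter() */
	if (fast)
		cpus[cpu].counter++;	/* models this_cpu_inc() */
	cpus[cpu].irqs_off = false;
	cpus[cpu].ipi_pending = false;	/* mb_ipi() would run here */
	return fast;
}

/* Reader unlock: models this_cpu_dec() (irq toggling omitted). */
static void reader_unlock(int cpu)
{
	cpus[cpu].counter--;
}

/* Writer: set the flag, then "send an IPI" to every CPU. */
static void writer_begin(void)
{
	locked = true;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		/* CPUs with interrupts enabled handle the IPI at once. */
		cpus[cpu].ipi_pending = cpus[cpu].irqs_off;
}

/* Models smp_call_function(..., wait = 1) having returned. */
static bool writer_ipis_done(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpus[cpu].ipi_pending)
			return false;
	return true;
}

/* The writer may proceed only once this sum drops to zero. */
static long reader_count(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += cpus[cpu].counter;
	return sum;
}
```

If a reader on one CPU is between reader_enter() and reader_exit() when
writer_begin() runs, writer_ipis_done() stays false until that reader has
finished its increment; a reader that enters afterwards sees the flag and
does not take the fast path. The model says nothing about memory ordering
itself, that part still depends on the mb's.
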

Of course smp_call_function() means more work for each CPU, but
write_lock() should be rare...

This can also wake up idle CPUs, but probably we can do
on_each_cpu_cond(cond_func => !idle_cpu). Perhaps cond_func() can
also return false if rcu_user_enter() was called...

Actually I was thinking about this from the very beginning, but I am
not convinced it is a good idea. Still, I'd like to ask what you
think.


