Re: [PATCH] 2.6.16 - futex: small optimization (?)

From: Eric Dumazet
Date: Wed Mar 29 2006 - 10:23:51 EST


Pierre PEIFFER wrote:
Ulrich Drepper wrote:

There are no such situations anymore in an optimal userlevel
implementation. The last problem (in pthread_cond_signal) was fixed
by the addition of FUTEX_WAKE_OP. The userlevel code you're looking
at is simply not optimized for the modern kernels.


I think there is a misunderstanding here.


Hmm... maybe Ulrich was replying to my own message (where I stated that most existing multithreaded applications pay the price of context switches).

(To Ulrich: most existing applications use glibc <= 2.3.6, where FUTEX_WAKE_OP is not used yet, AFAIK.)

I think your analysis is correct, Pierre, but you speak of 'task-switches' where only a spinlock is involved:

In the UP case: wake_up_all() won't preempt the current thread; the task switch happens only when the current thread leaves kernel mode.

In the PREEMPT case: wake_up_all() won't preempt the current thread either (because the current thread is holding bh->lock).

On SMP: the awakened thread will spin for some time on bh->lock, but will not task-switch again.

On an RT kernel, this might be different of course...

FUTEX_WAKE_OP is implemented to handle more than one futex at once in some specific situations (such as pthread_cond_signal).

The scenario I've described occurred in futex_wake, futex_wake_op and futex_requeue and is _independent_ of the userlevel code.

All these functions call wake_futex, and then wake_up_all, with the futex_hash_bucket lock still held.

If the woken thread is immediately scheduled (in wake_up_all), and only in this case (because of a higher priority, etc), it will try to take this lock too (because of the "if (lock_ptr != 0)" statement in unqueue_me), causing two task-switches to take this lock for nothing.

Otherwise, it will not (lock_ptr is set to NULL just after the wake_up_all call).
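
The ordering described above can be sketched as a small single-threaded model in plain C. This is only an illustration, not the kernel code: struct bucket, struct waiter, woken_side_contends() and simulate_wake() are hypothetical names, and the "immediately scheduled" case is modelled by simply running the woken side's check before lock_ptr is cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct bucket { bool locked; };             /* models the hash bucket lock */
struct waiter { struct bucket *lock_ptr; }; /* models the queued waiter    */

/*
 * Woken side, modelled on the "if (lock_ptr != 0)" test quoted above:
 * returns true if it sees a non-NULL lock_ptr while the waker still
 * holds the bucket lock, i.e. it would have to wait for that lock.
 */
static bool woken_side_contends(const struct waiter *q)
{
    assert(q != NULL);
    struct bucket *b = q->lock_ptr;
    return b != NULL && b->locked;
}

/*
 * Waker side. When woken_runs_immediately is true, the woken thread is
 * assumed to run from inside the wakeup, before lock_ptr is cleared.
 * Returns true if the woken side ended up contending on the lock.
 */
static bool simulate_wake(bool woken_runs_immediately)
{
    struct bucket b = { .locked = false };
    struct waiter q = { .lock_ptr = &b };
    bool contended = false;

    b.locked = true;                 /* bucket lock taken by the waker  */
    /* the wakeup (wake_up_all in the text) would happen here */
    if (woken_runs_immediately)
        contended = woken_side_contends(&q);
    q.lock_ptr = NULL;               /* cleared just after the wakeup   */
    b.locked = false;                /* bucket lock released            */

    if (!woken_runs_immediately)
        contended = woken_side_contends(&q);
    return contended;
}
```

In this model, simulate_wake(true) reports contention and simulate_wake(false) does not, which matches the two cases described in the text. It says nothing about how often the first case occurs on a real kernel.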

This scenario happens at least in pthread_cond_signal, pthread_cond_broadcast and probably all pthread_*_unlock functions.

The patch I've proposed should, at least in theory, solve this. But I'm not sure of the correctness...

