Re: [RFC] futex: hashbucket as list of futex instead of waiters

From: Gerd Gerats
Date: Thu Sep 28 2017 - 18:40:22 EST


On Thu, Sep 28, 2017 at 12:58:14PM +0200, Peter Zijlstra wrote:
> On Sun, Sep 10, 2017 at 11:41:58PM +0200, Gerd Gerats wrote:
> > When using futex as a condition variable, for example: to manage a
> > threadpool, there may be a lot of threads inside the futex_wait to sleep on
> > this futex. The futex_hash_bucket consists therefore of many struct futex_q
> > for the same futex.
> >
> > On bad luck another futex, used as mutex, hashed into the same bucket.
> > Every futex_wake on this mutex, has to scan the whole chain of above waiter
> > to find the struct futex_q for this mutex. For non-unusual threadpool sizes
> > of more than 20, this should be a considerable effort.
> >
> > I therefore suggest to include in the hash-bucketchain only one struct
> > futex_q per futex and to queue additional waiter in an extrachain at the
> > 'top' futex_q entry. Thus different futex are isolated from each other, the
> > cost of a hash collision is reduced.
>
> So I don't dislike that idea.. however
>
> > To show the idea, I added a sample patch. Here, the plist is exchanged for
> > a futex-specific implementation. kernel/pring.h is certainly not not the
> > right place.
>
> So I suppose the purpose of that plist in futex is to enable waking up
> the highest prio waiter, but with the advent of SCHED_DEADLINE that no
> longer works.

I do not understand, plist is the original data structure. I assume, it works
together with SCHED_DEADLINE. Or is there something on the way not pressent in
4.13.

pring replaces this data structure, retaining prio sorting rules, by a
head-less ring. This ring contains all waiter for one futex_key, ordered the
same way as by plist.

Thanx, Gerd