Re: [RFC] NUMA futex hashing
From: Eric Dumazet
Date: Tue Aug 08 2006 - 13:06:04 EST
On Tuesday 08 August 2006 18:58, Ulrich Drepper wrote:
> On 8/8/06, Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:
> > So we really can... but for 'private futexes' which are the vast majority
> > of futexes needed by typical program (using POSIX pshared thread mutex
> > attribute PTHREAD_PROCESS_PRIVATE, currently not used by NPTL glibc)
> Nonsense. Mutexes are by default always private. They explicitly
> have to be marked as sharable. This happens using the
> pthread_mutexattr_setpshared function which takes
> PTHREAD_PROCESS_PRIVATE or PTHREAD_PROCESS_SHARED in the second
> parameter. So the former _is_ clearly used.
I was saying that the PTHREAD_PROCESS_PRIVATE / PTHREAD_PROCESS_SHARED
information is not passed to the kernel (because the futex API/implementation
doesn't need it). It was not an attack on glibc.
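
To illustrate the point, here is a minimal sketch, assuming Linux and a
pthread-capable libc. The helper names init_mutex_pshared() and futex_wake()
are hypothetical and only for illustration. The pshared setting is stored in
the userland attribute object, while the raw futex call carries only an
address, an operation and a value.

```c
#include <pthread.h>
#include <stddef.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: initialise a mutex with the given pshared setting
 * and return the setting read back from the attribute object, or -1 on
 * error. The value lives entirely in userland data structures. */
int init_mutex_pshared(pthread_mutex_t *m, int pshared)
{
    pthread_mutexattr_t attr;
    int seen = -1;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_setpshared(&attr, pshared) != 0 ||
        pthread_mutexattr_getpshared(&attr, &seen) != 0 ||
        pthread_mutex_init(m, &attr) != 0)
        seen = -1;
    pthread_mutexattr_destroy(&attr);
    return seen;
}

/* Hypothetical helper: raw futex wake. The arguments are the address of
 * the futex word, the operation and a count; none of them says whether
 * the futex is process-private or shared. Returns the number of waiters
 * woken, or -1 on error. */
long futex_wake(int *uaddr, int nr)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, nr, NULL, NULL, 0);
}
```

With this interface the kernel can only work out private/shared status from
the mapping that contains uaddr.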
> > Of course we would need a new syscall, and to change glibc to be able to
> > actually use this new private_futex syscall.
> No, why? The kernel already does recognize private mutexes. It just
> checks whether the pages used to store it are private or mapped. This
> requires some interaction with the memory subsystem but as long as no
> crashes happen the data can change underneath. It's the program's
> fault if it does.
But if you let the futex code do the VMA walk to check the private/shared
status, you still need to take mmap_sem.
Moreover, a program can mmap() a file (shared in terms of VMA) and still
use a PTHREAD_PROCESS_PRIVATE mutex lying in this shared zone
(Example: shmem or hugetlb mapping, whose API might always give a 'shared'
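
A minimal sketch of that situation, assuming Linux and using an anonymous
MAP_SHARED mapping as a stand-in for a shmem-backed one. The function name
private_mutex_in_shared_map() is hypothetical. The mutex is initialised with
default attributes, so it is process-private, yet the page holding it belongs
to a shared VMA.

```c
#define _DEFAULT_SOURCE
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: place a default (process-private) mutex inside a
 * MAP_SHARED mapping. Returns NULL on failure. A VMA-based check sees a
 * shared mapping here, although the mutex itself is private. */
pthread_mutex_t *private_mutex_in_shared_map(void)
{
    void *p = mmap(NULL, sizeof(pthread_mutex_t), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    pthread_mutex_t *m = p;
    /* NULL attributes: pshared defaults to PTHREAD_PROCESS_PRIVATE. */
    if (pthread_mutex_init(m, NULL) != 0) {
        munmap(p, sizeof(*m));
        return NULL;
    }
    return m;
}
```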
> On the waker side you would search the local futex hash table/tree
> first and if this doesn't yield a match, search the global table.
> Wakeup calls without any waiters are usually rare.
If the two searches touch two different cache lines in the hash table, we
might have a performance regression.
Of course we might choose a hash function so that the same slot is accessed.
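
One possible reading of that, as a rough sketch and not a proposal for the
actual kernel hash: derive the slot only from the offset of the futex word
inside its page, since that offset is the same whether the futex is keyed as
private or as shared. The names, the 4096-byte page size and the table size
below are all assumptions, and hashing so few bits would spread entries
poorly in practice.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u              /* assumed page size */
#define SKETCH_HASH_BITS 8u                 /* assumed table size: 256 slots */
#define SKETCH_HASH_SIZE (1u << SKETCH_HASH_BITS)

/* Hypothetical slot function: uses only the in-page offset of the futex
 * word, so a private lookup and a shared lookup for the same word compute
 * the same slot and touch the same part of the table. */
unsigned int sketch_futex_slot(uintptr_t uaddr)
{
    /* Futex words are 4-byte aligned, so drop the two low bits. */
    uint32_t off = (uint32_t)(uaddr & (SKETCH_PAGE_SIZE - 1)) >> 2;

    /* Multiplicative hash; keep the top SKETCH_HASH_BITS bits. */
    return (off * 2654435761u) >> (32 - SKETCH_HASH_BITS);
}
```

Two mappings of the same page at different virtual addresses give the same
slot, because only the in-page offset enters the hash.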