Re: [patch] 2.4.4 alpha semaphores optimization

From: Andrea Arcangeli (andrea@suse.de)
Date: Thu May 03 2001 - 12:28:48 EST


On Thu, May 03, 2001 at 07:47:47PM +0400, Ivan Kokshaysky wrote:
> Initially I tried to use __builtin_expect in the rwsem.h, but found
> that it doesn't help at all in the small inline functions - it works
> as expected only in a reasonably large block of code. Converting these
> functions into the macros won't help, as callers are inline
> functions also.
> On the other hand, gcc 3.0 generates quite a good code for
> conditional branches (comparisons like value < 0, value == 0
> predicted as false etc.). In the cases where expected value is 0,
> we can use cmpeq instruction.
> Other changes:
> - added atomic_add_return_prev() for __down_write()
> - removed some mb's for non-SMP
> - removed non-inline up()/down_xx() when semaphore/waitqueue debugging
> isn't enabled.

I'd love if you could port it on top of this one and to fix it so that
it can handle up to 2^32 sleepers and not only 2^16 like we have to do
on the 32bit archs to get good performance:

        ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.4aa3/00_rwsem-11

I just wrote the prototype, it only needs to be implemented see
linux/include/asm-alpha/rwsem_xchgadd.h:

--------------------------------------------------------------
#ifndef _ALPHA_RWSEM_XCHGADD_H
#define _ALPHA_RWSEM_XCHGADD_H

/* WRITEME */

static inline void __down_read(struct rw_semaphore *sem)
{
}

static inline void __down_write(struct rw_semaphore *sem)
{
}

static inline void __up_read(struct rw_semaphore *sem)
{
}

static inline void __up_write(struct rw_semaphore *sem)
{
}

static inline long rwsem_xchgadd(long value, long * count)
{
        return value;
}

#endif
--------------------------------------------------------------

You only need to fill the above 5 inlined fast paths to make it working
and that's the only thing in the whole alpha tree about the rwsem.

The above patch also provides the fastest write fast path for x86 archs
and the fastest rwsem spinlock based. I didn't yet re-benchmarked the
whole thing yet but still my up_write definitely has to be faster than
the one in 2.4.4 vanilla and the other fast paths have to be the same
speed.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon May 07 2001 - 21:00:17 EST