Re: [PATCH v2 1/2] spinlock: New spinlock_refcount.h for lockless update of refcount
From: Linus Torvalds
Date: Sat Jun 29 2013 - 18:11:39 EST
On Sat, Jun 29, 2013 at 2:34 PM, Waiman Long <waiman.long@xxxxxx> wrote:
>
> I think I got it now. For architecture with transactional memory support to
> use an alternative implementation, we will need to use some kind of dynamic
> patching at kernel boot up time as not all CPUs in that architecture will
> have that support. In that case the helper functions have to be real
> functions and cannot be inlined. That means I need to put the implementation
> into a spinlock_refcount.c file with the header file contains structure
> definitions and function prototypes only. Is that what you are looking for?
Yes. Except even more complex: I want the generic fallbacks in a
lib/*.c file too.
So we basically have multiple "levels" of specialization:

 (a) the purely lock-based model that doesn't do any optimization at
all, because we have lockdep enabled etc, so we *want* things to fall
back to real spinlocks.

 (b) the generic cmpxchg approach for the case when that works

 (c) the capability for an architecture to make up its own very
specialized version

and while I think in all cases the actual functions are big enough
that you don't ever want to inline them, at least in the case of (c)
it is entirely possible that the architecture actually wants a
particular layout for the spinlock and refcount, so we do want the
architecture to be able to specify the exact data structure in its own
<asm/spinlock-refcount.h> file. In fact, that may well be true of case
(b) too, as Andi already pointed out that on x86-32, an "u64" is not
necessarily sufficiently aligned for efficient cmpxchg (it may *work*,
but cacheline-crossing atomics are very very slow).
Other architectures may have other issues - even with a "generic"
cmpxchg-based library version, they may well want to specify exactly
how to take the lock. So while (a) would be 100% generic, (b) might
need small architecture-specific tweaks, and (c) would be a full
custom implementation.
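
A minimal user-space sketch of levels (a) and (b), not the patch's actual
code: it uses C11 atomics in place of the kernel's spinlock and cmpxchg
primitives, and every name in it (lockcnt_sketch, sketch_get and so on) is
hypothetical. The lock bit and the count share one 8-byte-aligned 64-bit
word, so a single compare-and-exchange can bump the count only while the
lock is free; otherwise the code falls back to taking the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: lock flag in bit 0, reference count in the top 32 bits. */
#define SKETCH_LOCKED    1ULL
#define SKETCH_COUNT_ONE (1ULL << 32)

typedef struct {
    /* Force 8-byte alignment so the 64-bit atomic never straddles a cacheline. */
    _Alignas(8) _Atomic uint64_t lock_count;
} lockcnt_sketch;

static_assert(_Alignof(lockcnt_sketch) >= 8, "lock+count word must be 8-byte aligned");

/* Toy spinlock standing in for a real spinlock implementation. */
static void sketch_lock(lockcnt_sketch *r)
{
    uint64_t old = atomic_load(&r->lock_count);

    for (;;) {
        if (old & SKETCH_LOCKED) {
            /* Someone else holds the lock: reload and spin. */
            old = atomic_load(&r->lock_count);
            continue;
        }
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&r->lock_count, &old, old | SKETCH_LOCKED))
            return;
    }
}

static void sketch_unlock(lockcnt_sketch *r)
{
    atomic_fetch_and(&r->lock_count, ~SKETCH_LOCKED);
}

/* Level (a): always take the lock around the count update. */
static void sketch_get_locked(lockcnt_sketch *r)
{
    sketch_lock(r);
    atomic_fetch_add(&r->lock_count, SKETCH_COUNT_ONE);
    sketch_unlock(r);
}

/* Level (b): lockless increment while unlocked, lock-based fallback otherwise. */
static void sketch_get(lockcnt_sketch *r)
{
    uint64_t old = atomic_load(&r->lock_count);

    while (!(old & SKETCH_LOCKED)) {
        if (atomic_compare_exchange_weak(&r->lock_count, &old, old + SKETCH_COUNT_ONE))
            return;
    }
    sketch_get_locked(r);
}

static uint32_t sketch_count(lockcnt_sketch *r)
{
    return (uint32_t)(atomic_load(&r->lock_count) >> 32);
}
```

An architecture wanting level (c) would replace both the structure layout
and these functions wholesale rather than tweak them.
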
See how we do <asm/word-at-a-time.h> and CONFIG_DCACHE_WORD_ACCESS.
Notice how there is a "generic" <asm-generic/word-at-a-time.h> file
(actually, big-endian only) for reference implementations (used by
sparc, m68k and parisc, for example), and then you have "full custom"
implementations for x86, powerpc, alpha and ARM.
See also lib/strnlen_user.c and CONFIG_GENERIC_STRNLEN_USER as an
example of how architectures may choose to opt in to using generic
library versions - if those work sufficiently well for that
architecture. Again, some architectures may decide to write their own
fully custom strlen_user() function.
Very similar concept.
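
The opt-in pattern can be sketched at the preprocessor level roughly as
follows. This is only an illustration under stated assumptions: the macro
SKETCH_ARCH_HAS_STRNLEN and the function sketch_strnlen() are hypothetical
stand-ins for a Kconfig symbol and a library routine, and the function has
plain strnlen() semantics rather than the kernel's user-access return
convention.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical switch: an architecture with its own hand-written version
 * would define SKETCH_ARCH_HAS_STRNLEN and provide sketch_strnlen()
 * itself. Everything else opts in to the generic library version, in the
 * same spirit as CONFIG_GENERIC_STRNLEN_USER selecting lib/strnlen_user.c.
 */
#ifdef SKETCH_ARCH_HAS_STRNLEN
/* Only the prototype is shared; the body lives in architecture code. */
size_t sketch_strnlen(const char *s, size_t max);
#else
/* Generic fallback: portable, byte-at-a-time, no architecture tricks. */
static size_t sketch_strnlen(const char *s, size_t max)
{
    size_t n = 0;

    assert(s != NULL);
    while (n < max && s[n] != '\0')
        n++;
    return n;
}
#endif
```

Callers see the same prototype either way, so switching an architecture
between the generic and the custom version needs no changes outside the
architecture's own files.
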
Linus