Re: [PATCH 01/15] mm: poison struct page for ptlock

From: Ingo Molnar
Date: Thu Nov 10 2005 - 07:05:58 EST



* Andrew Morton <akpm@xxxxxxxx> wrote:

> > > But I think either a big patch or 2.95.x abandonment is preferable to this
> > > approach.
> >
> > Hmm, that's a pity.
>
> Well plan B is to kill spinlock_t.break_lock. That fixes everything
> and has obvious beneficial side-effects.

i'd really, really love to have this solved without restricting the type
flexibility of spinlocks.

do we really need 2.95.x support? gcc now produces smaller code with -S.

> a) x86 spinlock_t is but one byte. Can we stuff break_lock into the
> same word?
>
> (Yes, there's also a >128 CPUs spinlock overflow problem to solve,
> but perhaps we can use lock;addw?)

the >128 CPUs spinlock overflow problem is solved by going to 32-bit ops
(patch has been posted to lkml recently). 16-bit ops are out of
question. While byte ops are frequently optimized for (avoiding partial
register access related stalls), the same is not true for 16-bit
prefixed instructions! Mixing 32-bit with 16-bit code is going to suck
on a good number of x86 CPUs. It also bloats the instruction size of
spinlocks, due to the 16-bit prefix. (while byte access has its own
opcode)

also, there are arches where the only atomic op available is a 32-bit
one. So trying to squeeze the break_lock flag into the spinlock field is
i think unrobust.

> b) Redesign the code somehow. Currently break_lock means "there's
> someone waiting for this lock".
>
> But if we were to leave the lock in a decremented state while
> spinning (as we've always done), that info is still present via the
> value of spinlock_t.slock. Hence: bye-bye break_lock.

this wont work on arches that dont have atomic-decrement based
spinlocks. (some arches dont even have such an instruction) This means
those arches would have to implement a "I'm spinning" flag in the word,
which might or might not work (if the arch doesnt have an atomic
instruction that works on the owner bit only then it becomes impossible)
- but in any case it would need very fragile per-arch assembly work to
pull off.

> c) Make the break_lock stuff a new config option,
> CONFIG_SUPER_LOW_LATENCY_BLOATS_STRUCT_PAGE.
>
> d) Revert it wholesale, have sucky SMP worst-case latency ;)

yuck. What is the real problem btw? AFAICS there's enough space for a
2-word spinlock in struct page for pagetables. We really dont want to
rewrite spinlocks (or remove features) just to keep gcc 2.95 supported
for some more time. In fact, is there any 2.6 based distro that uses gcc
2.95?

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/