Re: [PATCH] x86: Optimize variable_test_bit()
From: Linus Torvalds
Date: Fri May 01 2015 - 12:03:40 EST
On Fri, May 1, 2015 at 8:16 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> Since test_bit() doesn't actually have any output variables, we can use
> asm goto without having to add a memory clobber. This reduces the code
> to something sensible:
Yes, looks good, except if we have anything that actually wants to use
the value rather than branch on it. But a quick grep seems to show
that the vast majority of them are all about just directly testing the
result.
It worries me a bit that gcc now cannot pick the likely branch any
more. It will always branch out for the bit being set. So code like
this:
net/core/dev.c: if
(likely(!test_bit(__QDISC_STATE_DEACTIVATED, &q->state)))
wouldn't work, but almost all of those seem to be the constant case
that doesn't get to this anyway.
So on the whole, this would seem to be a win.
> PS. should we kill the memory clobber for __test_and_change_bit()? It
> seems inconsistent and out of place.
We don't seem to have it for the clear case, so yeah..
> PPS. Jakub, I see gcc5.1 still hasn't got output operands for asm goto;
> is this something we can get 'fixed' ?
I suspect the problem is that now the particular register allocation
choices are basically not just around the asm, they'd affect the
target labels of the asm too.
I think that for the kernel, it would *generally* be ok to just say
that the outputs are only valid in the case the asm does *not* branch
out, assuming that the *clobbers* obviously clobber things regardless.
Keeping the register allocation for the asm itself still purely
"local" to the asm.
Something with a memory output we could just turn into a memory
clobber (so we could do the test-and-change bits today without using
any outputs - just mark memory clobbered).
Don't get me wrong - it would be even better if outputs would be valid
even for the labels the asm can jump to, but I can see that being
fundamentally hard. For example, two different asm goto's that share
one label, but that have different output register allocation. That
would be a nightmare (the compiler could basically have to duplicate
the label etc - although maybe you have to do that anyway).
And many of the places where I would personally like to use "asm goto"
are where the goto label is for an error case. Things like a failed
cmpxchg, or a failed user access etc. It would generally be ok if the
output values from the asm were "lost", because it's about the cleanup
(or trying again)..
So we might be ok in many cases with that kind of "weaker output",
where the output is only valid when the goto is *not* taken.
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/