Re: gcc feature request / RFC: extra clobbered regs
From: Andy Lutomirski
Date: Wed Jul 01 2015 - 11:28:04 EST
On Wed, Jul 1, 2015 at 8:23 AM, Vladimir Makarov <vmakarov@xxxxxxxxxx> wrote:
>
>
> On 06/30/2015 05:37 PM, Jakub Jelinek wrote:
>>
>> On Tue, Jun 30, 2015 at 02:22:33PM -0700, Andy Lutomirski wrote:
>>>
>>> I'm working on a massive set of cleanups to Linux's syscall handling.
>>> We currently have a nasty optimization in which we don't save rbx,
>>> rbp, r12, r13, r14, and r15 on x86_64 before calling C functions.
>>> This works, but it makes the code a huge mess. I'd rather save all
>>> regs in asm and then call C code.
>>>
>>> Unfortunately, this will add five cycles (on SNB) to one of the
>>> hottest paths in the kernel. To counteract it, I have a gcc feature
>>> request that might not be all that crazy. When writing C functions
>>> intended to be called from asm, what if we could do:
>>>
>>> __attribute__((extra_clobber("rbx", "rbp", "r12", "r13", "r14",
>>> "r15"))) void func(void);
>>>
>>> This will save enough pushes and pops that it could easily give us our
>>> five cycles back and then some. It's also easy to be compatible with
>>> old GCC versions -- we could just omit the attribute, since preserving
>>> a register is always safe.
>>>
>>> Thoughts? Is this totally crazy? Is it easy to implement?
>>>
>>> (I'm not necessarily suggesting that we do this for the syscall bodies
>>> themselves. I want to do it for the entry and exit helpers, so we'd
>>> still lose the five cycles in the full fast-path case, but we'd do
>>> better in the slower paths, and the slower paths are becoming
>>> increasingly important in real workloads.)
>>
>> GCC already supports the -ffixed-REG, -fcall-used-REG and -fcall-saved-REG
>> options, which allow tweaking the calling conventions, but they apply per
>> translation unit right now. It isn't clear which of these options
>> you mean with the extra_clobber.
>> I assume you are looking for a possibility to change this to be
>> per-function, with caller with a different calling convention having to
>> adjust for different ABI callee. To some extent, recent GCC versions
>> do that automatically with -fipa-ra already - if some call used registers
>> are not clobbered by some call and the caller can analyze that callee,
>> it can stick values in such registers across the call.
>> I'd say the most natural API for this would be to allow
>> f{fixed,call-{used,saved}}-REG in target attribute.
>>
>>
> One consequence of frequently changing the calling convention or register
> usage per function could be a GCC slowdown. RA calculates a lot of data, and
> it takes a lot of time to recalculate it after something in the register
> usage convention is changed.
Do you mean that RA precalculates things based on the calling
convention and saves them across functions? Hmm. I don't think this
would be a big problem in my intended use case -- there would only be
a handful of functions using this extension, and they'd have very few
non-asm callers.
>
> Another consequence would be that RA fails to generate the code in some cases
> and, even worse, the failure might depend on the version of GCC (I already saw
> PRs where RA worked for an asm in one GCC version because a pseudo was replaced
> by an equivalent constant and failed in another GCC version where that did not
> happen).
>
Would this be a problem generating code for a function with extra
"used" regs, or just a problem generating code to call such a function?
I imagine that, in the former case, RA's job would be easier, not
harder, since there would be more registers to work with. In
practice, though, I think it would just end up changing the prologue
and epilogue.
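
To make the compatibility point concrete, here is a minimal sketch of how
the proposed attribute could be wrapped so that older compilers simply
omit it. Everything named here is an assumption for illustration:
extra_clobber is only the attribute proposed in this thread, not something
GCC is known to provide, and ENTRY_HELPER_ABI and entry_helper() are
made-up names.

```c
/*
 * Hypothetical sketch: "extra_clobber" is the attribute proposed in this
 * thread, not a known GCC attribute. On a compiler that does not report
 * it, the macro expands to nothing and the function keeps the normal ABI,
 * which is always safe because preserving a register never breaks a caller.
 */
#if defined(__has_attribute)
# if __has_attribute(extra_clobber)
#  define HAVE_EXTRA_CLOBBER 1
# endif
#endif

#ifdef HAVE_EXTRA_CLOBBER
# define ENTRY_HELPER_ABI \
	__attribute__((extra_clobber("rbx", "rbp", "r12", "r13", "r14", "r15")))
#else
# define ENTRY_HELPER_ABI /* fallback: callee-saved regs stay preserved */
#endif

/*
 * Made-up helper standing in for an entry/exit helper called from asm.
 * With the attribute, the compiler could skip pushing and popping the
 * listed registers in the prologue and epilogue; the C body is unchanged.
 */
ENTRY_HELPER_ABI long entry_helper(long nr)
{
	return nr + 1;
}
```

The asm caller would be the one saving all registers before the call, so
only the helper's prologue and epilogue would differ between the two builds.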
--Andy