Re: [RFC] Improve memset
From: Linus Torvalds
Date: Tue Sep 17 2019 - 16:45:44 EST
On Tue, Sep 17, 2019 at 1:10 PM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
>
> Could it instead do this?
>
> ALTERNATIVE_2("call memset_orig",
> "call memset_rep", X86_FEATURE_REP_GOOD,
> "rep; stosb", X86_FEATURE_ERMS)
>
> Then the "reverse alternatives" feature wouldn't be needed anyway.

That sounds better, but I'm a bit nervous about the whole thing
because who knows when the alternatives code itself internally uses
memset() and then we have a nasty little chicken-and-egg problem.

Also, for it to make sense to inline rep stosb, I think we also need
to just make the calling conventions for the alternative calls be that
they _don't_ clobber other registers than the usual rep ones
(cx/di/si). Otherwise one big code generation advantage of inlining
the thing just goes away.
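
To make the register point concrete, here is a minimal userspace sketch,
not kernel code. It assumes GCC or Clang extended asm on x86, and the
name memset_rep_stosb is made up for illustration. The asm statement
tells the compiler that only the count and destination registers are
modified (the fill byte in al is a pure input), so nothing else has to
be saved or reloaded around it. On other architectures the sketch falls
back to a plain byte loop.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): an inlined "rep stosb"
 * whose only register side effects are on the rep registers.
 */
static inline void *memset_rep_stosb(void *dst, int c, size_t n)
{
#if defined(__x86_64__) || defined(__i386__)
	void *d = dst;

	/*
	 * "+D" and "+c": di and cx are read and written by the instruction.
	 * "a": al carries the fill byte and is left unchanged.
	 * "memory": the buffer contents change.
	 * No other registers are listed as clobbered.
	 */
	__asm__ volatile("rep stosb"
			 : "+D" (d), "+c" (n)
			 : "a" (c)
			 : "memory");
#else
	/* Portable fallback so the sketch still builds elsewhere. */
	unsigned char *p = dst;

	while (n--)
		*p++ = (unsigned char)c;
#endif
	return dst;
}
```

Because the compiler sees exactly which registers change, it can keep
other live values in registers across the fill. An out-of-line call with
the normal calling convention would instead have to be treated as
clobbering every caller-saved register.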

On the whole I get the feeling that this is all painful complexity and
we shouldn't do it. At least not without some hard performance numbers
for some huge improvement, which I don't think we've seen.

Because I find the thing fascinating conceptually, but am not at all
convinced I want to deal with the pain in practice ;)

Linus