Re: x86/asm: __clear_user() micro-optimization (was: "Re: [GIT PULL] x86/asm changes for v4.18")

From: Alexey Dobriyan
Date: Tue Jun 05 2018 - 13:22:55 EST


On Tue, Jun 05, 2018 at 05:05:14PM +0200, Ingo Molnar wrote:
>
> * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> > On Mon, Jun 4, 2018 at 5:21 AM Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> > >
> > > - __clear_user() micro-optimization (Alexey Dobriyan)
> >
> > Was this actually tested?
>
> I'm not sure - Alexey?
>
> > I think one reason people avoided the constant was that on some
> > microarchitecture it ended up being a separate uop just for the
> > constant generation, because it wouldn't fit in a single uop.

> Ok, fair point and agreed - if Alexey sends some measurements to back the change
> I'll keep this, otherwise queue up a revert.

Tested? :^) I had a P4 maybe ~15(?) years ago.

The earliest compiler on godbolt.org is gcc 4.1.2, and it generates "movb [r32], imm8"
with "-m32 -O2 -march=pentium4" for a simple memset-style loop,
if that counts for something.
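
For reference, a minimal sketch of the kind of memset-style loop meant
here. The name zero_bytes() is made up for illustration, and this is not
necessarily the exact snippet that was fed to godbolt.org. The point is
only that the zero is stored as a constant in the store itself instead
of coming from a register that was cleared beforehand:

```c
#include <stddef.h>

/*
 * Hypothetical helper: zero n bytes, one byte per iteration.
 * The volatile qualifier keeps the compiler from merging the stores
 * or replacing the loop with a memset() call, so every iteration
 * remains a byte store of the constant 0.
 */
static void zero_bytes(volatile unsigned char *p, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		p[i] = 0;
}
```

Whether the immediate form costs an extra uop depends on the
microarchitecture, so the compiler output alone does not settle the
question raised above.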

Actually I think __clear_user() should be rewritten in C with inline assembly.
Its biggest user is probably the ELF loader, and those partial-page .bss
clears should be noticeable.
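
To make "partial-page .bss clear" concrete: when .bss starts in the
middle of a page, the tail of that page has to be zeroed by hand. Below
is a rough sketch of the length calculation only, assuming 4096-byte
pages. bss_tail_len() and SKETCH_PAGE_SIZE are made-up names for this
example, not the kernel's:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * Hypothetical helper: number of bytes from bss_start up to the next
 * page boundary, or 0 if bss_start is already page-aligned.
 */
static size_t bss_tail_len(uintptr_t bss_start)
{
	size_t off = bss_start & (SKETCH_PAGE_SIZE - 1);

	return off ? SKETCH_PAGE_SIZE - off : 0;
}
```

Under that assumption each such clear is shorter than one page, so the
fixed per-call cost is a larger share of the work than it would be for
big buffers.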