Re: Simplifying copy_siginfo_to_user
From: Eric W. Biederman
Date: Mon Jul 24 2017 - 15:12:49 EST
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
> On Sat, Jul 22, 2017 at 1:25 PM, Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
>> I played with some clever changes such as limiting the copy to 48 bytes,
>> disabling the memset and the like but I could not get a strong enough
>> signal to say that any one change removed the extra 20ns or a clear
>> part of it.
>
> What CPU did you use? Because the SMAP bit in particular matters.
>
> The field-by-field copies are extremely slow on modern CPUs that
> implement SMAP, unless you also use the special "unsafe_put_user()"
> code (or the nasty old put_user_ex() code that some of the x86 signal
> code uses).
>
> So one of the advantages of just copy_to_user() ends up being visible
> only on Broadwell+ (or whatever the SMAP cutoff is).
Good point.
The cpu I was testing on was an AMD A10. I don't actually have a cpu
that supports SMAP handy.
If you would like, I can post the minimal patches and benchmark so anyone
who is interested can reproduce this for themselves.
I suspect that if the extra cost is down to only 20ns without SMAP, this
will be a performance improvement in the presence of SMAP.
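To make the SMAP argument concrete, here is a rough userspace sketch, not
kernel code. It assumes a simplified cost model in which every put_user()
style store toggles user access on and off once (a STAC/CLAC pair), while a
copy_to_user() style bulk copy toggles it once for the whole structure. The
names smap_model, fake_info, model_put_user, copy_fields and copy_bulk are
hypothetical and exist only for this illustration; the real siginfo layout
and the real uaccess helpers are more involved.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cost model: counts simulated STAC/CLAC pairs. */
struct smap_model {
	unsigned long toggles;
};

/* Hypothetical stand-in for a small multi-field structure. */
struct fake_info {
	int signo;
	int err;
	int code;
	int pid;
	int uid;
};

/* Models a put_user()-style store: one toggle pair per field written. */
void model_put_user(struct smap_model *m, int *dst, int val)
{
	m->toggles++;
	*dst = val;
}

/* Field-by-field copy: pays the toggle cost once per field. */
void copy_fields(struct smap_model *m, struct fake_info *dst,
		 const struct fake_info *src)
{
	model_put_user(m, &dst->signo, src->signo);
	model_put_user(m, &dst->err, src->err);
	model_put_user(m, &dst->code, src->code);
	model_put_user(m, &dst->pid, src->pid);
	model_put_user(m, &dst->uid, src->uid);
}

/* Models a copy_to_user()-style bulk copy: one toggle pair in total. */
void copy_bulk(struct smap_model *m, struct fake_info *dst,
	       const struct fake_info *src)
{
	m->toggles++;
	memcpy(dst, src, sizeof(*dst));
}
```

Under this model the field-by-field path pays one toggle per field and the
bulk path pays one toggle regardless of structure size, which is why the
difference would only show up on hardware where the toggle is expensive.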
Eric