Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

From: Denys Vlasenko
Date: Fri Feb 09 2018 - 14:31:06 EST

On 02/09/2018 08:02 PM, Joerg Roedel wrote:
> On Fri, Feb 09, 2018 at 09:05:02AM -0800, Linus Torvalds wrote:
>> On Fri, Feb 9, 2018 at 1:25 AM, Joerg Roedel <joro@xxxxxxxxxx> wrote:
>>> + /* Copy over the stack-frame */
>>> + cld
>>> + rep movsb
>>
>> Ugh. This is going to be horrendous. Maybe not noticeable on modern
>> CPUs, but the whole 32-bit code is kind of pointless on a modern CPU.
>>
>> At least use "rep movsl". If the kernel stack isn't 4-byte aligned,
>> you have issues.
>
> Okay, I used movsb because I remembered that being the recommendation
> for the most efficient memcpy, and it saves me an instruction. But that
> is probably only true on modern CPUs.

It's fast (it copies data with full-width loads and stores,
up to 64 bytes wide on the latest Intel CPUs), but that fast path kicks
in only for largish blocks. In your case, you are copying fewer than
100 bytes.
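
For illustration only, here is a rough user-space sketch of the dword-wise
copy suggested above. It is not the code from the patch: the helper name
copy_frame_dwords is made up, it assumes the byte count is a multiple of 4
(as a 4-byte aligned stack frame would be), and the non-x86 fallback is
only there so the sketch builds anywhere. The extra instruction mentioned
above corresponds to converting the byte count into a dword count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): copy `bytes` bytes, which must
 * be a multiple of 4, one 32-bit word at a time.
 */
static void copy_frame_dwords(void *dst, const void *src, size_t bytes)
{
    size_t n = bytes / 4;   /* rep movsl counts dwords, not bytes */

    assert(bytes % 4 == 0);

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* DI = destination, SI = source, CX = dword count; cld clears DF */
    __asm__ volatile("cld\n\t"
                     "rep movsl"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory", "cc");
#else
    /* Portable fallback for non-x86 builds: same dword-at-a-time copy */
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--) {
        uint32_t w;

        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        s += 4;
        d += 4;
    }
#endif
}
```

With a frame of, say, 60 bytes this performs 15 dword moves instead of
60 byte moves, at the cost of computing the count in dwords first.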