Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit

From: Boris Ostrovsky
Date: Mon Nov 16 2015 - 11:25:37 EST

On 11/15/2015 01:02 PM, Andy Lutomirski wrote:
On Nov 13, 2015 5:23 PM, "Boris Ostrovsky" <boris.ostrovsky@xxxxxxxxxx> wrote:

On 11/13/2015 06:26 PM, Andy Lutomirski wrote:
On Fri, Nov 13, 2015 at 3:18 PM, Boris Ostrovsky
<boris.ostrovsky@xxxxxxxxxx> wrote:
After 32-bit syscall rewrite, and specifically after commit 5f310f739b4c
("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
it's not pt_regs).

We need to adjust it so that subsequent xen_iret can use it.

I'm wondering if this should be more straightforward:

movq %rsp, %rdi
call do_fast_syscall_32
testl %eax, %eax
jz .Lsyscall_32_done

/* Opportunistic SYSRET */

where XEN_DO_SYSRET32 is a simple pv op that, on Xen, jumps to a
variant of Xen's iret path that knows that the fast path is okay.

This patch is for 32-bit kernels. I haven't actually looked at the compat code yet (probably because our tests don't exercise it); I need to do that too.

In 4.4, it's almost identical (which was part of the point of this
whole series). We use sysret32 instead of sysexit, but the underlying
structure is the same: munge the stack frame and register state
appropriately to use the fast return instruction in question and then
execute it. In both cases, the only real difference from the IRET
path is that we're willing to lose the values of some subset of cx,
dx, and (on 64-bit kernels) r11.
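To make the register-loss point concrete, here is a minimal C model of the three return paths. It is not kernel code: the struct and function names are invented for illustration, and the state is heavily simplified. The only hardware facts it relies on are that SYSEXIT takes the user IP from dx and the user SP from cx, and that SYSRET takes the user IP from cx and the flags from r11, so the user's own values in those registers cannot survive a fast return.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily simplified user register state (not pt_regs). */
struct user_regs {
    uint64_t ip, flags, sp;
    uint64_t cx, dx, r11;
};

/* IRET-style return: every saved register reaches user mode intact. */
static struct user_regs return_via_iret(struct user_regs saved)
{
    return saved;
}

/* SYSEXIT-style return: IP comes from dx, SP from cx; both are clobbered. */
static struct user_regs return_via_sysexit(struct user_regs saved)
{
    struct user_regs out = saved;
    out.dx = saved.ip;
    out.cx = saved.sp;
    return out;
}

/* SYSRET-style return: IP comes from cx, flags from r11; both are clobbered. */
static struct user_regs return_via_sysret(struct user_regs saved)
{
    struct user_regs out = saved;
    out.cx = saved.ip;
    out.r11 = saved.flags;
    return out;
}

/* Small self-check showing which registers survive each path. */
void demo_exit_paths(void)
{
    struct user_regs r = { .ip = 0x1000, .flags = 0x202, .sp = 0x7000,
                           .cx = 1, .dx = 2, .r11 = 3 };

    assert(return_via_iret(r).cx == 1);         /* preserved */
    assert(return_via_sysexit(r).dx == 0x1000); /* user dx lost */
    assert(return_via_sysret(r).r11 == 0x202);  /* user r11 lost */
}
```

In this model the fast paths differ from the IRET path only in which of cx, dx and r11 they overwrite; ip, flags and sp come out the same either way.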

So it turned out that for compat mode we don't need to do anything, since xen_sysret32 doesn't assume any stack format (or, rather, it assumes that it can't be used) and builds the IRET frame itself.
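As a rough illustration of "builds the IRET frame itself", the sketch below writes a five-word, hardware-style IRET frame from saved state without relying on whatever is already on the stack. This is an assumption-laden model, not the xen_sysret32 code: the struct, the function name and the selector values in the self-check are made up, and the frame Xen's iret hypercall actually consumes differs in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical saved user state; not the kernel's pt_regs. */
struct saved_state {
    uint64_t ip, cs, flags, sp, ss;
};

/*
 * Build an IRET frame just below `top` and return the new stack pointer.
 * In ascending address order the words are: ip, cs, flags, sp, ss,
 * which is the order IRET pops them when returning to user mode.
 */
static uint64_t *build_iret_frame(uint64_t *top, const struct saved_state *s)
{
    uint64_t *frame = top - 5;

    frame[0] = s->ip;
    frame[1] = s->cs;
    frame[2] = s->flags;
    frame[3] = s->sp;
    frame[4] = s->ss;
    return frame;
}

/* Small self-check: the frame occupies exactly the top five words. */
void demo_iret_frame(void)
{
    uint64_t stack[8] = { 0 };
    struct saved_state s = { .ip = 0x1000, .cs = 0x23, .flags = 0x202,
                             .sp = 0x7000, .ss = 0x2b };
    uint64_t *sp = build_iret_frame(stack + 8, &s);

    assert(sp == stack + 3);
    assert(sp[0] == 0x1000 && sp[4] == 0x2b);
}
```

Because the frame is built from scratch, the caller's stack layout does not matter, which is why no adjustment is needed on that path.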

As for XEN_DO_SYSRET32: we'd presumably need it to be a nop on bare metal, otherwise the current paravirt op will use native_usergs_sysret32 (for compat code). That means a new pv_op, I think.

Agreed, unless...

Does Xen have a cpufeature? Using ALTERNATIVE instead of a pvop could
be easier to follow and be less code at the same time. Frankly,
following the control flow from asm through the pre-paravirt-patching
and post-paravirt-patching variants and into the final targets is
getting a little bit old, and ALTERNATIVE is crystal clear in
comparison (and has all the interesting info inline with the rest of
the asm). Of course, it doesn't work early in boot, but that's fine
for anything involving user/kernel switches.

We don't currently have a Xen-specific CPU feature. We could, in principle, add one, but we can't replace all of the current paravirt patching with a single feature, since PVH guests use a subset of the existing pv ops (and in the future it may become even more fine-grained).

And I don't think we should go the ALTERNATIVE route for one set of features and keep pv ops for the rest; it should be one or the other.
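The granularity argument can be sketched in plain C. This is only a model: real pv ops and ALTERNATIVE both patch instructions in place, and every name below (the two operations, the table, the flag) is hypothetical. It contrasts a per-operation table, where each slot can be overridden independently, with a single feature flag that switches every site at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef const char *(*op_fn)(void);

/* Hypothetical native and Xen variants of two unrelated operations. */
static const char *native_exit(void)  { return "native exit"; }
static const char *xen_exit(void)     { return "xen exit"; }
static const char *native_other(void) { return "native other"; }
static const char *xen_other(void)    { return "xen other"; }

/* pv-op style: one pointer per operation, each chosen on its own. */
struct demo_pv_ops {
    op_fn exit_to_user;
    op_fn other_op;
};

/* Feature-flag style: a single boolean selects the variant at every site. */
static const char *alt_exit(bool xen_feature)
{
    return xen_feature ? xen_exit() : native_exit();
}

static const char *alt_other(bool xen_feature)
{
    return xen_feature ? xen_other() : native_other();
}

/* A guest that overrides only one slot is easy to express as a table... */
void demo_partial_override(void)
{
    struct demo_pv_ops partial = { .exit_to_user = xen_exit,
                                   .other_op = native_other };

    assert(strcmp(partial.exit_to_user(), "xen exit") == 0);
    assert(strcmp(partial.other_op(), "native other") == 0);

    /* ...but one flag flips both sites together. */
    assert(strcmp(alt_exit(true), "xen exit") == 0);
    assert(strcmp(alt_other(true), "xen other") == 0);
}
```

Expressing the mixed case with feature flags would take one flag per independently selectable group of operations, which is the fine-grained split described above.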