Re: [patch V6 10/37] x86/entry: Switch XEN/PV hypercall entry to IDTENTRY

From: Thomas Gleixner
Date: Tue May 19 2020 - 14:58:59 EST


Andy Lutomirski <luto@xxxxxxxxxx> writes:
> On Fri, May 15, 2020 at 5:10 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> @@ -573,6 +578,16 @@ static __always_inline void __idtentry_exit(struct pt_regs *regs)
>> instrumentation_end();
>> return;
>> }
>> + } else if (IS_ENABLED(CONFIG_XEN_PV)) {
>> + if (preempt_hcall) {
>> + /* See CONFIG_PREEMPTION above */
>> + instrumentation_begin();
>> + rcu_irq_exit_preempt();
>> + xen_maybe_preempt_hcall();
>> + trace_hardirqs_on();
>> + instrumentation_end();
>> + return;
>> + }
>
> Ewwwww! This shouldn't be taken as a NAK -- it's just an expression
> of disgust.

I'm really not proud of it, but that was the least horrible thing I
could come up with.

> Shouldn't this be:
>
> instrumentation_begin();
> if (!irq_needs_irq_stack(...))
> __blah();
> else
> run_on_irqstack(__blah, NULL);
> instrumentation_end();
>
> or even:
>
> instrumentation_begin();
> run_on_irqstack_if_needed(__blah, NULL);
> instrumentation_end();

Yeah. In that case the instrumentation markers are not required as they
will be inside the run....() function.

> ****** BUT *******
>
> I think this is all arse-backwards. This is a giant mess designed to
> pretend we support preemption and to emulate normal preemption in a
> non-preemptible kernel. I propose one to two massive cleanups:
>
> A: Just delete all of this code. Preemptible hypercalls on
> non-preempt kernels will still process interrupts but won't get
> preempted. If you want preemption, compile with preemption.

I'm happy to do so, but the XEN folks might have opinions on that :)

> B: Turn this thing around. Specifically, in the one and only case we
> care about, we know pretty much exactly what context we got this entry
> in: we're running in a schedulable context doing an explicitly
> preemptible hypercall, and we have RIP pointing at a SYSCALL
> instruction (presumably, but we shouldn't bet on it) in the hypercall
> page. Ideally we would change the Xen PV ABI so the hypercall would
> return something like EAGAIN instead of auto-restarting and we could
> ditch this mess entirely. But the ABI seems to be set in stone or at
> least in molasses, so how about just:
>
> idt_entry(exit(regs));
> if (inhcall && need_resched())
> schedule();

Which brings you into the situation that you call schedule() from the
point where we just moved it out. If we would go there we'd need to
ensure that RCU is watching as well. idtentry_exit() might have it
turned off ....

That's why I did it this way to keep the code flow exactly the same for
all these exit variants.

Thanks,

tglx