Re: x86_64 INIT/SIPI Bug

From: Rian Quinn
Date: Fri Nov 09 2018 - 13:05:14 EST


>> I apologize upfront if this is the wrong place to post this, pretty new to this.
>>
>> We are working on the Bareflank Hypervisor (www.bareflank.org), and we
>> are passing through the INIT/SIPI process (similar to how a VMX
>> rootkit from EFI might boot the OS) and we noticed that on Arch Linux,
>> the INIT/SIPI process stalls, something we are not seeing on Ubuntu.
>
> I'm confused, INIT is blocked post-VMXON, what are you passing through?

You are correct that INIT will trap unconditionally, but all we do is set the
activity state to wait-for-SIPI and return, allowing Linux to continue
its boot process. The code can be seen here:
https://github.com/rianquinn/extended_apis/blob/hyperkernel_1/bfvmm/src/hve/arch/intel_x64/vmexit/init_signal.cpp
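
For anyone who doesn't want to follow the link, here is a minimal sketch of
the idea in C. It is not the Bareflank code: fake_vmcs, fake_vmwrite and
handle_vmexit are hypothetical names, and the VMCS is modeled as a plain
struct so the logic can be read in isolation. The exit reason number and the
field encoding are taken from the Intel SDM as I understand it, so please
double-check them before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as listed in the Intel SDM (verify against your copy). */
#define EXIT_REASON_INIT_SIGNAL      3u      /* basic exit reason: INIT signal */
#define VMCS_GUEST_ACTIVITY_STATE    0x4826u /* 32-bit guest-state field */
#define ACTIVITY_STATE_ACTIVE        0u
#define ACTIVITY_STATE_WAIT_FOR_SIPI 3u

/* Hypothetical stand-in for the VMCS: only the one field we touch. */
struct fake_vmcs {
    uint32_t guest_activity_state;
};

/* Hypothetical wrapper; a real hypervisor would execute VMWRITE here. */
static void fake_vmwrite(struct fake_vmcs *vmcs, uint32_t field, uint32_t value)
{
    assert(vmcs != NULL);
    if (field == VMCS_GUEST_ACTIVITY_STATE)
        vmcs->guest_activity_state = value;
}

/*
 * Handle one VM exit. Returns 1 if it was an INIT exit that we handled,
 * 0 if the exit is not ours. The low 16 bits hold the basic exit reason.
 */
static int handle_vmexit(struct fake_vmcs *vmcs, uint32_t exit_reason)
{
    if ((exit_reason & 0xffffu) != EXIT_REASON_INIT_SIGNAL)
        return 0;

    /* Park the vCPU so that the next SIPI from the guest is accepted. */
    fake_vmwrite(vmcs, VMCS_GUEST_ACTIVITY_STATE, ACTIVITY_STATE_WAIT_FOR_SIPI);
    return 1;
}
```

The point is that the handler does nothing else: no emulation, no state
reset, just the activity state change followed by VM entry.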

It should be noted that this works great for Linux and Windows, allowing us
to boot pretty much any OS that we want with the hypervisor running
(kind of like a VMX rootkit as no traps are really occurring except CPUID
once the OS is loaded). We are doing something very similar to Intel's KGT,
and Xen's PVH dom0.

The problem is that, as we started working on this, we noticed that Ubuntu
booted fine but Arch didn't. It turns out that Arch must be compiling the
kernel with the optimization that skips the delay after INIT enabled. With it
enabled, the kernel sends both SIPIs before the AP has a chance to trap on
INIT, set the activity state, and re-enter the guest, so both SIPIs are
dropped by hardware. In other words, since Linux is not waiting before sending
the first SIPI as the manual states, the SIPIs are lost whenever a hypervisor
is enabled, even if the hypervisor does the least possible amount of work
(just setting the activity state and returning). The working solution is
either to use a Linux distribution that disables this optimization, like
Ubuntu, or to pass the Linux kernel the boot parameter that tells it to add
the delay.
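
To make the race concrete, here is a rough model of the timeline in C. It is
only a sketch under stated assumptions: the struct and function names are
hypothetical, the numbers you feed it are illustrative rather than measured,
and it assumes (as described above) that any SIPI arriving before the AP is
back in the guest in wait-for-SIPI is dropped. The boot parameter I have in
mind is cpu_init_udelay=, if I have the name right.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One AP bring-up attempt. All times are in microseconds. */
struct bringup {
    uint64_t init_to_sipi_delay_us; /* BSP wait between INIT and first SIPI */
    uint64_t sipi_gap_us;           /* BSP wait between the two SIPIs */
    uint64_t trap_latency_us;       /* AP: INIT exit + handler + VM entry */
};

/*
 * INIT is sent at t = 0. The AP can only accept a SIPI once the hypervisor
 * has set wait-for-SIPI and re-entered the guest, at t = trap_latency_us.
 * Returns true if at least one of the two SIPIs arrives at or after that.
 */
static bool ap_starts(const struct bringup *b)
{
    uint64_t ready_at = b->trap_latency_us;
    uint64_t sipi1_at = b->init_to_sipi_delay_us;
    uint64_t sipi2_at = sipi1_at + b->sipi_gap_us;

    return sipi1_at >= ready_at || sipi2_at >= ready_at;
}
```

With no delay after INIT and a short gap between the SIPIs, any trap latency
longer than that gap loses both SIPIs. Restoring a delay that exceeds the
trap latency lets the first SIPI land.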

The root of the issue is that the quirk assumes the CPU can handle
INIT/SIPI/SIPI without a delay, but this assumption doesn't hold if the INIT
first has to trap to a hypervisor (regardless of the hypervisor). In this case,
a delay is still needed.

On Fri, Nov 9, 2018 at 10:49 AM Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
>
> On Thu, Nov 08, 2018 at 03:23:59PM -0700, Rian Quinn wrote:
> > I apologize upfront if this is the wrong place to post this, pretty new to this.
> >
> > We are working on the Bareflank Hypervisor (www.bareflank.org), and we
> > are passing through the INIT/SIPI process (similar to how a VMX
> > rootkit from EFI might boot the OS) and we noticed that on Arch Linux,
> > the INIT/SIPI process stalls, something we are not seeing on Ubuntu.
>
> I'm confused, INIT is blocked post-VMXON, what are you passing through?