Re: [PATCH v9 09/17] x86/split_lock: Handle #AC exception for split lock

From: Thomas Gleixner
Date: Wed Oct 16 2019 - 07:49:44 EST


On Wed, 16 Oct 2019, Paolo Bonzini wrote:
> On 16/10/19 11:47, Thomas Gleixner wrote:
> > On Wed, 16 Oct 2019, Paolo Bonzini wrote:
> >> Just never advertise split-lock
> >> detection to guests. If the host has enabled split-lock detection,
> >> trap #AC and forward it to the host handler---which would disable
> >> split lock detection globally and reenter the guest.
> >
> > Which completely defeats the purpose.
>
> Yes it does. But Sean's proposal, as I understand it, leads to the
> guest receiving #AC when it wasn't expecting one. So for an old guest,
> as soon as the guest kernel happens to do a split lock, it gets an
> unexpected #AC and crashes and burns. And then, after much googling and
> gnashing of teeth, people proceed to disable split lock detection.

I don't think that this was what he suggested/intended.

> In all of these cases, the common final result is that split-lock
> detection is disabled on the host. So might as well go with the
> simplest one and not pretend to virtualize something that (without core
> scheduling) is obviously not virtualizable.

You are completely ignoring the arguments here and just left them behind
your signature (instead of trimming your reply).

> > 1) Sane guest
> >
> > Guest kernel has #AC handler and you basically prevent it from
> > detecting malicious user space and killing it. You also prevent #AC
> > detection in the guest kernel which limits debuggability.

That's a perfectly fine situation. Host has #AC enabled and exposes the
availability of #AC to the guest. Guest kernel has a proper handler and
does the right thing. So the host _CAN_ forward #AC to the guest and let it
deal with it. For that to work you need to expose the MSR so you know the
guest state in the host.

Your lazy 'solution' just renders #AC completely useless even for
debugging.

> > 2) Malicious guest
> >
> > Trigger #AC to disable the host detection and then carry out the DoS
> > attack.

With your proposal you render #AC useless even on hosts which have SMT
disabled, which is just wrong. There are enough good reasons to disable
SMT.

I agree that with SMT enabled the situation is truly bad, but we surely can
be smarter than just disabling #AC detection globally, unconditionally and
forever.

Plus we want a knob which treats guests triggering #AC in the same way as
we treat user space, i.e. kill them with SIGBUS.
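
Roughly, the policy described above could be sketched like this. It is
only an illustration: every name in it (ac_ctx, handle_guest_ac, the
kill_on_ac knob) is made up and none of it is code from the patch
series. It assumes the host has detection enabled and the trapped #AC
was caused by a split lock. The final fallback is the global disable
from the proposal being discussed, shown only for contrast.

```c
#include <stdbool.h>

/* Hypothetical outcomes for a split lock #AC trapped from a guest. */
enum ac_action {
	AC_FORWARD_TO_GUEST,		/* guest has a handler, let it deal with it */
	AC_KILL_GUEST,			/* treat like user space: SIGBUS */
	AC_DISABLE_HOST_DETECTION,	/* the global disable criticized above */
};

/* Hypothetical per-guest state; all field names are assumptions. */
struct ac_ctx {
	bool guest_msr_exposed;	/* MSR exposed, so host knows guest state */
	bool guest_sld_enabled;	/* guest enabled detection via that MSR */
	bool kill_on_ac;	/* admin knob: kill guests which trigger #AC */
};

static enum ac_action handle_guest_ac(const struct ac_ctx *c)
{
	/* Sane guest: it asked for #AC, so forward the exception. */
	if (c->guest_msr_exposed && c->guest_sld_enabled)
		return AC_FORWARD_TO_GUEST;

	/* Guest did not expect #AC: same treatment as user space. */
	if (c->kill_on_ac)
		return AC_KILL_GUEST;

	/* Assumed fallback when the knob is off. */
	return AC_DISABLE_HOST_DETECTION;
}
```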

Thanks,

tglx