Re: [PATCH] kvm/x86/vmx: report KVM_SYSTEM_EVENT_CRASH on triple fault

From: Josh Poimboeuf
Date: Tue Mar 28 2017 - 11:53:13 EST


On Tue, Mar 28, 2017 at 02:39:34PM +0200, Paolo Bonzini wrote:
>
>
> On 28/03/2017 13:46, Josh Poimboeuf wrote:
> > On Tue, Mar 28, 2017 at 03:51:01AM -0400, Paolo Bonzini wrote:
> >>
> >>> While debugging a kernel issue, I found that QEMU always reboots when an
> >>> x86 triple fault occurs, which complicates debugging. QEMU and libvirt
> >>> have a facility for creating a dump when KVM reports
> >>> KVM_SYSTEM_EVENT_CRASH. So change the VMX triple fault handler to do
> >>> that. This gives user space the ability to decide whether to dump,
> >>> pause, shutdown, or reboot.
> >>
> >> You probably want QEMU's -no-reboot option.
> >>
> >> Triple faults are already reported to userspace with KVM_EXIT_SHUTDOWN,
> >> and it's up to userspace to decide what to do with it. This patch cannot
> >> be applied, because there are guests that do a triple-fault intentionally
> >> in order to reset the machine.
> >
> > Ok. Any idea how to force libvirt to create a dump? It has a
> > 'coredump-destroy' option, but that only seems to work with 'on_crash':
> >
> > https://libvirt.org/formatdomain.html#elementsEvents
>
> Probably QEMU, when invoked with -no-shutdown -no-reboot, should treat
> KVM_EXIT_SHUTDOWN as a panic. I can have a go at it, but note that QEMU
> is now in hard freeze for the next release, so it will take a while.
>
> However you're using libvirt and it doesn't use -no-reboot.
>
> It's probably possible for libvirt to use -no-reboot more often. The
> price would be that if libvirtd crashes and a VM wants to reset, then
> the VM gets stuck.
>
> Alternatively, we could generalize -no-shutdown and -no-reboot to
> something like:
>
>   -action reset=stop|restart|quit,
>           poweroff=stop|quit,
>           triple-fault=stop|panic|restart|quit
>
> and teach libvirt about it. The current semantics map relatively easily
> to the new option:
>
>                           | reset       | poweroff   | triple-fault
> --------------------------+-------------+------------+-------------------
> no option                 | restart     | quit       | restart
> -no-shutdown              | restart     | stop       | restart
> -no-reboot                | quit        | quit       | quit
> -no-shutdown -no-reboot   | stop        | stop       | stop (panic?)

I like your new option proposal. It makes a lot more sense, at least
from the perspective of a novice user (me).
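
To check that I'm reading the table correctly, here is a minimal sketch of
the mapping from the existing flags to the proposed per-event actions. It
is only an illustration, not QEMU code: the names Action, legacy_to_actions
and format_action_option are made up for this example, and the -action
syntax is the one proposed above, which may not match whatever eventually
gets implemented. For the "-no-shutdown -no-reboot" triple-fault case the
table leaves "stop" vs. "panic" open, so the sketch assumes "stop".

```python
from enum import Enum


class Action(Enum):
    """Actions named in the proposed -action option (hypothetical)."""
    RESTART = "restart"
    STOP = "stop"
    QUIT = "quit"
    PANIC = "panic"


def legacy_to_actions(no_shutdown=False, no_reboot=False):
    """Translate the two legacy flags into per-event actions.

    Follows the table row by row. The event names (reset, poweroff,
    triple-fault) are the ones used in the proposal.
    """
    if no_shutdown and no_reboot:
        # The table says "stop (panic?)" for triple-fault; assume stop.
        reset, poweroff, triple = Action.STOP, Action.STOP, Action.STOP
    elif no_reboot:
        reset, poweroff, triple = Action.QUIT, Action.QUIT, Action.QUIT
    elif no_shutdown:
        reset, poweroff, triple = Action.RESTART, Action.STOP, Action.RESTART
    else:
        reset, poweroff, triple = Action.RESTART, Action.QUIT, Action.RESTART
    return {"reset": reset, "poweroff": poweroff, "triple-fault": triple}


def format_action_option(actions):
    """Render the mapping in the proposed (hypothetical) -action syntax."""
    pairs = ",".join(f"{event}={action.value}" for event, action in actions.items())
    return f"-action {pairs}"


if __name__ == "__main__":
    for flags in ({}, {"no_shutdown": True}, {"no_reboot": True},
                  {"no_shutdown": True, "no_reboot": True}):
        print(flags, "->", format_action_option(legacy_to_actions(**flags)))
```

With something like that, the case I care about would just be
triple-fault=stop or triple-fault=panic, independent of what reset and
poweroff do.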

Having some kind of framework in place for dealing with triple faults --
either pausing or dumping -- would be very useful. Right now I can't
even get libvirt to pause when it happens.

--
Josh