Re: discuss about pvpanic

From: Michal Privoznik
Date: Wed Jan 08 2020 - 04:58:27 EST


On 1/8/20 10:36 AM, Paolo Bonzini wrote:
On 08/01/20 09:25, zhenwei pi wrote:
Hey, Paolo

Currently, pvpanic only supports bit 0 (PVPANIC_PANICKED).
We usually expect the guest to write to the ioport (typically 0x505) in a panic_notifier_list
callback while handling a panic, so that QEMU can handle the pvpanic event PVPANIC_PANICKED.

On the other hand, when the guest handles the crash itself with kdump-tools, it reboots without
running any panic_notifier_list callback. QEMU then only knows that the guest has rebooted (because
the guest wrote to ioport 0xcf9 for an RCR request), but it can't identify why the guest reset.

In our production environment, we see 100+ guest reboot events every day; sadly, we
can't separate abnormal reboots from normal operation.

We want to add a new bit for the pvpanic event (maybe PVPANIC_CRASHLOADED) to indicate that the guest
has crashed and that the panic is handled by the guest kernel. (Here is the previous patch: https://lkml.org/lkml/2019/12/14/265)
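
To make the proposal concrete, here is a minimal sketch of how the guest could pick the value to write. It is an illustration only, not the linked patch: bit position 1 for PVPANIC_CRASHLOADED and the helper pvpanic_guest_event() are assumptions, and the actual port write is left out so the snippet can run in user space.

```c
#include <stdbool.h>

/* Bit 0 is the existing event described above. */
#define PVPANIC_PANICKED    (1u << 0)
/* Assumption: the proposed bit; position 1 is chosen only for illustration. */
#define PVPANIC_CRASHLOADED (1u << 1)

/*
 * Hypothetical helper: choose the value the guest would write to the
 * pvpanic ioport. If a crash kernel is loaded, the guest handles the
 * panic itself and reports that; otherwise it reports a plain panic.
 */
static unsigned int pvpanic_guest_event(bool crash_kernel_loaded)
{
    return crash_kernel_loaded ? PVPANIC_CRASHLOADED : PVPANIC_PANICKED;
}
```

With a distinct bit, the host can tell a crash that the guest handles itself from a plain panic instead of seeing only a reset.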

What do you think about this solution? Or do you have any other suggestions?

Hi Zhenwei,

the kernel-side patch certainly makes sense. I assume that you want the
event to propagate up from QEMU to Libvirt and so on? The QEMU patch
would need to declare a new event (qapi/misc.json) and send it in
handle_event (hw/misc/pvpanic.c). I'm not familiar with Libvirt, so I'm
adding the respective list.
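
As a rough sketch of the host side, the decoding step could look like the code below. This is not QEMU's actual handle_event: the enum, the function name pvpanic_decode(), and the bit position of PVPANIC_CRASHLOADED are all assumptions made for illustration, and the event emission itself is omitted.

```c
/* Bit 0 is the existing event; bit 1 is an assumed position for the new one. */
#define PVPANIC_PANICKED    (1u << 0)
#define PVPANIC_CRASHLOADED (1u << 1)

/* Hypothetical result type: which report the host would send upwards. */
enum pvpanic_report {
    PVPANIC_REPORT_NONE,        /* no known bit set, nothing to report */
    PVPANIC_REPORT_PANICKED,    /* guest panicked, host should react */
    PVPANIC_REPORT_CRASHLOADED  /* guest panicked and handles it itself */
};

/* Hypothetical decoder for the value the guest wrote to the pvpanic port. */
static enum pvpanic_report pvpanic_decode(unsigned int event)
{
    if (event & PVPANIC_PANICKED) {
        return PVPANIC_REPORT_PANICKED;
    }
    if (event & PVPANIC_CRASHLOADED) {
        return PVPANIC_REPORT_CRASHLOADED;
    }
    return PVPANIC_REPORT_NONE;
}
```

Whether the second case maps to a new event or to an extra attribute on an existing one is a separate decision that this sketch does not settle.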

Adding an event is fairly easy if all you want libvirt to do is report the event to upper layers. I volunteer to do it. The question is how QEMU is going to report this: as additional attributes on the GUEST_PANICKED event or as a new event. But the more important step is to merge the change into the kernel.

Michal