Re: [PATCH 0/2] x86/fred: Fix two problems during the FRED initialization

From: Hou Wenlong
Date: Mon Jun 24 2024 - 02:27:15 EST


On Sat, Jun 22, 2024 at 08:31:26AM +0800, Xin Li wrote:
> On 6/21/2024 6:12 AM, Hou Wenlong wrote:
> >When I reviewed the FRED code and attempted to implement a FRED-like
> >event delivery for my PV guest, I encountered two problems which I may
> >have misunderstood.
>
> Hi Wenlong,
>
> Thanks for bringing the issues up.
>
Thanks for your kind reply.

> >
> >One issue is that FRED can be disabled in trap_init(), but
> >sysvec_install() can be called before trap_init(), thus the system
> >interrupt handler is not installed into the IDT if FRED is disabled
> >later. Initially, I attempted to parse the cmdline and decide whether to
> >enable or disable FRED after parse_early_param(). However, I ultimately
> >chose to always install the system handler into the IDT in
> >sysvec_install(), which is simple and should be sufficient.
>
> Which module with a system vector gets initialized before trap_init() is
> called?
>
Sorry, I didn't mention the case here. Currently sysvec_install() is
used only by guest code (KVM, HYPERV). For example, a KVM guest
registers the APF handler for HYPERVISOR_CALLBACK_VECTOR in
kvm_guest_init(), which is called before trap_init(). So if only the
FRED handler is registered and FRED is later disabled in trap_init(),
the IDT entry for the APF handler is missing.
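
To make the ordering concrete, here is a small user-space model of it.
This is only a sketch, not kernel code: the two tables, the
fred_enabled flag and all the model_*/sysvec_install_* helpers are
made up for illustration, and the vector value is assumed.

```c
/* User-space model only; names echo the kernel but this is not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_VECTORS 256
#define HYPERVISOR_CALLBACK_VECTOR 0xf3	/* assumed value, for illustration */

typedef void (*handler_t)(void);

static handler_t idt_table[NR_VECTORS];		/* stands in for the IDT */
static handler_t fred_table[NR_VECTORS];	/* stands in for the FRED sysvec table */
static bool fred_enabled = true;		/* feature bit still set early in boot */

static void apf_handler(void) { }

/* Behaviour described above: only one of the two tables gets the handler. */
static void sysvec_install_either(int vector, handler_t fn)
{
	if (fred_enabled)
		fred_table[vector] = fn;
	else
		idt_table[vector] = fn;
}

/* Proposed behaviour: the IDT entry is installed unconditionally. */
static void sysvec_install_always_idt(int vector, handler_t fn)
{
	if (fred_enabled)
		fred_table[vector] = fn;
	idt_table[vector] = fn;
}

/* Models trap_init() turning FRED off after the guest code has registered. */
static void model_trap_init_disable_fred(void)
{
	fred_enabled = false;
}

/* Returns the handler reachable with the current delivery mode. */
static handler_t model_lookup(int vector)
{
	return fred_enabled ? fred_table[vector] : idt_table[vector];
}
```

In this model, calling sysvec_install_either() and then
model_trap_init_disable_fred() leaves model_lookup() returning NULL
for the vector, while sysvec_install_always_idt() still finds the
handler after FRED is turned off.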

> >Another problem is that the page fault handler (exc_page_fault()) is
> >installed into the IDT before FRED is enabled. Consequently, if a #PF is
> >triggered in this gap, the handler would receive the wrong CR2 from the
> >stack if the FRED feature is present. To address this, I added a page fault
> >entry stub for FRED similar to the debug entry. However, I'm uncertain
> >whether this is enough reason to add a new entry. Perhaps a static key
> >may suffice to indicate whether FRED setup is completed and the handler
> >can use it.
>
> How could a #PF get triggered during that gap?
>
> Initialization time funnies are really unpleasant.
>
I'm not sure whether a #PF can actually occur during that gap; I only
noticed the wrong fault address when a mistake of mine triggered a #PF
there. Before idt_setup_early_pf(), the registered page fault handler
is do_early_exception(), which uses native_read_cr2(). After that, the
page fault handler is changed to exc_page_fault(), so if something goes
wrong and an unexpected #PF occurs, the fault address in the error
output is wrong, although the CR2 in __show_regs() is correct. I'm not
sure whether this matters, since the kernel will panic at that point
anyway.
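
A rough user-space model of what I mean, including the "setup
completed" flag idea from the cover letter. Again this is only a
sketch under my own assumptions: struct model_regs, model_cr2,
fred_setup_done and the helper names are hypothetical and do not exist
in the kernel.

```c
/* User-space model only; not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct model_regs {
	uint64_t event_data;	/* filled in only by FRED event delivery */
};

static uint64_t model_cr2;		/* stands in for the CR2 register */
static bool fred_feature = true;	/* CPU feature bit is set */
static bool fred_setup_done = false;	/* hypothetical "FRED is live" flag */

/* IDT-style delivery: CR2 is written, the frame slot is left untouched. */
static void model_idt_deliver_pf(struct model_regs *regs, uint64_t addr)
{
	(void)regs;
	model_cr2 = addr;
}

/* Handler that picks the address source from the feature bit alone. */
static uint64_t fault_addr_by_feature(const struct model_regs *regs)
{
	return fred_feature ? regs->event_data : model_cr2;
}

/* Handler that also requires FRED setup to have completed. */
static uint64_t fault_addr_by_setup(const struct model_regs *regs)
{
	return (fred_feature && fred_setup_done) ? regs->event_data
						  : model_cr2;
}
```

With a stale value in event_data and a fault delivered through
model_idt_deliver_pf(), fault_addr_by_feature() returns the stale
value, while fault_addr_by_setup() returns the address that was written
to model_cr2.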

Thanks!
> Thanks!
> Xin