Re: [PATCH 0/2] x86/fred: Fix two problems during the FRED initialization

From: Xin Li
Date: Tue Jun 25 2024 - 05:11:24 EST


On 6/23/2024 11:21 PM, Hou Wenlong wrote:
On Sat, Jun 22, 2024 at 08:31:26AM +0800, Xin Li wrote:
On 6/21/2024 6:12 AM, Hou Wenlong wrote:
One issue is that FRED can be disabled in trap_init(), but
sysvec_install() can be called before trap_init(), thus the system
interrupt handler is not installed into the IDT if FRED is disabled
later. Initially, I attempted to parse the cmdline and decide whether to
enable or disable FRED after parse_early_param(). However, I ultimately
chose to always install the system handler into the IDT in
sysvec_install(), which is simple and should be sufficient.

Which module with a system vector gets initialized before trap_init() is
called?

Sorry, I didn't mention the case here. I see sysvec_install() is used
only in the guest part (KVM, HYPERV) currently. For example, the KVM
guest will register the HYPERVISOR_CALLBACK_VECTOR APF handler in
kvm_guest_init(), which is called before trap_init(). So if only the FRED
handler is registered and FRED is disabled in trap_init() later, then the
IDT handler of the APF handler is missing.

This is a bug! Your fix looks good to me.

Another problem is that the page fault handler (exc_page_fault()) is
installed into the IDT before FRED is enabled. Consequently, if a #PF is
triggered in this gap, the handler would receive the wrong CR2 from the
stack if FRED feature is present. To address this, I added a page fault
entry stub for FRED similar to the debug entry. However, I'm uncertain
whether this is enough reason to add a new entry. Perhaps a static key
may suffice to indicate whether FRED setup is completed and the handler
can use it.

How could a #PF get triggered during that gap?

Initialization time funnies are really unpleasant.

I'm not sure if there will be a #PF during that gap; I just received the
wrong fault address when I made a mistake in that gap and a #PF
occurred. Before idt_setup_early_pf(), the registered page fault handler
is do_early_exception(), which uses native_read_cr2(). However, after
that, the page fault handler had been changed to exc_page_fault(), so if
something bad happened and an unexpected #PF occurred, the fault address
in the error output will be wrong, although the CR2 in __show_regs() is
correct. I'm not sure if this matters or not since the kernel will panic
at that time.

So this doesn't sound a real problem, right?

We could simply do:

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e6c469b323cc..e500777ed2b4 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1495,7 +1495,7 @@ DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
irqentry_state_t state;
unsigned long address;

- address = cpu_feature_enabled(X86_FEATURE_FRED) ? fred_event_data(regs) : read_cr2();
+ address = native_read_cr4() & X86_CR4_FRED ? fred_event_data(regs) : read_cr2();

prefetchw(&current->mm->mmap_lock);


But the page fault handler is an extreme hot path, it's not worth it.

Thanks!
Xin