Re: [PATCH v1 1/1] x86/fred: Fix the FRED RSP0 MSR out of sync with its per CPU cache

From: Xin Li
Date: Wed Jan 08 2025 - 18:33:35 EST


On 1/8/2025 3:04 PM, Dave Hansen wrote:
On 1/7/25 18:36, Xin Li (Intel) wrote:
The FRED RSP0 MSR (pointing to the top of the kernel stack for user
level event delivery) and its per CPU cache should be kept in sync to
avoid redundant writes in the exit to user space path. As a result,
a write to the FRED RSP0 MSR is paired with a write to its per CPU
cache as fred_update_rsp0() does.

I _think_ you're trying to explain the general use of a per-cpu MSR
cache. That's good. But I was reading this paragraph and thinking at
this point that the bug had something to do with redundant writes to the
MSR.

How about this?

The FRED RSP0 MSR is only used for delivering events when
running userspace. The kernel leverages this property to reduce
expensive MSR writes and optimize context switches. The kernel
only writes the MSR when about to run userspace *and* when the
MSR has actually changed since the last time userspace ran.

This optimization is implemented by maintaining a per-cpu cache
of FRED RSP0 and then checking that against the value for the
current task's stack before running userspace.

However, cpu_init_fred_exceptions() writes the MSR without
updating the per-cpu cache. This means that the kernel might
return to userspace with MSR_IA32_FRED_RSP0==0 when it needed to
point to the current task stack. This would induce a double
fault (#DF), which is bad.

A context switch after cpu_init_fred_exceptions() can paper over
the issue since it updates the cached value. That evidently
happens most of the time, explaining how this bug got through.


This is a way better explanation, thanks a lot!
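
For readers following along, here is a rough userspace sketch of the
caching scheme and the failure mode described above. It is not the
kernel code: struct sim_cpu and every sim_*() helper are hypothetical
stand-ins, the MSR is just a struct field, and the starting state
(MSR and cache both already pointing at the current task's stack) is
an assumption chosen to make the stale cache visible. sim_init_fixed()
shows only one possible way of keeping the two values in sync.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one CPU; not the kernel's real data layout. */
struct sim_cpu {
	uint64_t msr_rsp0;	/* simulated FRED RSP0 MSR */
	uint64_t cached_rsp0;	/* simulated per-CPU cache of that MSR */
	unsigned int msr_writes;	/* number of simulated MSR writes */
};

/* Stand-in for an expensive MSR write. */
static void sim_wrmsr(struct sim_cpu *cpu, uint64_t val)
{
	cpu->msr_rsp0 = val;
	cpu->msr_writes++;
}

/* Exit-to-user path: write the MSR only when the cache says it changed. */
static void sim_update_rsp0(struct sim_cpu *cpu, uint64_t task_stack_top)
{
	if (cpu->cached_rsp0 != task_stack_top) {
		sim_wrmsr(cpu, task_stack_top);
		cpu->cached_rsp0 = task_stack_top;
	}
}

/* Models the bug: the MSR is written but the cache is left untouched. */
static void sim_init_buggy(struct sim_cpu *cpu)
{
	sim_wrmsr(cpu, 0);
}

/* One possible fix (assumption): pair the MSR write with a cache write. */
static void sim_init_fixed(struct sim_cpu *cpu)
{
	sim_wrmsr(cpu, 0);
	cpu->cached_rsp0 = 0;
}

/* True when the MSR points at the stack the current task needs. */
static bool sim_msr_ok(const struct sim_cpu *cpu, uint64_t task_stack_top)
{
	return cpu->msr_rsp0 == task_stack_top;
}

/* Walks through the stale-cache case, the context switch, and the fix. */
static void sim_demo(void)
{
	const uint64_t task_a = 0x1000, task_b = 0x2000;
	struct sim_cpu cpu = { task_a, task_a, 0 };

	sim_init_buggy(&cpu);		/* MSR is 0, cache still says task_a */
	sim_update_rsp0(&cpu, task_a);	/* cache matches, so the write is skipped */
	assert(!sim_msr_ok(&cpu, task_a));	/* would return to user with RSP0 == 0 */

	sim_update_rsp0(&cpu, task_b);	/* a context switch papers over it */
	assert(sim_msr_ok(&cpu, task_b));

	struct sim_cpu fixed = { task_a, task_a, 0 };

	sim_init_fixed(&fixed);		/* MSR and cache both 0 */
	sim_update_rsp0(&fixed, task_a);	/* mismatch detected, MSR rewritten */
	assert(sim_msr_ok(&fixed, task_a));
}
```

Calling sim_demo() from a test harness exercises all three cases; the
first assert is the interesting one, since it shows the exit path
trusting a cache that no longer reflects the MSR.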