Re: [PATCH v3] selftests/x86: Fix sysret_rip assertion failure on FRED systems

From: Xin Li

Date: Tue Mar 31 2026 - 02:07:34 EST



>>>> The existing 'sysret_rip' selftest asserts that 'regs->r11 ==
>>>> regs->flags'. This check relies on the behavior of the SYSCALL
>>>> instruction on legacy x86_64, which saves 'RFLAGS' into 'R11'.
>>>>
>>>> However, on systems with FRED (Flexible Return and Event Delivery)
>>>> enabled, instead of using registers, all state is saved onto the stack.
>>>> Consequently, 'R11' retains its userspace value, causing the assertion
>>>> to fail.
>>>>
>>>> Fix this by detecting if FRED is enabled and skipping the register
>>>> assertion in that case. The detection is done by checking if the RPL
>>>> bits of the GS selector are preserved after a hardware exception.
>>>> IDT (via IRET) clears the RPL bits of NULL selectors, while FRED (via
>>>> ERETU) preserves them.
>>>>
>>>
>>> I don't really like this. I think we have two credible choices:
>>>
>>> 1. Define the Linux ABI to be that, on FRED systems, SYSCALL preserves
>>> R11 and RCX on entry and exit. And update the test to actually test
>>> this.
>>>
>>> 2. Define the Linux ABI to be what it has been for quite a few years:
>>> SYSCALL entry copies RFLAGS to R11 and RIP to RCX and SYSCALL exit
>>> preserves all registers.
>>>
>>> I'm in favor of #2. People love making new programming languages and
>>> runtimes and inline asm and, these days, vibe coded crap. And it's
>>> *easier* to emit a SYSCALL and forget to tell the compiler / code
>>> generator that RCX and R11 are clobbered than it is to remember that
>>> they're clobbered. And it's easy to test on FRED (well, not really,
>>> but it hopefully will be some day) and it's easy to publish one's
>>> code, and then everyone is a bit screwed when the resulting program
>>> crashes sometimes on non-FRED systems. And it will be miserable to
>>> debug.
>>>
>>> (It's *really* *really* easy to screw this up in a way that sort of
>>> works even on non-FRED: RCX and R11 are usually clobbered across
>>> function calls, so one can get into a situation in which one's
>>> generated code usually doesn't require that SYSCALL preserve one of
>>> these registers until an inlining decision changes or some code gets
>>> reordered, and then it will start failing. And making the failure
>>> depend on hardware details is just nasty.)
>>>
>>> So I think we should add the ~2 lines of code to fix the SYSCALL entry
>>> on FRED to match non-FRED.
>>
>> Yes; I'm afraid I have to concur. Preserving the clobber on entry for
>> FRED systems is by far the safest choice.
>>
>> Aside from this selftest, fancy debuggers and anything that can transfer
>> userspace state between machines might be 'surprised'.
>
> Thanks Andy and Peter.
>
> Indeed, making the selftest branch on FRED vs. non-FRED behavior
> is not a good practice. The selftest should validate ABI consistency.
>
> I agree with Andy's option #2, so this should be fixed in the FRED
> syscall entry implementation.
>
> Li Xin, does this direction look right to you? I can assist with
> validation and keep the selftest aligned with the agreed ABI.
>

Yes, consistency should take precedence over hardware-specific variations.

I would like to hear from Andrew Cooper and hpa before we do it.
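
For anyone following the clobber discussion above, here is a minimal
userspace sketch of what "telling the compiler that RCX and R11 are
clobbered" looks like. It is illustrative only and not part of the patch
or the selftest; raw_syscall0() is a hypothetical helper name, and the
non-x86_64 fallback is just there so the snippet builds elsewhere.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/*
 * raw_syscall0() is a hypothetical helper, not taken from the selftest.
 * It issues a syscall that takes no arguments and returns the raw result.
 */
static long raw_syscall0(long nr)
{
#if defined(__x86_64__) && defined(__linux__)
	long ret;

	/*
	 * On non-FRED hardware the SYSCALL instruction overwrites RCX with
	 * the return RIP and R11 with RFLAGS. Listing both as clobbers keeps
	 * the compiler from holding live values in them across the
	 * instruction. Dropping the clobber list is the mistake described
	 * in the quoted text: it may appear to work until register
	 * allocation changes.
	 */
	__asm__ volatile ("syscall"
			  : "=a" (ret)
			  : "0" (nr)
			  : "rcx", "r11", "memory");
	return ret;
#else
	/* Assumed fallback for other targets: let libc issue the syscall. */
	return syscall(nr);
#endif
}
```

With this wrapper, raw_syscall0(SYS_getpid) should return the same value
as getpid(), whether or not the kernel entry path clobbers the two
registers.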