Re: [PATCH v2 00/13] Dynamic Kernel Stacks

From: Zach O'Keefe

Date: Fri Jun 19 2026 - 15:21:53 EST

> Aside of that the part which worries me most is the IDT hackery. That's
> fragile as hell and full of unvalidated assumptions. Reading "should not
> happen" several times in a changelog doesn't make me more confident.
>
> "It is possible for #MCE to occur on the #PF IST stack, but the #MCE
> handler shouldn't generate new #PFs. The reentrancy check on the #PF
> stack will trigger if any recoverable #MCEs do generate #PFs - if there
> are actually reports of it happening, we can address it then."
>
> Seriously?
>
> We don't wait until the report comes in because the report won't even
> happen in the worst case:
>
> #PF on IST
> ...
> cmp 0, reentrance
> jne abort
>
> #MC
> ...
> #PF rewinds #PF IST
> cmp 0, reentrance
> jne abort <- Not taken because #MC happened before
> it could be set.
>
> IST is fundamentally not suitable for this and I'm sure there are more
> holes in this.
>
> I haven't looked at the FRED side of affairs yet in detail, but the
> handwavy explanation about external interrupts having to be moved to
> stack level 1 and unconditionally bounced back does not really make it
> appealing. I agree that chapter 8.3.4 in the SDM volume 3 is not really
> helpful, but papering over the problem without understanding the root
> cause is not cutting it. If it's a genuine FRED hardware issue, then
> this needs to be understood and documented.
>
> The x86 folks have spent a lot of time to make the horrific x86
> interrupt and exception handling solid and therefore have zero interest
> to deal with the fallout of something based on "shouldn't happen"
> assumptions. Either it can prove correctness under all circumstances or
> not.
>
> I understand the save tons of memory across a fleet argument, but a
> large fleet is also a guarantee to trigger all the "should not happen
> and improbable" issues which are gracefully handwaved away. That's a
> truly bad tradeoff as it ends up in non-decodable bug reports. What's
> worse they have to be handled by the maintainers and not necessarily by
> those who implemented it.

Thanks Dave / Thomas / Hans; I appreciate you taking the time to look at this.

As Dave previously pointed out, I'll admit to some ignorance regarding
the subtle nuances of x86 interrupt / exception handling. Counter to
my goals here, that code has "just worked," so attention and time have
been spent elsewhere. We'll undoubtedly need help making things solid
and avoiding previous pitfalls. As David mentions, this is an RFC.

While it seems common opinion that the IST-based solution is fragile,
what of FRED? It seems like this is exactly the kind of support needed
to avoid some of the aforementioned sw "mess" in various x86 exception
handling paths. I agree that it's less-than-ideal that we are forced
to downgrade exception levels in the common #PF case, but is that an
insurmountable problem? Pardon my ignorance.

Lastly, I just want to clarify what folks have meant by "extraordinary
claims" or "evidence". Aside from the above discussion on FRED
exception handling, the "only" other part of this is the allocation.
Are people concerned about memory unavailability, deadlocking-type
issues, or something else? We have considerable design freedom here to
avoid certain classes of unreliability, but—barring any clever
tricks—I don't know if the allocation can be guaranteed to succeed in
all conceivable circumstances. I want to ensure that reality does not
present a hard blocker.

Again, thanks everyone for the time and help,

Best,
Zach