Re: [PATCH] x86: sgx: Don't track poisoned pages for reclaiming
From: Jarkko Sakkinen
Date: Tue Feb 11 2025 - 16:04:08 EST

On Tue, Feb 11, 2025 at 08:25:58AM -0800, Dave Hansen wrote:
> > arch_memory_failure() but stay on sgx_active_page_list.
> > page->poison is not checked in the reclaimer logic meaning that a page could be
> > reclaimed and go through ETRACK, EBLOCK and EWB. This can lead to the
> > firmware receiving an MCE in one of those operations and going into
> > "unbreakable shutdown" and triggering a kernel panic on remaining cores.
>
> This requires low-level SGX implementation knowledge to fully
> understand. Both what "ETRACK, EBLOCK and EWB" are in the first place,
> how they are involved in reclaim and also why EREMOVE doesn't lead to
> the same fate.

Does it? [I'll dig up the Intel SDM to check this]

BR, Jarkko