Re: [PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM
From: Dave Hansen
Date: Tue Nov 04 2025 - 09:30:06 EST
On 11/3/25 23:23, Xie Yuanbin wrote:
> Memory bit flips are among the most common hardware errors in the server
> and embedded fields. Many hardware components have memory verification
> mechanisms, for example ECC. When an error is detected, some hardware or
> architectures report the information to software (OS/BIOS), for example,
> the MCE (Machine Check Exception) on x86.
>
> Common errors include CE (Correctable Errors) and UE (Uncorrectable
> Errors). When the kernel receives memory error information, if it has the
> memory-failure feature, it can better handle memory errors without reboot.
> For example, the kernel can attempt to offline the affected memory by
> migrating it or killing the process. Therefore, this feature is widely
> used in servers and embedded fields.
>
> In older kernel versions, memory-failure could not be enabled with
> x86_32 && SPARSEMEM because the number of page flags was insufficient.
> That issue has been resolved in the current version, and this series
> allows SPARSEMEM and memory-failure to be enabled together on x86_32.
>
> By the way, due to increased demand, DRAM prices have recently
> skyrocketed, making memory-failure potentially even more valuable in the
> coming years.
Which LLM generated that for you, btw?
I wanted to know _specifically_ what kind of hardware or 32-bit
environment you wanted to support with this series, though.