RE: [PATCH 11/11] EDAC/ie31200: Switch Raptor Lake-S to interrupt mode
From: Zhuo, Qiuxu
Date: Sun Mar 02 2025 - 21:53:15 EST
Hi Tony,
> From: Luck, Tony <tony.luck@xxxxxxxxx>
> [...]
> Subject: RE: [PATCH 11/11] EDAC/ie31200: Switch Raptor Lake-S to interrupt
> mode
>
> > Raptor Lake-S SoCs notify memory errors via Machine Check. Add
> > interrupt mode support and switch Raptor Lake-S EDAC from polling to
> > interrupt mode.
>
> Is this notification #MC (a.k.a. INT#18)? Or CMCI? Or #MC for uncorrected
> errors and CMCI for corrected errors?
We injected correctable errors, and the resulting error events were signaled
via CMCI.
> Corrected errors -> CMCI I can understand. This code should work well for
> that.
> Same for uncorrected patrol scrub errors -> CMCI.
>
> Other uncorrected errors -> #MC is trickier. Does Raptor Lake support
> recovery from other uncorrected errors? If it doesn't, then this driver handler
> will not be called (Linux panics and never calls the functions registered on
> the mce decode chain).
It seems that Raptor Lake doesn't support recovery from uncorrected errors.
When we injected uncorrectable errors, the system hung (the validation team
said this is expected behavior).
>
> Which is perhaps a long way of asking whether you really mean:
>
> Raptor Lake-S SoCs notify memory errors via CMCI. Add interrupt
Yes, correctable errors are reported via CMCI, not machine check. For
uncorrectable errors, the system hangs and there is no callback to EDAC. I'll
update the commit description in the next version.
> mode support and switch Raptor Lake-S EDAC from polling to interrupt
> mode.
> [...]
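
To make the distinction above concrete, here is a rough user-space sketch of
the behavior discussed in this thread. It is not kernel code and not the
ie31200 driver: every name in it (err_severity, decode_chain,
register_decoder, edac_decode, report_error) is hypothetical and exists only
for illustration. It assumes the simplest model: a corrected error walks the
registered decode callbacks, while an unrecoverable uncorrected error stops
the system before any callback runs.

```c
/* Illustrative user-space model only; all names here are hypothetical. */
#include <stdbool.h>
#include <stddef.h>

enum err_severity {
	ERR_CORRECTED,          /* signaled via CMCI in the tests above */
	ERR_UNCORRECTED_FATAL,  /* no recovery: system panics or hangs */
};

typedef void (*decode_fn)(enum err_severity sev);

#define MAX_DECODERS 4

static decode_fn decode_chain[MAX_DECODERS];
static size_t n_decoders;
static int edac_calls;  /* counts how often the EDAC callback ran */

/* Add a callback to the model decode chain; false if the chain is full. */
static bool register_decoder(decode_fn fn)
{
	if (n_decoders >= MAX_DECODERS)
		return false;
	decode_chain[n_decoders++] = fn;
	return true;
}

/* Stand-in for an EDAC driver's decode callback. */
static void edac_decode(enum err_severity sev)
{
	(void)sev;
	edac_calls++;
}

/*
 * Returns true if the decode chain was walked, false if the (modeled)
 * system stopped before any callback could run.
 */
static bool report_error(enum err_severity sev)
{
	size_t i;

	if (sev == ERR_UNCORRECTED_FATAL)
		return false;  /* models the panic/hang: chain never reached */

	for (i = 0; i < n_decoders; i++)
		decode_chain[i](sev);
	return true;
}
```

In this model, after register_decoder(edac_decode), a call to
report_error(ERR_CORRECTED) increments edac_calls, while
report_error(ERR_UNCORRECTED_FATAL) returns false and leaves it unchanged,
which mirrors "no callback to EDAC" for uncorrectable errors.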