RE: [PATCH] x86/mce: Dynamically size space for machine check records

From: Luck, Tony
Date: Thu Feb 29 2024 - 13:39:12 EST


> Wouldn't having dedup actually increase the time we spend in #MC
> context? Comparing the new MCE record against each existing record
> in the genpool.

Yes, dedup would take extra time (increasing linearly with the number
of pending errors that were not filtered out by the dedup process).
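
To make that cost concrete, here is a minimal user-space sketch of a
linear-scan dedup. It is not the kernel code: struct rec, struct pool,
POOL_MAX and pool_add_dedup() are hypothetical names made up for this
illustration, and a fixed array stands in for the genpool. Each new
record is compared against every record already pending, so the work
per insert grows with the number of records that survived filtering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down record; NOT the kernel's struct mce. */
struct rec {
	uint64_t status;
	uint64_t addr;
	unsigned int bank;
};

#define POOL_MAX 8	/* arbitrary size chosen for the illustration */

struct pool {
	struct rec recs[POOL_MAX];
	size_t count;		/* records currently pending */
	size_t compares;	/* total record comparisons performed */
};

static bool rec_equal(const struct rec *a, const struct rec *b)
{
	return a->bank == b->bank && a->status == b->status &&
	       a->addr == b->addr;
}

/*
 * Store r unless an identical record is already pending.
 * Returns true if stored, false if dropped (duplicate or pool full).
 * Worst case is one comparison per pending record.
 */
static bool pool_add_dedup(struct pool *p, const struct rec *r)
{
	assert(p != NULL && r != NULL);

	for (size_t i = 0; i < p->count; i++) {
		p->compares++;
		if (rec_equal(&p->recs[i], r))
			return false;	/* duplicate, filtered out */
	}

	if (p->count == POOL_MAX)
		return false;		/* no space left */

	p->recs[p->count++] = *r;
	return true;
}
```

Note that the sketch is single-threaded and makes no attempt to
synchronize between CPUs.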

> AFAIK, MCEs cannot be nested. Correct me if I am wrong here.

Can't be nested on the same CPU. But multiple CPUs may take
a local machine check simultaneously. Local machine check is
opt-in on Intel; I believe it is the default on AMD.

Errors can also be signaled with CMCI.

> In a flood situation, like the one described above, that is exactly
> what may happen: An MCE coming in while the dedup mechanism is
> underway (in #MC context).

In a flood of errors it would be complicated to synchronize dedup filtering
on multiple CPUs. The trade-off between trying to get that code right,
and just allocating a few extra Kbytes of memory would seem to favor
allocating more memory.

--
Thanks,
Avadhut Naik