RE: [RFD PATCH] x86/mce: Make sure to send SIGBUS even after losing the race to poison a page
From: Luck, Tony
Date: Thu Sep 03 2020 - 13:09:57 EST
> Let's see if that logic makes sense: if #MC offlines the page and sends
> SIGBUS but CMCI only offlines the page, isn't it only logical for the
> CMCI to *also* send the SIGBUS too, after having offlined the page?
>
> I.e., both should do the proper and full recovery action. Just sayin...
It made sense, and seemed to explain an issue I was seeing, when I wrote it.
But some stress testing of that patch showed that it introduces some problems
and instability.
Without the patch I can inject 10,000 errors and have every one of them complete
correctly (process gets a SIGBUS with the address of the error). With my patch,
around 0.4% of injections fail to provide the address to the SIGBUS handler. Worse,
the test gets a fatal error every 600-700 injections.
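
For readers unfamiliar with what "gets a SIGBUS with the address of the error"
means on the user side, here is a minimal sketch of a handler that records the
address delivered with the signal. It is not the test program used above; the
names sigbus_handler, install_sigbus_handler, sigbus_seen, sigbus_is_mce and
sigbus_addr are hypothetical and chosen only for illustration. It assumes Linux
with a libc that exposes BUS_MCEERR_AR and BUS_MCEERR_AO in <signal.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bookkeeping, written by the handler and read by the test. */
static volatile sig_atomic_t sigbus_seen;   /* 1 once the handler has run */
static volatile sig_atomic_t sigbus_is_mce; /* 1 if si_code marks a memory error */
static void *volatile sigbus_addr;          /* si_addr as delivered with the signal */

/* Only stores to globals are done here, to stay async-signal-safe. */
static void sigbus_handler(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;

    /* BUS_MCEERR_AR: action required; BUS_MCEERR_AO: action optional. */
    sigbus_is_mce = (info->si_code == BUS_MCEERR_AR ||
                     info->si_code == BUS_MCEERR_AO);
    /* si_addr is only meaningful for fault-generated signals. */
    sigbus_addr = info->si_addr;
    sigbus_seen = 1;
}

/* Install the handler with SA_SIGINFO so that siginfo_t is passed in.
 * Returns 0 on success, -1 on failure (as sigaction() does). */
static int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

A test along these lines would call install_sigbus_handler(), touch the injected
location, and then compare sigbus_addr against the address it injected. The
failure mode described above would presumably show up as a SIGBUS arriving
without a usable address in si_addr.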
So, I'm abandoning that patch.
-Tony