Re: [PATCH v2 1/3] x86/mce/inject: Remove call to mce_notify_irq()

From: Nikolay Borisov
Date: Mon Feb 17 2025 - 06:21:05 EST




On 17.02.25 г. 12:01 ч., Nikolay Borisov wrote:


On 16.02.25 г. 18:10 ч., Borislav Petkov wrote:
On Mon, Feb 10, 2025 at 05:47:04PM +0200, Nikolay Borisov wrote:
The call is actually a noop because when the MCE is raised the early
notifier is the only call site that correctly calls mce_notify_irq()
because it also sets mce_need_notify. Remove this call and as a result
make mce_notify_irq() static

Signed-off-by: Nikolay Borisov <nik.borisov@xxxxxxxx>
---
  arch/x86/include/asm/mce.h       |  2 --
  arch/x86/kernel/cpu/mce/core.c   | 44 ++++++++++++++++----------------
  arch/x86/kernel/cpu/mce/inject.c |  1 -
  3 files changed, 22 insertions(+), 25 deletions(-)

So what you're looking at are the remnants of the old soft-inject of MCE
errors. And it seems that we lost some of that functionality along the way and
no one has noticed because, well, it seems no one uses it anymore.

In order to understand how this thing was supposed to work, checkout

ea149b36c7f5 ("x86, mce: add basic error injection infrastructure")


The original code in  ea149b36c7f5 was setting the notify_user bit via

raise_mce()->machine_check_poll()->mce_log(),
 mce_notify_user() - consumes notify_user set in mce_log above.


 subsequently in 011d82611172 ("RAS: Add a Corrected Errors Collector") you factored out the code from mce_log() mce_first_notifier, where the bit setting remains to this day, but since it's been removed from mce_log() it made the call in raise_local() defunct.


Considering that no one complained about this after all these years and that the dev-mcelog is deprecated I think it still makes sense to make mce_notify_irq() private


Actually there is no loss of functionality since after an MCE is injected
the early notifier will correctly call the usermode helper. So I
think the following change log better reflects the situation:


x86/mce/inject: Remove call to mce_notify_irq()
The call to mce_notify_irq() has been there since the initial version of
the soft inject mce machinery, introduced in ea149b36c7f5 ("x86,
mce: add basic error injection infrastructure"). At that time it was
functional since injecting an MCE resulted in the following call chain:
raise_mce()
->machine_check_poll()
->mce_log() - sets notfiy_user_bit
->mce_notify_user() (current mce_notify_irq) consumed the bit and called the
usermode helper.
However, with the introduction of 011d82611172 ("RAS: Add a Corrected Errors Collector")
the code got moved around and the user mode helper began to be called
via the early notifier (mce_first_notifier()) rendering the remaining call
in raise_local() defunct as the mce_need_notify bit (ex notify_user) is
only being set from the early notifier.
Remove the noop call and make mce_notify_irq static. No functional
changes.



and follow what raise_mce() does and pay attention to notify_user which is
what mce_need_notify was called back then.

Remember to have fun :-P