Re: [PATCH v2 09/16] x86/mce: Unify AMD THR handler with MCA Polling

From: Borislav Petkov
Date: Mon Apr 29 2024 - 09:41:08 EST


On Thu, Apr 04, 2024 at 10:13:52AM -0500, Yazen Ghannam wrote:
> @@ -787,6 +793,8 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
> mce_log(&m);
>
> clear_it:
> + vendor_handle_error(&m);

Wait, whaaat?

The normal polling happens periodically (every 5 mins) and you want to
reset the thresholding blocks every 5 mins?

And the code there now has:

static void reset_block(struct threshold_block *block)
{

	..

	/* Reset threshold block after logging error. */
	memset(&tr, 0, sizeof(tr));
	tr.b = block;
	threshold_restart_bank(&tr);
}

but no error has been logged.

Frankly, I don't see the point of this part: polling all banks on
a thresholding interrupt makes sense. But resetting the blocks from
within the polling doesn't make any sense.

Especially if that polling interval is user-controllable.
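
To make the concern concrete, here is a toy userspace model, not kernel
code. Everything in it is an assumption for illustration: struct
model_block, threshold_fires() and the numbers are hypothetical, and it
only mimics "a counter which gets cleared on every poll tick". It
suggests that whether the threshold is ever reached would depend on the
polling interval rather than on the error rate alone:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one thresholding block, NOT struct threshold_block. */
struct model_block {
	unsigned int count;	/* errors counted since the last reset */
	unsigned int limit;	/* count at which the threshold would trigger */
};

/*
 * Simulate @seconds of runtime with one error every @error_period seconds.
 * If @reset_on_poll is set, the counter is cleared on every poll tick,
 * i.e. every @poll_interval seconds. Returns true if the counter ever
 * reaches the limit.
 */
static bool threshold_fires(unsigned int limit, unsigned int error_period,
			    unsigned int poll_interval, unsigned int seconds,
			    bool reset_on_poll)
{
	struct model_block blk = { .count = 0, .limit = limit };

	assert(limit > 0 && error_period > 0 && poll_interval > 0);

	for (unsigned int t = 1; t <= seconds; t++) {
		/* An error arrives and bumps the counter. */
		if (t % error_period == 0 && ++blk.count >= blk.limit)
			return true;

		/* Poll tick: the behavior in question, reset without an overflow. */
		if (reset_on_poll && t % poll_interval == 0)
			blk.count = 0;
	}

	return false;
}
```

With a limit of 10 and one error per minute, the model reaches the
threshold when the counter is left alone, never reaches it when it is
reset every 300 seconds, and reaches it again when the interval is
raised to 900 seconds. So the same error rate gives different results
depending on a knob the user controls.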

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette