Re: [PATCH 3/5] x86/mce: Add new "handled" field to "struct mce"

From: Luck, Tony
Date: Thu Feb 13 2020 - 17:09:57 EST


On Thu, Feb 13, 2020 at 05:56:17PM +0100, Borislav Petkov wrote:
> On Wed, Feb 12, 2020 at 12:46:50PM -0800, Tony Luck wrote:
> > There can be many different subsystems registered on the mce handler
> > chain. Add a new bitmask field and define values so that handlers
> > can indicate whether they took any action to log or otherwise
> > handle an error.
> >
> > The default handler at the end of the chain can use this information
> > to decide whether to print to the console log.
> >
> > Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> > ---
> > arch/x86/include/uapi/asm/mce.h | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
> > index 955c2a2e1cf9..99ca07f7b078 100644
> > --- a/arch/x86/include/uapi/asm/mce.h
> > +++ b/arch/x86/include/uapi/asm/mce.h
> > @@ -35,8 +35,17 @@ struct mce {
> >  	__u64 ipid;		/* MCA_IPID MSR: only valid on SMCA systems */
> >  	__u64 ppin;		/* Protected Processor Inventory Number */
> >  	__u32 microcode;	/* Microcode revision */
> > +	__u32 handled;		/* Bitmap of logging/handling actions */
> >  };
> >
> > +/* handled flag bits */
> > +#define MCE_HANDLED_CEC BIT(0)
> > +#define MCE_HANDLED_UC BIT(1)
> > +#define MCE_HANDLED_EXTLOG BIT(2)
> > +#define MCE_HANDLED_NFIT BIT(3)
> > +#define MCE_HANDLED_EDAC BIT(4)
> > +#define MCE_HANDLED_MCELOG BIT(5)
> > +
> > #define MCE_GET_RECORD_LEN _IOR('M', 1, int)
> > #define MCE_GET_LOG_LEN _IOR('M', 2, int)
> > #define MCE_GETCLEAR_FLAGS _IOR('M', 3, int)
> > --
>
> Not sure if this should be exposed to user. I don't think it has any
> business poking its nose into how the MCE was handled. Or maybe it does
> but I cannot think of a good example ATM.
>
> If not, this could be
>
> 	...
> 	void *private;
> };
>
> which userspace can't make any assumptions about. And we can put
> whatever we need in there...

I can see various ways to spin this:

1) It is useful to user mode. The mcelog(8) daemon (or other consumer
   of "struct mce") learns from this field where to look for logs.
   This could reduce the anxiety about logging the same item
   multiple times. It's a bit weird, though, because each entity logging
   only sees who came before it, not who came after.
2) Not useful to user mode
   2a) Keep it in the structure, but clear it in copies shown to user
   2b) Make a *private to point to such things (but that really
       complicates allocation of struct mce ... right now we just
       have local copies on kernel stack)
   2c) Make a wrapper structure:

       struct kernel_mce {
               struct mce mce;
               u32 handled;
               ... other hidden stuff ...
       };
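
To make option 2c concrete, here is a rough user-space sketch, not kernel
code. The MCE_HANDLED_* values are taken from the patch; everything else is
an assumption made for illustration: "struct mce" is trimmed to three
fields, and edac_handler(), mcelog_handler(), default_handler() and
copy_for_user() are hypothetical names, not existing kernel functions.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bit values from the patch; BIT() is spelled out so this builds in user space. */
#define BIT(n)              (1U << (n))
#define MCE_HANDLED_CEC     BIT(0)
#define MCE_HANDLED_UC      BIT(1)
#define MCE_HANDLED_EXTLOG  BIT(2)
#define MCE_HANDLED_NFIT    BIT(3)
#define MCE_HANDLED_EDAC    BIT(4)
#define MCE_HANDLED_MCELOG  BIT(5)

/* Trimmed stand-in for the UAPI struct mce (the real one has many more fields). */
struct mce {
	uint64_t status;
	uint64_t addr;
	uint32_t microcode;
};

/* Option 2c: kernel-only wrapper, so "handled" never enters the UAPI layout. */
struct kernel_mce {
	struct mce mce;
	uint32_t handled;
};

/* Hypothetical handlers: each records that it logged or acted on the error. */
void edac_handler(struct kernel_mce *km)
{
	km->handled |= MCE_HANDLED_EDAC;
}

void mcelog_handler(struct kernel_mce *km)
{
	km->handled |= MCE_HANDLED_MCELOG;
}

/*
 * Hypothetical last handler in the chain: prints only if nobody earlier
 * claimed the error. Returns 1 if it printed, 0 otherwise.
 */
int default_handler(const struct kernel_mce *km)
{
	if (km->handled)
		return 0;
	printf("mce: unhandled error, status 0x%llx\n",
	       (unsigned long long)km->mce.status);
	return 1;
}

/* What user space would receive: only the embedded struct mce is copied. */
void copy_for_user(struct mce *dst, const struct kernel_mce *km)
{
	memcpy(dst, &km->mce, sizeof(*dst));
}
```

With this layout the stack-local copies keep working (unlike 2b), and there
is nothing to remember to clear before copying to user (unlike 2a), at the
cost of converting every handler to take the wrapper type.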

-Tony