Re: [PATCH v2 6/7] x86/mce: Recover from poison found while copying from user space

From: Luck, Tony
Date: Mon Oct 05 2020 - 15:14:07 EST


On Mon, Oct 05, 2020 at 06:32:47PM +0200, Borislav Petkov wrote:
> On Wed, Sep 30, 2020 at 04:26:10PM -0700, Tony Luck wrote:
> > arch/x86/kernel/cpu/mce/core.c | 33 +++++++++++++++++++++------------
> > include/linux/sched.h | 2 ++
> > 2 files changed, 23 insertions(+), 12 deletions(-)
>
> Isn't that just simpler?

Yes. A helper function avoids making the code a mess of if/else with subtle
fall-through cases.
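
To illustrate the point outside the kernel, here is a minimal userspace
sketch of the same pattern: one helper records the error details and picks
the callback, so every call site is a single line. Everything in it
(struct fake_mce, struct fake_task, queue_work and the two callbacks) is a
made-up stand-in for illustration, not the real kernel types or the
task_work API.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for struct mce and the task_struct fields. */
struct fake_mce {
	unsigned long addr;
	unsigned int kflags;
};

struct fake_task {
	unsigned long mce_addr;
	unsigned int mce_kflags;
	void (*func)(struct fake_task *t);
	bool queued;
	int killed;	/* 0 = untouched, 1 = "maybe" path ran, 2 = "now" path ran */
};

/* Placeholder callbacks; they only record which path was chosen. */
static void kill_now(struct fake_task *t)
{
	t->killed = 2;
}

static void kill_maybe(struct fake_task *t)
{
	t->killed = 1;
}

/*
 * Copy the error details into the task and select the callback in one
 * place, so callers do not repeat the assignments or the if/else.
 */
static void queue_work(struct fake_task *t, const struct fake_mce *m,
		       bool kill_it)
{
	t->mce_addr = m->addr;
	t->mce_kflags = m->kflags;
	t->func = kill_it ? kill_now : kill_maybe;
	t->queued = true;
}
```

With this shape, both the user-mode branch and the kernel copyin branch
reduce to a single queue_work() call and cannot drift apart.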

> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 4d2cf08820af..dc6c83aa2ec1 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -1261,6 +1261,21 @@ static void kill_me_maybe(struct callback_head *cb)
> kill_me_now(cb);
> }
>
> +static inline void queue_task_work(struct mce *m, int kill_it)

Does it need to be "inline" though? I hope machine check processing
is never in the critical code path for anyone!

> +{
> +	current->mce_addr = m->addr;
> +	current->mce_kflags = m->kflags;
> +	current->mce_ripv = !!(m->mcgstatus & MCG_STATUS_RIPV);
> +	current->mce_whole_page = whole_page(m);
> +
> +	if (kill_it)
> +		current->mce_kill_me.func = kill_me_now;
> +	else
> +		current->mce_kill_me.func = kill_me_maybe;
> +
> +	task_work_add(current, &current->mce_kill_me, true);
> +}
> +
> /*
> * The actual machine check handler. This only handles real
> * exceptions when something got corrupted coming in through int 18.
> @@ -1402,13 +1417,8 @@ noinstr void do_machine_check(struct pt_regs *regs)
>  		/* If this triggers there is no way to recover. Die hard. */
>  		BUG_ON(!on_thread_stack() || !user_mode(regs));
>
> -		current->mce_addr = m.addr;
> -		current->mce_ripv = !!(m.mcgstatus & MCG_STATUS_RIPV);
> -		current->mce_whole_page = whole_page(&m);
> -		current->mce_kill_me.func = kill_me_maybe;
> -		if (kill_it)
> -			current->mce_kill_me.func = kill_me_now;
> -		task_work_add(current, &current->mce_kill_me, true);
> +		queue_task_work(&m, kill_it);
> +
>  	} else {
>  		/*
>  		 * Handle an MCE which has happened in kernel space but from
> @@ -1423,6 +1433,9 @@ noinstr void do_machine_check(struct pt_regs *regs)
>  			if (!fixup_exception(regs, X86_TRAP_MC, 0, 0))
>  				mce_panic("Failed kernel mode recovery", &m, msg);
>  		}
> +
> +		if (m.kflags & MCE_IN_KERNEL_COPYIN)
> +			queue_task_work(&m, kill_it);
>  	}
>  out:
>  	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);

-Tony