Re: [PATCH v13] x86, mce: Add memcpy_trap()

From: Andy Lutomirski
Date: Thu Feb 25 2016 - 17:11:36 EST


On Feb 25, 2016 12:39 PM, "Linus Torvalds"
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> But doing things like
>
> + if (r.trap_nr == X86_TRAP_MC) {
> + volatile void *fault_addr = (volatile void *)from + n
> - r.bytes_left;
> + phys_addr_t p = virt_to_phys(fault_addr);
> +
> + memory_failure(p >> PAGE_SHIFT, MCE_VECTOR, 0);
> + }
>
> in the copying code is insane, because dammit, that should be done by
> the codethat sets X86_TRAP_MC in the first place.

Impossible as such, I think :(

do_machine_check uses IST, the memory failure code can sleep, and you
can't sleep in IST context. There's a special escape that lets
memory_failure sleep *if* it came from user mode.

Here's the solution I'd prefer. Change all the copy string to/from
user code to use the new enhanced fixup code. Have the new fixup
handler (which can be a short C function!) fix up regs->ip to point to
copy_user_handle_tail and add a new parameter to copy_user_handle_tail
indicating the fault type. Then put whatever fixup logic is needed in
copy_user_handle_tail -- it knows the failing address (obviously), and
it's running in process context with interrupts on (unless we're in a
pagefault_disable section), and it can do whatever it needs to do.

Linus, it's kind of like yours, except with the trap info explicitly
passed to the fixup handler instead of having the fixup handler fish
it out of some per-thread structure.

Here are different some ideas I don't like.:

1. The machine check does an IPI-to-self and the failure code runs in
IRQ context.

2. The machine check code rewrites the return stack to inject a
function call. I don't love this.

3. Drop the idea of sending an immediate sigbus and do it with
task_work. Maybe this is bad for some reason other than code
messiness.

4. Change the entry code so machine check runs on the normal stack if
it hits with IRQs on.

>

> And if there is hardware that raises a machine check without actually
> telling you why - including the address - then it's laugable to talk
> about "recoverability" and "hardening" and things like that. Then the
> hardware is just broken.
>
> Linus