Re: [PATCH v2 4/5] x86/mce: Simplify flow when handling recoverable memory errors

From: Andy Lutomirski
Date: Wed Nov 12 2014 - 12:25:40 EST


On Wed, Nov 12, 2014 at 9:20 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> Sorry, I am a bit confused...
>
> On 11/11, Borislav Petkov wrote:
>>
>> Roughly speaking, we want to be able to mark a task with the sign of
>> death and to kill it, if needed.
>
> "it" is current, yes?

I'm not sure about this part because of MCE broadcast.  Several CPUs
can be in do_machine_check at once, and "it" is current on exactly one
of them.

>
> So I agree with Andy, task_work_add() can work and you can also pass
> paddr/restartable to the handler.
>
> But,
>
>> The important part is *before* it
>> gets to run again.
>
> But it is already running? Perhaps you meant "before it returns to
> user-mode" ?
>

Right. But killing it in the do_exit sense from do_machine_check is
currently impossible because do_machine_check runs on an IST stack.
With my patch, if do_machine_check sees user_mode_vm(regs) and
CONFIG_X86_64, then it will be running on the real task stack, so it
can enable interrupts, kill the task however it wants, etc. This was
the original motivation for my patch.
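
As a rough sketch of that decision, here is a small userspace model.
It is not kernel code: the enum and function names are hypothetical
and only mirror the two conditions above (user_mode_vm(regs) and
CONFIG_X86_64), reduced to plain booleans.

```c
#include <stdbool.h>

/* Hypothetical model, not kernel code: all names are illustrative. */
enum mce_stack  { MCE_STACK_IST, MCE_STACK_TASK };
enum mce_action { MCE_KILL_DIRECTLY, MCE_DEFER_KILL };

/*
 * Assumed behaviour with the patch applied: a machine check taken from
 * user mode on a 64-bit kernel runs on the real task stack; every other
 * case stays on the IST stack.
 */
static enum mce_stack mce_stack_for(bool from_user_mode, bool is_x86_64)
{
	return (from_user_mode && is_x86_64) ? MCE_STACK_TASK : MCE_STACK_IST;
}

/*
 * On the task stack the handler may enable interrupts and kill the task
 * itself.  On the IST stack it can only mark the task and leave the kill
 * for later (for example via something like task_work_add()).
 */
static enum mce_action mce_kill_strategy(enum mce_stack stack)
{
	return stack == MCE_STACK_TASK ? MCE_KILL_DIRECTLY : MCE_DEFER_KILL;
}
```

In this model only the (user mode, 64-bit) combination gives
MCE_KILL_DIRECTLY; everything else has to defer.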

--Andy