Re: [PATCH v3 3/3] sched, x86: Check that we're on the right stack in schedule and __might_sleep

From: Andy Lutomirski
Date: Wed Nov 19 2014 - 19:13:56 EST


On Wed, Nov 19, 2014 at 3:59 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Nov 19, 2014 at 3:49 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>
>> My only real objection is that it's going to be ugly and error prone.
>> It'll have to be something like:
>
> No.
>
>> because the whole point of this series is to make the IST entries not
>> be atomic when they come from userspace.
>
> Andy, you need to lay off the drugs.
>

No drugs, just imprecision. This series doesn't change NMI handling
at all. It only changes machine_check, int3, debug, and stack_segment.
(Why is #SS using IST stacks anyway?)

So my point stands: if machine_check is going to be conditionally
atomic, then that condition needs to be expressed somewhere. And
machine_check really does need to sleep, and it does so today. It's
just that, in current kernels, it pretends to be atomic, but then it
fakes everyone out before returning (look for sync_regs in entry_64.S)
and becomes non-atomic part-way through, mediated by a special TIF
flag. I think that the current conditional sync_regs thing is even
crazier than switching stacks depending on regs->cs, so I'm trying to
reduce craziness.

With this series applied, machine_check is honest: if it came from
user space, it doesn't even pretend to be atomic.

And I'm not even making these things preemptable by default. They
still run with irqs off. They're just allowed to turn irqs on, *if
user_mode_vm(regs)*, if they want.
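
To make that rule concrete, here is a rough userspace sketch. It is not
code from the series: every mock_* name, the struct, and the irqs_enabled
variable are made up for illustration. The only thing taken from x86
itself is that the low two bits of the saved CS selector give the
privilege level the entry came from.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for pt_regs; only the field this sketch needs. */
struct mock_regs {
	unsigned long cs;	/* saved code segment selector */
};

/* Models the CPU interrupt flag; an assumption of this sketch. */
static bool irqs_enabled;

static void mock_local_irq_enable(void)  { irqs_enabled = true; }
static void mock_local_irq_disable(void) { irqs_enabled = false; }

/* Mock of the "did we come from user mode?" test: RPL 3 in the low bits of CS. */
static bool mock_user_mode(const struct mock_regs *regs)
{
	return (regs->cs & 3) == 3;
}

/*
 * Mock handler for an IST-style exception.  Returns true if it took the
 * path on which enabling irqs (and therefore sleeping) was permitted.
 */
static bool mock_ist_handler(const struct mock_regs *regs)
{
	bool could_sleep = false;

	/* Entry always starts with irqs off, whatever the origin. */
	mock_local_irq_disable();

	if (mock_user_mode(regs)) {
		/* From user space: the handler may turn irqs on if it wants. */
		mock_local_irq_enable();
		could_sleep = irqs_enabled;
		mock_local_irq_disable();
	}
	/* From kernel mode: stay atomic the whole way through. */

	assert(!irqs_enabled);	/* always return with irqs off again */
	return could_sleep;
}
```

Calling mock_ist_handler() with cs = 0x33 (a selector with RPL 3) takes
the sleepable branch; calling it with cs = 0x10 (RPL 0) stays atomic
throughout.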

> NMI absolutely *has* to be atomic. The whole "oh, there's a per-core
> NMI flag and it disables all other NMI's and interrupts" kind of
> enforces that.
>
> Trust me. Talking about being able to preempt the NMI handler is just
> crazy talk.

I engage in crazy talk all the time, but I'm not *that* crazy.

--Andy