Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

From: Linus Torvalds
Date: Fri Apr 24 2015 - 13:41:16 EST


On Fri, Apr 24, 2015 at 10:33 AM, Brian Gerst <brgerst@xxxxxxxxx> wrote:
>
> To clarify, I was thinking of the CONFIG_PREEMPT case. A nested
> interrupt wouldn't change SS, and IST interrupts can't schedule.

It has absolutely nothing to do with nested interrupts or CONFIG_PREEMPT.

The problem happens simply because

- process A does a system call. SS=__KERNEL_DS

- the system call sleeps for whatever reason. SS is still __KERNEL_DS

- process B runs, returns to user space, and takes an interrupt. Now SS=0

- process B is about to return to user space (where the interrupt
happened), but we schedule back to process A as part of that regular
user-space return. SS=0

- process A returns to user space using sysret. The SS selector
becomes __USER_DS, but the cached descriptor remains non-present

Notice? No nested interrupts, no CONFIG_PREEMPT, nothing special at all.
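
The sequence above can be traced with a toy model. This is only a sketch, not kernel code: struct ss_reg, the helper names and the selector constants are made up for illustration. The model assumes, as described above, that the context switch leaves SS alone and that the AMD sysret updates the selector without refreshing the cached descriptor.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of SS: visible selector plus the hidden cached descriptor. */
struct ss_reg {
	unsigned short selector;
	bool cached_present;	/* "present" attribute of the cached descriptor */
};

/* Illustrative stand-ins for 0, __KERNEL_DS and __USER_DS. */
enum { SEL_NULL = 0, SEL_KERNEL_DS = 0x18, SEL_USER_DS = 0x2b };

/* System call entry: SS=__KERNEL_DS with a normal, present descriptor. */
static void syscall_entry(struct ss_reg *ss)
{
	ss->selector = SEL_KERNEL_DS;
	ss->cached_present = true;
}

/* iret to user space: SS is loaded properly, descriptor included. */
static void iret_to_user(struct ss_reg *ss)
{
	ss->selector = SEL_USER_DS;
	ss->cached_present = true;
}

/* Interrupt taken from user space: SS=0, cached descriptor non-present. */
static void interrupt_from_user(struct ss_reg *ss)
{
	ss->selector = SEL_NULL;
	ss->cached_present = false;
}

/* Context switch without the fix: SS is not touched at all. */
static void context_switch_unpatched(struct ss_reg *ss)
{
	(void)ss;
}

/* AMD sysret as described above: selector changes, cached descriptor does not. */
static void sysret_amd(struct ss_reg *ss)
{
	ss->selector = SEL_USER_DS;
}

/* Replay the steps on one CPU and return the final SS state. */
static struct ss_reg replay_bug(void)
{
	struct ss_reg ss = { SEL_NULL, false };

	syscall_entry(&ss);		/* A: system call, SS=__KERNEL_DS */
	context_switch_unpatched(&ss);	/* A sleeps, B runs */
	iret_to_user(&ss);		/* B returns to user space */
	interrupt_from_user(&ss);	/* B takes an interrupt, SS=0 */
	context_switch_unpatched(&ss);	/* schedule back to A */
	assert(ss.selector == SEL_NULL);	/* SS is still 0 here */
	sysret_amd(&ss);		/* A returns with sysret */
	return ss;
}
```

In this model replay_bug() ends with the selector set to SEL_USER_DS while cached_present is still false, which is the bad state.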

The reason Luto's patch fixes the problem is that now the scheduling
from B back to A will reload SS, making it __KERNEL_DS, but more
importantly, resetting the cached descriptor to the usual one with the
present flag set, which is what the AMD sysret instruction needs.
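
The effect of reloading SS at the context switch can be sketched the same way. Again this is a toy model under assumptions, not the patch itself: struct ss_reg, load_ss(), context_switch_patched() and the selector constants are hypothetical, and the model reloads SS unconditionally, which may not match how the real patch decides when to do it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of SS: visible selector plus the hidden cached descriptor. */
struct ss_reg {
	unsigned short selector;
	bool cached_present;
};

/* Illustrative stand-ins for 0, __KERNEL_DS and __USER_DS. */
enum { SEL_NULL = 0, SEL_KERNEL_DS = 0x18, SEL_USER_DS = 0x2b };

/* A real segment load: sets the selector and refreshes the cached descriptor. */
static void load_ss(struct ss_reg *ss, unsigned short sel)
{
	ss->selector = sel;
	ss->cached_present = (sel != SEL_NULL);
}

/* Context switch with the fix: reload SS so that it is __KERNEL_DS again. */
static void context_switch_patched(struct ss_reg *ss)
{
	load_ss(ss, SEL_KERNEL_DS);
}

/* AMD sysret as described above: selector changes, cached descriptor does not. */
static void sysret_amd(struct ss_reg *ss)
{
	ss->selector = SEL_USER_DS;
}

/* Start from B's interrupted state and switch back to A with the fix applied. */
static struct ss_reg replay_fixed(void)
{
	/* B was interrupted in user space: SS=0, descriptor non-present. */
	struct ss_reg ss = { SEL_NULL, false };

	context_switch_patched(&ss);	/* B -> A, SS reloaded */
	assert(ss.selector == SEL_KERNEL_DS && ss.cached_present);
	sysret_amd(&ss);		/* A returns with sysret */
	return ss;
}
```

Here replay_fixed() ends with the selector set to SEL_USER_DS and cached_present true, because sysret inherits the present descriptor that the reload put in place.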

Or do I misunderstand what you are talking about?

Linus