Re: [tip:x86/vdso] x86/vdso32/syscall.S: Do not load __USER32_DS to %ss

From: Andy Lutomirski
Date: Thu Apr 23 2015 - 14:24:42 EST


On Thu, Apr 23, 2015 at 10:14 AM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Thu, Apr 23, 2015 at 09:50:17AM -0700, Andy Lutomirski wrote:
>> On Thu, Apr 23, 2015 at 9:41 AM, Denys Vlasenko
>> <vda.linux@xxxxxxxxxxxxxx> wrote:
>> > An alternative fix would be, if we decided to schedule
>> > in an interrupt, check %ss for zero and reload it
>> > with __KERNEL_DS before schedule.
>>
>> For anyone who has the right hardware (not me!), a possible reproducer is here:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/luto/misc-tests.git/
>>
>> make && taskset -c 0 ./sysret_ss_attrs_32
>
> [ 195.438441] traps: sysret_ss_attrs[1745] trap stack segment ip:f77dab87 sp:ffdf0b70 error:0
> [ 196.831952] traps: sysret_ss_attrs[1748] trap stack segment ip:f7786b87 sp:fffc0810 error:0
>
> Ran it twice.

That nails it. We really do leak segment limits to other tasks on AMD
chips. I see at least two questions we should answer before fixing
this:

1. Do we consider this to be enough of a security issue that we want
to fix it for 64-bit userspace as well?

2. Do we fix it at sysret time (at the cost of an ss read even in the
best case on AMD chips) or at context switch time (with the risk of
more ss writes than necessary)?
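
To make the trade-off in question 2 concrete, here is a toy user-space
model, not kernel code. Every name in it (the event enum, struct cost,
both functions) is hypothetical and invented for illustration. It
assumes a context switch either does or does not leave a bad SS cached
descriptor behind, and it only counts ss reads and writes for each
strategy over a made-up event trace.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical events for a toy trace; these are not kernel identifiers. */
enum event {
	EV_SYSRET,       /* return to userspace via sysret */
	EV_CTXSW_CLEAN,  /* context switch, SS cached descriptor still sane */
	EV_CTXSW_DIRTY   /* context switch that leaves a bad SS behind (assumed) */
};

struct cost {
	unsigned ss_reads;
	unsigned ss_writes;
};

/* Strategy A: check ss on every sysret, reload only when it is bad. */
struct cost fix_at_sysret(const enum event *trace, size_t n)
{
	struct cost c = { 0, 0 };
	bool ss_bad = false;

	for (size_t i = 0; i < n; i++) {
		switch (trace[i]) {
		case EV_CTXSW_DIRTY:
			ss_bad = true;
			break;
		case EV_CTXSW_CLEAN:
			break;
		case EV_SYSRET:
			c.ss_reads++;           /* paid even in the best case */
			if (ss_bad) {
				c.ss_writes++;  /* reload only when needed */
				ss_bad = false;
			}
			break;
		}
	}
	return c;
}

/* Strategy B: reload ss unconditionally at every context switch. */
struct cost fix_at_context_switch(const enum event *trace, size_t n)
{
	struct cost c = { 0, 0 };

	for (size_t i = 0; i < n; i++) {
		if (trace[i] != EV_SYSRET)
			c.ss_writes++;  /* may be more writes than necessary */
	}
	return c;
}
```

In this model a syscall-heavy trace with few switches makes strategy A
pay mostly in reads, while a switch-heavy trace makes strategy B pay in
writes that were often not needed. It says nothing about the real
cycle cost of either operation.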

I slightly favor fixing it at sysret time for both the 32-bit and
64-bit paths, but I'm not really convinced.

Regardless, I'd rather fix this in the kernel than the vdso. I see no
reason at all that we should ever return to 32-bit userspace with a
corrupt SS cached descriptor.

(OK, tiny lie. The vdso approach avoids a nop somewhere on Intel
CPUs. Big deal.)

--Andy