Re: What exactly do 32-bit x86 exceptions push on the stack in the CS slot?

From: Andy Lutomirski
Date: Tue Nov 22 2016 - 12:30:40 EST


On Tue, Nov 22, 2016 at 12:30 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>> On Sun, Nov 20, 2016 at 11:13 PM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>> >
>> > So I have applied your fix that addresses the worst fallout directly:
>> >
>> > fc0e81b2bea0 x86/traps: Ignore high word of regs->cs in early_fixup_exception()
>> >
>> > ... but otherwise we might be better off zeroing out the high bits of segment
>> > registers stored on the stack, in all entry code pathways
>>
>> Ugh.
>>
>> I'd much rather we go back to just making the "cs" entry explicitly
>> 16-bit, and have a separate padding entry, the way we used to long
>> long ago.
>>
>> Or just rename it to something that you're not supposed to access
>> directly, and a helper accessor function that masks off the high bits.
>>
>> The entry code-paths are *much* more critical than any of the few user
>> codepaths.
>
> Absolutely, no arguments about that!
>
>> [...] Let's not add complexity to entry. Make the structure actually reflect
>> reality instead.
>
> So I have no problems at all with your suggestion either.
>
> I am still trying to semi-defend my suggestion as well, because if we do what I
> suggested:
>
>> > [...] so that the function call is patched out on modern CPUs.
>
> then it's essentially an opt-in quirk for really old CPUs and won't impact new
> CPUs, other than a single NOP for the patched out bits - and not even that on
> kernel builds with M686 or later or so ...
>
> I.e. the quirk essentially implements what new CPUs do (in C), and then all
> remaining code can just assume that all data is properly initialized/zeroed like
> on new CPUs and the effects of the quirk do not spread to data structures and
> code that handles and copies around those data structures - unless I'm missing
> something.

The SDM says:

If the source operand is an immediate of size less than the operand
size, a sign-extended value is pushed on the stack. If the source
operand is a segment register (16 bits) and the operand size is
64-bits, a zero-extended value is pushed on the stack; if the operand
size is 32-bits, either a zero-extended value is pushed on the stack
or the segment selector is written on the stack using a 16-bit move.
For the last case, all recent Core and Atom processors perform a
16-bit move, leaving the upper portion of the stack location
unmodified.

This makes me think that even new processors are quirky.
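
To make the difference concrete, here is a rough C model of the two
behaviors the SDM allows for a 32-bit push of a segment register. This
is only a sketch: the function names are made up for illustration and
are not kernel or hardware interfaces, and the "stack slot" is just a
uint32_t holding whatever stale data was there before the push.

```c
#include <stdint.h>

/*
 * Hypothetical model of a 32-bit stack slot that receives a 16-bit
 * segment selector. None of these helpers exist in the kernel.
 */

/* Zero-extending push: the whole 32-bit slot is overwritten. */
static inline uint32_t push_sel_zero_extended(uint32_t old_slot, uint16_t sel)
{
	(void)old_slot;		/* previous contents are discarded */
	return (uint32_t)sel;
}

/* 16-bit move: only the low half is written, the high half stays stale. */
static inline uint32_t push_sel_16bit_move(uint32_t old_slot, uint16_t sel)
{
	return (old_slot & 0xffff0000u) | sel;
}

/* Reader that ignores the high word, so either push behavior is safe. */
static inline uint16_t slot_selector(uint32_t slot)
{
	return (uint16_t)(slot & 0xffffu);
}
```

With a stale slot of 0xdeadbeef and a selector of 0x0060, the 16-bit
move leaves 0xdead0060 in the slot while the zero-extending push leaves
0x00000060. Code that compares the full 32-bit value sees different
results on the two kinds of CPU; code that goes through a masking
reader like slot_selector() sees 0x0060 in both cases.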

--Andy