Re: [RFC 3/4] x86/signal/64: Re-add support for SS in the 64-bit signal context

From: Andy Lutomirski
Date: Wed Oct 14 2015 - 14:06:53 EST


On Wed, Oct 14, 2015 at 10:40 AM, Stas Sergeev <stsp@xxxxxxx> wrote:
> 14.10.2015 19:40, Andy Lutomirski wrote:
>>>> + *
>>>> + * Kernels that set UC_SIGCONTEXT_SS will also set UC_STRICT_RESTORE_SS
>>>> + * when delivering a signal that came from 64-bit code.
>>>> + *
>>>> + * Sigreturn modifies its behavior depending on the UC_STRICT_RESTORE_SS
>>>> + * flag. If UC_STRICT_RESTORE_SS is set, then the SS value in the
>>>> + * signal context is restored verbatim. If UC_STRICT_RESTORE_SS is not
>>>> + * set, the CS value in the signal context refers to a 64-bit code
>>>> + * segment, and the signal context's SS value is invalid, it will be
>>>> + * replaced by a flat 32-bit selector.
> Is this correct?
> It says "64-bit code segment will use the 32-bit SS".
> I guess you mean 64-bit SS instead of a 32-bit?

There is no such thing as a 64-bit SS. The case this is guarding against is:

void handler(...) {
    ctx->cs = [some 64-bit value];
    modify_ldt(zap the old SS);
    return;
}

Old DOSEMU does this IIUC. It's trying to switch back to 64-bit mode,
and the selector value that gets loaded into SS doesn't matter, but
something *valid* needs to be loaded. (Remember the weird ISA design:
the SS descriptor is basically irrelevant in 64-bit mode, but it still
has to be valid.) On old kernels, this works, because sigreturn zaps
SS unconditionally. On new kernels, it'll be interpreted as an attempt
to change CS and restore the old SS, but that SS is no longer valid.
The fixup is to avoid sending a new signal and to instead do what
DOSEMU expected.
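
As a rough sketch of that rule (this is not the patch itself; the
struct, the helper name, and both constants are made up for
illustration, and the two booleans stand in for checks the kernel
would do against the descriptor tables):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag value; the real one is defined by the kernel headers. */
#define SKETCH_UC_STRICT_RESTORE_SS 0x4u

/* Placeholder for "some flat data selector"; the exact value is an assumption. */
#define SKETCH_FLAT_SS 0x2bu

/* The inputs the decision depends on, reduced to plain fields. */
struct sketch_sigctx {
    uint32_t uc_flags;    /* flags from the saved ucontext */
    uint16_t ss;          /* SS value stored in the signal context */
    bool     cs_is_64bit; /* saved CS refers to a 64-bit code segment */
    bool     ss_is_valid; /* saved SS still names a usable descriptor */
};

/* Returns the selector that sigreturn would try to load into SS. */
static uint16_t sketch_ss_to_restore(const struct sketch_sigctx *c)
{
    assert(c != NULL);

    /* Strict mode: honor the saved SS verbatim, valid or not. */
    if (c->uc_flags & SKETCH_UC_STRICT_RESTORE_SS)
        return c->ss;

    /* Legacy fixup: 64-bit CS plus a dead SS gets a flat selector. */
    if (c->cs_is_64bit && !c->ss_is_valid)
        return SKETCH_FLAT_SS;

    /* Everything else: default behavior, use the saved SS. */
    return c->ss;
}
```

The old-DOSEMU case is the middle branch: the flag is clear, CS is
64-bit, and the saved SS has been zapped from the LDT.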

> Also it doesn't seem to be saying what happens if CS is 32-bit
> and SS is invalid (the flag is not set).

A new signal will be delivered. sigreturn doesn't modify its behavior
in this case -- it does the default thing, which is to honor the SS in
the saved context. So it will actually try to use that saved SS
value, which will fail, causing SIGSEGV.
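
Spelled out as a decision table, the outcomes look roughly like this.
Again a sketch with hypothetical names, and the "strict flag set,
invalid SS" row is my inference from "restored verbatim" rather than
something stated in the patch comment:

```c
#include <stdbool.h>

/* What userspace observes after sigreturn, in this simplified model. */
enum sketch_outcome {
    SKETCH_RESUME_SAVED_SS, /* saved SS is loaded and execution resumes */
    SKETCH_RESUME_FLAT_SS,  /* saved SS is replaced by a flat selector */
    SKETCH_SIGSEGV,         /* loading the saved SS fails, new signal */
};

static enum sketch_outcome
sketch_sigreturn_outcome(bool strict, bool cs_is_64bit, bool ss_is_valid)
{
    /* A valid saved SS is always honored. */
    if (ss_is_valid)
        return SKETCH_RESUME_SAVED_SS;

    /* Invalid SS: only the non-strict, 64-bit-CS case gets the fixup. */
    if (!strict && cs_is_64bit)
        return SKETCH_RESUME_FLAT_SS;

    /* 32-bit CS, or strict mode: the bad SS is used as is and faults. */
    return SKETCH_SIGSEGV;
}
```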

>
>>>> This is a bit risky, and another option would be to do nothing at
>>>> all.
>>> Andy, could you please stop pretending there are no other solutions?
>>> You do not have to like them. You do not have to implement them.
>>> But your continuous re-assertions that they do not exist, make me
>>> feel a bit uncomfortable after I spelled them many times.
>>>
>>>> Stas, what do you think? Could you test this?
>>> I think I'll get to testing this only at a week-end.
>>> In the meantime, the question about the safety of leaving LDT SS
>>> in 64bit mode still makes me wonder. Perhaps, instead of re-iterating
>>> this here, you can describe this all in the patch comments? Namely:
>>> - How will LDT SS interact with nested signals
>>
>> The kernel doesn't think about nested signals. If the inner signal is
>> delivered while SS is in the LDT, the kernel will try to keep it as is
>> and will stick whatever was in SS when the signal happened in the
>> inner saved context. On return to the outer signal, it'll restore it
>> following the UC_STRICT_RESTORE_SS rules.
> Good.
>
>>> - with syscalls
>>
>> 64-bit syscalls change SS to some default flat value as a side-effect.
>> (Actually, IIRC, 64-bit syscalls change it specifically to __USER_DS,
>> but, on Xen, 64-bit fast syscall returns may silently flip it to a
>> different flat selector.)
> Do we need this?
> Maybe it should stop doing so?
>

It's a performance trick IIUC. I don't know enough about Xen's
innards to know whether this could be cleaned up without incurring
nasty overheads. As a guess, Xen doesn't want to change the MSR
controlling the SYSRET selector layout when switching guests, so it
uses a default value that doesn't match Linux's. Linux mostly ignores
this, and it only really matters if user code cares which flat
selector gets loaded.

This shouldn't have much effect on segmented programs, as they don't
use SYSRET in segmented contexts.
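
If you want to see which flat selector you end up with after a 64-bit
syscall, something like the following should do. It's a sketch that
assumes x86-64 Linux with GCC or Clang inline asm; the helper names are
made up, and on other architectures the helper just returns 0:

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Read the current SS selector; returns 0 on non-x86-64 builds. */
static unsigned short sketch_read_ss(void)
{
#if defined(__x86_64__)
    unsigned short ss;

    /* Copy the SS selector into a general-purpose register. */
    __asm__ volatile("mov %%ss, %0" : "=r"(ss));
    return ss;
#else
    return 0;
#endif
}

/* Issue a cheap syscall, then report whatever SS holds afterwards. */
static unsigned short sketch_ss_after_syscall(void)
{
    (void)syscall(SYS_getpid);
    return sketch_read_ss();
}
```

Calling sketch_ss_after_syscall() from main() and printing the result
in hex shows the selector; the low two bits are the RPL and should be
3 in user mode.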

>>> - with siglongjmp()
>>
>> siglongjmp is a glibc thing. It should work the same way it always
>> did. If it internally does a syscall (sigprocmask or whatever), that
>> will override SS.
> IMHO this side-effect needs to be documented somewhere.
> I was scared about using it because I thought SS could be left bad.
> Why I think it IS the kernel's problem is because in an ideal world
> the sighandler should not run with LDT SS at all, so there will be no
> fear about a bad SS after siglongjmp().

I agree, but that ship sailed quite a few years ago :(

> And if the sigprocmask() will
> sometime stop validating SS, this can lead to surprises.

Not possible without ISA changes. The SYSCALL instruction itself
forgets the old SS value.

In any event, there's not much the kernel can do about this. You
could ask the glibc people to document some well-defined behavior in
their man pages.

--Andy