Re: Rethinking sigcontext's xfeatures slightly for PKRU's benefit?

From: Andy Lutomirski
Date: Fri Dec 18 2015 - 15:28:49 EST


On Fri, Dec 18, 2015 at 12:07 PM, Dave Hansen
<dave.hansen@xxxxxxxxxxxxxxx> wrote:
> On 12/18/2015 11:21 AM, Andy Lutomirski wrote:
>> On Fri, Dec 18, 2015 at 10:42 AM, Dave Hansen
>> <dave.hansen@xxxxxxxxxxxxxxx> wrote:
>>> On 12/18/2015 08:04 AM, Andy Lutomirski wrote:
>>>> 1b. If the app malfunctions such that RSP points to pmem, the kernel
>>>> MUST NOT clobber the pmem space. I think that this basically mandates
>>>> that PKRU needs to have some safe state (i.e. definitely not the init
>>>> state) on signal delivery: the kernel is going to write a signal frame
>>>> at the address identified by RSP, and that address is in pmem, so
>>>> those writes need to fail.
>>>
>>> The kernel is writing the signal frame using normal old copy_to_user().
>>> Those are writing through mappings with _PAGE_USER set and should be
>>> subject to the PKRU state of the thread before the signal started to be
>>> delivered.
>>>
>>> We don't do the fpu__clear() until after this copy, so I think pkeys
>>> enforcement is being done properly for this today.
>>
>> True, but I think only in a very limited sense. Your average signal
>> handler is reasonably like to execute "push $rbp" as its very first
>> instruction, at which point we're immediately screwed with the current
>> arrangement.
>
> I completely agree that there's a window for corruption.
>
> But, I think it's a small one. Basically, RSP would have to pointing at
> a place which was allowed by protection keys for all of the sigframe
> setup. Then, _just_ happened to be at a place which was denied by
> protection keys when it enters the signal handler back in userspace.
> It's possible, but it's a small window.

That's true.

OTOH, I think that the signal in the middle of the wrpkru-allowed
window is also a compelling case. So maybe my new preference is
option B, where we save and restore PKRU like all the other xfeatures
but, rather than clearing it in the signal context, we set it to the
syscall-specified value.

If we do this, then we'd have to clearly separate the
syscall-specified value from the xstate value, and we'd probably want
one of the syscalls to offer an option to force user PKRU to match the
syscall value. Maybe your code already does this -- I didn't check.

This has the nice property that sigreturn isn't affected at all except
insofar as we should give a little bit of thought to the ordering of
restoring PKRU relative to any other uaccess. I'd imagine that we'd
just restore PKRU last, in case the stack we're restoring from has
access disallowed by the saved PKRU value.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/