Re: [PATCH 05/22] x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe()

From: Sebastian Andrzej Siewior
Date: Wed Jan 16 2019 - 17:40:49 EST

On 2019-01-16 20:36:03 [+0100], Borislav Petkov wrote:
> On Wed, Jan 09, 2019 at 12:47:27PM +0100, Sebastian Andrzej Siewior wrote:
> > Since ->initialized is always true for user tasks and kernel threads
> > don't get this far,
> Yeah, this is commit message is too laconic. Don'g get this far "where"?

To reach copy_fpregs_to_sigframe(). A kernel thread never invokes
copy_fpregs_to_sigframe(). Which means only user threads invoke
copy_fpregs_to_sigframe() and they have ->initialized set to one.

> > we always save the registers directly to userspace.
> We don't save registers to userspace - please write stuff out.

Actually we do. copy_fpregs_to_sigframe() saves current FPU registers to
task's stack frame which is userspace memory.

> So from looking at what you're removing I can barely rhyme up what
> you're doing but this needs a lot more explanation why it is OK to
> remove the else case. Hell, why was the thing added in the first place
> if ->initialized is always true?

I think *parts* of the ->initialized field was wrongly converted while
lazy-FPU was removed *or* it was forgotten to be removed afterwards. Or
I don't know but it looks like a leftover.

At the beginning (while it was added) it was part of the lazy-FPU code.
So if tasks's FPU register are not active then they are saved in task's
FPU struct. So in this case (the else path) it does
__copy_to_user(buf_fx, xsave, fpu_user_xstate_size)

In the other case (task's FPU struct is not up-to date, the current
FPU register content is in CPU's registers) it does

which copies CPU's registers. In both cases it copies them (the FPU
registers) to the task's stack frame (the same location). Easy so far?

How does using_compacted_format() fit in here?
The point is that the "compacted" format is never exposed to
userland so it requires normal xsave. So far so good, right? But how
does it work in in the '->initialized = 0' case right? It was
introduced in commit
99aa22d0d8f7 ("x86/fpu/xstate: Copy xstate registers directly to the signal frame when compacted format is in use")

and it probably does not explain why this works, right?
So *either* fpregs_active() was always true if the task used FPU *once*
or if it used FPU *recently* and task's FPU register are active (I don't
remember anymore). Anyway:
a) we don't get here because caller checks for fpregs_active() before
invoking copy_fpstate_to_sigframe()
b) a preemption check resets fpregs_active() after the first check
then we do "xsave", xsaves traps because FPU is off/disabled, trap
loads task's FPU registers, gets back to "xsave", "xsave" saves
CPU's register to the stack frame.

The b part does not work like that since commit
bef8b6da9522 ("x86/fpu: Handle #NM without FPU emulation as an error")

but then at that point it was "okay" because fpregs_active() would
return true if the task used FPU registers at least once. If it did not
use them then it would not invoke that function (the caller checks for

> And why is it ok to save registers directly to the user task's buffers?
So I can't tell you why it is okay but I can explain why it is done
(well, that part I puzzled together).
The task is running and using FPU registers. Then an evil mind sends a
signal. The task goes into kernel, prepares itself and is about to
handle the signal in userland. It saves its FPU registers on the stack
frame. It zeros its current FPU registers (ready for a fresh start),
loads the address of the signal handler and returns to user land
handling the signal.

Now. The signal handler may use FPU registers and the signal handler
maybe be preempted so you need to save the FPU registers of the signal
handler and you can't mix them up with the FPU register's of the task
(before it started handling the signal).

So in order to avoid a second FPU struct it saves them on user's stack
frame. I *think* this (avoiding a second FPU struct) is the primary
motivation. A bonus point might be that the signal handler has a third
argument the `context'. That means you can use can access the task's FPU
registers from the signal handler. Not sure *why* you want to do so but
yo can.
I can't imagine a use case and I was looking for a user and expecting it
to be glibc but I didn't find anything in the glibc that would explain
it. Intel even defines a few bytes as "user reserved" which are used by
"struct _fpx_sw_bytes" to add a marker in the signal and recognise it on
The only user that seems to make use of that is `criu' (or it looked
like it does use it). I would prefer to add a second struct-FPU and use
that for the signal handler. This would avoid the whole dance here. And
`criu' could maybe become a proper interface. I don't think as of now
that it will break something in userland if the signal handler suddenly
does not have a pointer to the FPU struct.

> So please be more verbose even at the risk of explaning the obvious.
> This is the FPU code, remember? :)

Okay. So I was verbose *now*. Depending on what you say (or don't) I
will try to recycle this into commit message in a few days.

> Thx.