Re: [RFC PATCH v2] x86/arch_prctl: Add ARCH_SET_XCR0 to set XCR0 per-thread

From: Keno Fischer
Date: Tue Apr 07 2020 - 13:56:20 EST


On Tue, Apr 7, 2020 at 12:27 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>
> >> How does this work with things like xstateregs_[gs]et() where the format
> >> of the kernel buffer and thus the kernel XCR0 is exposed as part of our
> >> ABI? With this patch, wouldn't a debugger app see a state buffer that
> >> looks invalid?
> >
> > Since those operate on the in-kernel buffer of these, which
> > in this patch always uses the unmodified XCR0, ptracers
> > should not observe a difference.
>
> Those operate on *BOTH* kernel and userspace buffers. They copy between
> them. That's kinda the point. :)

Right, what I meant was that in this patch the kernel level
xsaves that populates the struct fpu always runs with
an unmodified XCR0, so the contents of the xsave area
in struct fpu does not change layout (this is the major
change in this patch over v1). Are you referring to a ptracer
which runs with a modified XCR0, and assumes that the
value it gets back from ptrace will have an XSTATE_BV
equal to its own observed XCR0 and thus get confused
about the layout of the buffer (or potentially have not copied
all of the relevant xstate because it calculated a wrong buffer
size)? If so, I think that's just a buggy ptracer. The kernel's
xfeature mask is available via ptrace and a well-behaved
ptracer should use that (e.g. gdb does, though it looks like
it then also assumes that the xstate has no holes, so it
potentially gets the layout wrong anyway).

In general, I don't really want the modified XCR0 to affect
anything other than the particular instructions that depend
on it and maybe the signal frame (though as I said before,
I'm open to either here).

If I misunderstood what you were trying to say, I apologize.

> I suspect the patch thus far is only the tip of the iceberg. I'd really
> suggest doing some more thorough audits of all of the locations in the
> kernel that touch the fpu buffer *or* that call XSAVE/XRSTOR. I'm
> pretty sure that audit hasn't been done or the ptrace example would have
> been found already.

Yes, good idea. I will do this again. That said, I'm hoping that
the general principle to use the kernel layout, except perhaps
in the signal frame will hold and thus not require modification.

> But, it's also not my daily go-to for debugging.

Luckily rr is fast enough (after much work) that there's almost
never a reason not to use it. Our developers essentially use
it exclusively (rather than raw gdb), and our users send us rr
traces as bug reports. It's basically become our one-stop-shop
for all things debugging on Linux.