Re: [RFC PATCH v2] x86/arch_prctl: Add ARCH_SET_XCR0 to set XCR0 per-thread
From: Dave Hansen
Date: Tue Apr 07 2020 - 09:14:36 EST
I didn't review the first attempt at this, but this looks like a really
bad idea to me. I don't 100% buy the arguments that this shouldn't be
done in a VM. The x86 virtualization architecture was literally
designed to hide hardware details like this.
I can't imagine ever merging this until using VMs or at least the KVM
API has been completely ruled out.
I don't doubt that this approach works in a controlled environment.
But, I can't imagine that we could ever wrap sane semantics around such
a beast.
For instance, what would ARCH_SET_XCR0 do in a signal handler? The
components are mostly in their init states so xstate_is_initial() will
pass. But the kernel's XRSTOR might #GP because it might try to restore
states that are not enabled in XCR0 now.
Also, this has to be done before XCR0's state is observed by userspace,
which the kernel can't easily do once the program is off and running.
This tells me that it really needs to be done at execve() time, even
before the first program instruction runs.
These are usually the kinds of things that you figure out when you try
to go write a manpage for one of these suckers.
How does this work with things like xstateregs_[gs]et() where the format
of the kernel buffer and thus the kernel XCR0 is exposed as part of our
ABI? With this patch, wouldn't a debugger app see a state buffer that
looks invalid?
Where else in our ABI is the format of the XSAVE buffer exposed?
I'm extra wary of a v2 that's missing the CodingStyle and changelog
basics as well.
> +static int xcr0_is_legal(unsigned long xcr0)
> +{
> + /* Conservatively disallow anything above bit 9,
> + * to avoid accidentally allowing the disabling of
> + * new features without updating these checks
> + */
> + if (xcr0 & ~((1 << 10) - 1))
> + return 0;
Yay, magic numbers!
This would be much better as a BUILD_BUG_ON().
> + if (!(xcr0 & XFEATURE_MASK_FP))
> + return 0;
> + if ((xcr0 & XFEATURE_MASK_YMM) && !(xcr0 & XFEATURE_MASK_SSE))
> + return 0;
> + if ((!(xcr0 & XFEATURE_MASK_BNDREGS)) !=
> + (!(xcr0 & XFEATURE_MASK_BNDCSR)))
> + return 0;
> + if (xcr0 & XFEATURE_MASK_AVX512) {
> + if (!(xcr0 & XFEATURE_MASK_YMM))
> + return 0;
> + if ((xcr0 & XFEATURE_MASK_AVX512) != XFEATURE_MASK_AVX512)
> + return 0;
> + }
> + return 1;
> +}
This appears to copy (instead of refactoring) code from __kvm_set_xcr(),
yet manages to get the indentation different and wrong.