Re: [PATCH v1] x86: Pin cr4 FSGSBASE

From: Greg KH
Date: Wed May 27 2020 - 03:07:59 EST


On Tue, May 26, 2020 at 07:24:03PM +0200, Wojtek Porczyk wrote:
> On Tue, May 26, 2020 at 06:32:35PM +0200, Greg KH wrote:
> > On Tue, May 26, 2020 at 08:48:35AM -0700, Andi Kleen wrote:
> > > On Tue, May 26, 2020 at 08:56:18AM +0200, Greg KH wrote:
> > > > On Mon, May 25, 2020 at 10:28:48PM -0700, Andi Kleen wrote:
> > > > > From: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> > > > >
> > > > > Since there seem to be kernel modules floating around that set
> > > > > FSGSBASE incorrectly, prevent this in the CR4 pinning. Currently
> > > > > > CR4 pinning just checks that bits are set; this also checks
> > > > > > that the FSGSBASE bit is not set, and if it is, clears it again.
> > > >
> > > > So we are trying to "protect" ourselves from broken out-of-tree kernel
> > > > modules now?
> > >
> > > Well it's a specific case where we know they're opening a root hole
> > > unintentionally. This is just a pragmatic attempt to protect the users in the
> > > short term.
> >
> > Can't you just go and fix those out-of-tree kernel modules instead?
> > What's keeping you all from just doing that instead of trying to force
> > the kernel to play traffic cop?
>
> We'd very much welcome any help really, but we're under the impression that
> this couldn't be done correctly in a module, so this hack occurred.

Really? How is this hack anything other than a "prevent a kernel module
from doing something foolish" change?

Why can't you just change the kernel module's code to not do this? What
prevents that from happening right now? It would remove the need to
change a core API to stop it from being abused in such a way.

> This was written in 2015 as part of the original (research) codebase, for
> these reasons:
> - A module is easier to deploy for scientists, who are neither kernel
>   developers nor sysadmins, so applying a patchset and recompiling the
>   kernel is a big ask.
> - It has no implications for security in the SGX/Graphene threat model and
>   in the expected deployment scenario.
> - This had no bearing on the actual research being done, so it wasn't
>   cared about.
>
> Let me expand on the second point, because I understand both the module and
> the explanation look wrong.
>
> Graphene is intended to be run in a cloud, where CPU time is sold in the
> form of virtual machines, so the VM kernel, which would load this module, is
> not trusted by the hardware owner, so s/he doesn't care. But the owner of the
> enclave also doesn't care, because SGX's threat model assumes an adversary
> who is capable of arbitrary code execution in both kernel and userspace
> outside the enclave. So the kernel immediately outside the enclave is a
> no-man's land, untrusted by both sides and forsaken, reduced to a
> compatibility layer between x86 and ELF.
>
> I acknowledge this is an unusual threat model, certainly to mainline
> developers, who rarely encounter userspace that is more trusted than the
> kernel.
>
> What we've failed at is to properly explain this, because if someone loads
> this module outside of this expected scenario, they will certainly be exposed
> to a gaping root hole. Therefore we acknowledge this patch and as part of
> Graphene we'll probably maintain a patchset, until the support is upstream.
> Right now it will take us some time to move away from our current kernel
> interfaces.

I'm sorry, but I still do not understand. Your kernel module calls the
core with this bit being set, and this new kernel patch is there to
prevent the bit from being set and will WARN_ON() if it happens. Why
can't you just change your module code to not set the bit?

Do you have a pointer to the kernel module code that does this operation
which this core kernel change will try to prevent from happening?

thanks,

greg k-h