Re: hv_hypercall_pg page permissions

From: Christoph Hellwig
Date: Tue Jun 16 2020 - 06:24:19 EST


On Tue, Jun 16, 2020 at 12:23:50PM +0200, Christoph Hellwig wrote:
> On Tue, Jun 16, 2020 at 12:18:07PM +0200, Peter Zijlstra wrote:
> > > It does. But it also means every other user of PAGE_KERNEL_EXEC
> > > should trigger this, of which there are a few (kexec, tboot, hibernate,
> > > early xen pv mapping, early SEV identity mapping)
> >
> > There are only 3 users in the entire tree afaict:
> >
> > arch/arm64/kernel/probes/kprobes.c: page = vmalloc_exec(PAGE_SIZE);
> > arch/x86/hyperv/hv_init.c: hv_hypercall_pg = vmalloc_exec(PAGE_SIZE);
> > kernel/module.c: return vmalloc_exec(size);
> >
> > And that last one is a weak function that any arch that has STRICT_RWX
> > ought to override.
> >
> > > We really shouldn't create mappings like this by default. Either we
> > > need to flip PAGE_KERNEL_EXEC itself based on the needs of the above
> > > users, or add another define to overload vmalloc_exec as there is no
> > > other user of that for x86.
> >
> > We really should get rid of the two !module users of this though; both
> > x86 and arm64 have STRICT_RWX and sufficient primitives to DTRT.
> >
> > What is HV even trying to do with that page? AFAICT it never actually
> > writes to it, it seems to give the physical address to an MSR (which I
> > suspect then writes crud into the page for us from host context).
> >
> > Suggesting the page really only needs to be RX.
> >
> > On top of that, vmalloc_exec() gets us a page from the entire vmalloc
> > range, which can be outside of the 2G executable range, which seems to
> > suggest vmalloc_exec() is wrong too and all this works by accident.
> >
> > How about something like this:
> >
> >
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index a54c6a401581..82a3a4a9481f 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -375,12 +375,15 @@ void __init hyperv_init(void)
> >  	guest_id = generate_guest_id(0, LINUX_VERSION_CODE, 0);
> >  	wrmsrl(HV_X64_MSR_GUEST_OS_ID, guest_id);
> >
> > -	hv_hypercall_pg = vmalloc_exec(PAGE_SIZE);
> > +	hv_hypercall_pg = module_alloc(PAGE_SIZE);
> >  	if (hv_hypercall_pg == NULL) {
> >  		wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
> >  		goto remove_cpuhp_state;
> >  	}
> >
> > +	set_memory_ro((unsigned long)hv_hypercall_pg, 1);
> > +	set_memory_x((unsigned long)hv_hypercall_pg, 1);
>
> The changing of the permissions sucks. I thought about adding
> a module_alloc_prot with an explicit pgprot_t argument. On x86
> alone at least ftrace would also benefit from that.

The above is also missing a set_vm_flush_reset_perms.
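
For illustration only, an untested sketch of how the quoted hunk might
look with that call added. It assumes the call is placed right before
the permission changes, and the comment is an addition for this sketch,
not part of the proposed patch:

```diff
-	hv_hypercall_pg = vmalloc_exec(PAGE_SIZE);
+	hv_hypercall_pg = module_alloc(PAGE_SIZE);
 	if (hv_hypercall_pg == NULL) {
 		wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
 		goto remove_cpuhp_state;
 	}

+	/* sketch: let vfree() reset the page permissions and flush the TLB */
+	set_vm_flush_reset_perms(hv_hypercall_pg);
+	set_memory_ro((unsigned long)hv_hypercall_pg, 1);
+	set_memory_x((unsigned long)hv_hypercall_pg, 1);
```

A module_alloc_prot() with an explicit pgprot_t argument would make the
two set_memory_*() calls unnecessary.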