Re: [patch 31/60] x86/mm/kpti: Add mapping helper functions

From: Andy Lutomirski
Date: Mon Dec 04 2017 - 17:28:27 EST


On Mon, Dec 4, 2017 at 6:07 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
>
> Add the pagetable helper functions to manage the separate user space page
> tables.
>
> [ tglx: Split out from the big combo kaiser patch ]

> +/*
> + * Take a PGD location (pgdp) and a pgd value that needs to be set there.
> + * Populates the user and returns the resulting PGD that must be set in
> + * the kernel copy of the page tables.
> + */
> +static inline pgd_t kpti_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
> +{
> +#ifdef CONFIG_KERNEL_PAGE_TABLE_ISOLATION
> + if (!static_cpu_has_bug(X86_BUG_CPU_SECURE_MODE_KPTI))
> + return pgd;
> +
> + if (pgd_userspace_access(pgd)) {
> + if (pgdp_maps_userspace(pgdp)) {
> + /*
> + * The user page tables get the full PGD,
> + * accessible from userspace:
> + */
> + kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;
> + /*
> + * For the copy of the pgd that the kernel uses,
> + * make it unusable to userspace. This ensures that,
> + * on a return to userspace with the
> + * kernel CR3 value, userspace will crash instead
> + * of running.
> + *
> + * Note: NX might be not available or disabled.
> + */
> + if (__supported_pte_mask & _PAGE_NX)
> + pgd.pgd |= _PAGE_NX;
> + }
> + } else if (pgd_userspace_access(*pgdp)) {
> + /*
> + * We are clearing a _PAGE_USER PGD for which we presumably
> + * populated the user PGD. We must now clear the user PGD
> + * entry.
> + */
> + if (pgdp_maps_userspace(pgdp)) {
> + kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;
> + } else {
> + /*
> + * Attempted to clear a _PAGE_USER PGD which is in
> + * the kernel portion of the address space. PGDs
> + * are pre-populated and we never clear them.
> + */
> + WARN_ON_ONCE(1);
> + }
> + } else {
> + /*
> + * _PAGE_USER was not set in either the PGD being set or
> + * cleared. All kernel PGDs should be pre-populated so
> + * this should never happen after boot.
> + */
> + WARN_ON_ONCE(system_state == SYSTEM_RUNNING);
> + }
> +#endif
> + /* return the copy of the PGD we want the kernel to use: */
> + return pgd;
> +}
> +

I mentioned this earlier, but I think this should be:


VM_BUG_ON(pgdp points to a usermode table);

if (pgdp_maps_userspace(pgdp)) {
	/* Install the pgd as requested into the usermode tables. */
	kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;

	if (pgd_val(pgd) & _PAGE_USER) {
		/*
		 * This is a normal user pgd -- the kernelmode mapping should have NX
		 * set to prevent erroneous usermode execution with the kernel tables.
		 */
		return __pgd(pgd_val(pgd) | _PAGE_NX);
	} else {
		/* This is a weird mapping, e.g. EFI. Map it straight through. */
		return pgd;
	}
} else {
	/*
	 * We can get here due to vmalloc, a vmalloc fault, memory hot-add, or
	 * initial setup of kernelmode page tables. Regardless of which
	 * particular code path we're in, these mappings should not be
	 * automatically propagated to the usermode tables.
	 */
	return pgd;
}
}

That should make all the VSYSCALL nastiness go away.
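
To make the three cases concrete, here is a minimal userspace sketch of the logic above. It is not kernel code: the model_* names, the flag constants, and the table layout are assumptions made for illustration. The layout assumed is a 4 KiB kernel PGD page immediately followed by its usermode copy, 8 KiB aligned, so that bit 12 of an entry pointer selects the usermode copy, with the lower half of each page covering user addresses. assert() stands in for VM_BUG_ON, and NX is set unconditionally, as in the pseudocode.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Illustrative model only; none of these names are the kernel's. */
#define MODEL_PAGE_SIZE    4096u
#define MODEL_PTRS_PER_PGD 512u
#define MODEL_PAGE_USER    (1ULL << 2)
#define MODEL_PAGE_NX      (1ULL << 63)

typedef struct { uint64_t pgd; } model_pgd_t;

/* Entries 0..511: kernel PGD page. Entries 512..1023: usermode copy. */
static alignas(2 * MODEL_PAGE_SIZE) model_pgd_t model_tables[2 * MODEL_PTRS_PER_PGD];

/* Assumed layout: bit 12 of the pointer is set only inside the usermode copy. */
static int model_pgdp_is_user_table(const model_pgd_t *pgdp)
{
	return ((uintptr_t)pgdp & MODEL_PAGE_SIZE) != 0;
}

/* Assumed split: the lower half of a PGD page covers user addresses. */
static int model_pgdp_maps_userspace(const model_pgd_t *pgdp)
{
	return ((uintptr_t)pgdp & (MODEL_PAGE_SIZE - 1)) < MODEL_PAGE_SIZE / 2;
}

static model_pgd_t *model_kernel_to_user_pgdp(model_pgd_t *pgdp)
{
	return (model_pgd_t *)((uintptr_t)pgdp | MODEL_PAGE_SIZE);
}

/* Returns the value the caller should store in the kernel copy. */
static model_pgd_t model_set_user_pgd(model_pgd_t *pgdp, model_pgd_t pgd)
{
	/* Stand-in for VM_BUG_ON(pgdp points to a usermode table). */
	assert(!model_pgdp_is_user_table(pgdp));

	/* Kernel half: never propagated to the usermode copy. */
	if (!model_pgdp_maps_userspace(pgdp))
		return pgd;

	/* User half: install the value as requested in the usermode copy. */
	model_kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;

	/* Normal user pgd: the kernel copy gets NX. Otherwise pass through. */
	if (pgd.pgd & MODEL_PAGE_USER)
		pgd.pgd |= MODEL_PAGE_NX;
	return pgd;
}

/* Returns 0 if all three cases behave as described, else the failing case. */
static int model_demo(void)
{
	model_pgd_t *kernel = model_tables;
	model_pgd_t *user = model_tables + MODEL_PTRS_PER_PGD;
	model_pgd_t user_val = { 0x1000 | MODEL_PAGE_USER };
	model_pgd_t efi_val = { 0x2000 };	/* user half, no _PAGE_USER */
	model_pgd_t kern_val = { 0x3000 };
	model_pgd_t ret;

	ret = model_set_user_pgd(&kernel[0], user_val);
	if (user[0].pgd != user_val.pgd || ret.pgd != (user_val.pgd | MODEL_PAGE_NX))
		return 1;

	ret = model_set_user_pgd(&kernel[1], efi_val);
	if (user[1].pgd != efi_val.pgd || ret.pgd != efi_val.pgd)
		return 2;

	ret = model_set_user_pgd(&kernel[256], kern_val);
	if (user[256].pgd != 0 || ret.pgd != kern_val.pgd)
		return 3;

	return 0;
}
```

In this model the only input that decides whether the usermode copy is written is which half of the PGD page the entry lives in; _PAGE_USER only decides whether the kernel copy gets NX.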