Re: [PATCH RFC v8 17/56] x86/fault: Add support to handle the RMP fault for user address

From: Vlastimil Babka
Date: Fri Mar 03 2023 - 10:32:08 EST


On 2/20/23 19:38, Michael Roth wrote:
> +static int handle_user_rmp_page_fault(struct pt_regs *regs, unsigned long error_code,
> +				      unsigned long address)
> +{
> +	int rmp_level, level;
> +	pgd_t *pgd;
> +	pte_t *pte;
> +	u64 pfn;
> +
> +	pgd = __va(read_cr3_pa());
> +	pgd += pgd_index(address);
> +
> +	pte = lookup_address_in_pgd(pgd, address, &level);
> +
> +	/*
> +	 * It can happen if there was a race between an unmap event and
> +	 * the RMP fault delivery.
> +	 */
> +	if (!pte || !pte_present(*pte))
> +		return RMP_PF_UNMAP;
> +
> +	/*
> +	 * RMP page fault handler follows this algorithm:
> +	 * 1. Compute the pfn for the 4kb page being accessed
> +	 * 2. Read that RMP entry -- If it is assigned then kill the process
> +	 * 3. Otherwise, check the level from the host page table
> +	 *    If level=PG_LEVEL_4K then the page is already smashed
> +	 *    so just retry the instruction
> +	 * 4. If level=PG_LEVEL_2M/1G, then the host page needs to be split
> +	 */
> +
> +	pfn = pte_pfn(*pte);
> +
> +	/* If it's a large page then calculate the fault pfn */
> +	if (level > PG_LEVEL_4K)
> +		pfn = pfn | PFN_DOWN(address & (page_level_size(level) - 1));
> +
> +	/*
> +	 * If it's a guest private page, then the fault cannot be resolved.
> +	 * Send a SIGBUS to terminate the process.
> +	 *
> +	 * As documented in APM vol3 pseudo-code for RMPUPDATE, when the 2M range
> +	 * is covered by a valid (Assigned=1) 2M entry, the middle 511 4k entries
> +	 * also have Assigned=1. This means that if there is an access to a page
> +	 * which happens to lie within an Assigned 2M entry, the 4k RMP entry
> +	 * will also have Assigned=1. Therefore, the kernel should see that
> +	 * the page is not a valid page and the fault cannot be resolved.
> +	 */
> +	if (snp_lookup_rmpentry(pfn, &rmp_level)) {
> +		pr_info("Fatal RMP page fault, terminating process, entry assigned for pfn 0x%llx\n",
> +			pfn);
> +		do_sigbus(regs, error_code, address, VM_FAULT_SIGBUS);
> +		return RMP_PF_RETRY;
> +	}

WRT my reply to 12/56: here, for example, it might be useful to distinguish
the RMP entry being assigned from snp_lookup_rmpentry() returning an error?

> +
> +	/*
> +	 * The backing page level is higher than the RMP page level, request
> +	 * to split the page.
> +	 */
> +	if (level > rmp_level)
> +		return RMP_PF_SPLIT;
> +
> +	return RMP_PF_RETRY;
> +}
> +
> /*