Re: [PATCH v5 1/2] mm: make dev_pagemap_mapping_shift() externally visible

From: Sean Christopherson
Date: Tue Dec 17 2019 - 19:18:59 EST


On Mon, Dec 16, 2019 at 12:59:53PM -0500, Barret Rhoden wrote:
> On 12/13/19 12:47 PM, Sean Christopherson wrote:
> >>+EXPORT_SYMBOL_GPL(dev_pagemap_mapping_shift);
> >
> >This is basically a rehash of lookup_address_in_pgd(), and doesn't provide
> >exactly what KVM needs. E.g. KVM works with levels instead of shifts, and
> >it would be nice to provide the pte so that KVM can sanity check that the
> >pfn from this walk matches the pfn it plans on mapping.
>
> One minor issue is that the levels for lookup_address_in_pgd() and for KVM
> differ in name, although not in value. lookup uses PG_LEVEL_4K = 1. KVM
> uses PT_PAGE_TABLE_LEVEL = 1. The enums differ a little too: x86 has a name
> for a 512G page, etc. It's all in arch/x86.
>
> Does KVM-x86 need its own names for the levels? If not, I could convert the
> PT_PAGE_TABLE_* stuff to PG_LEVEL_* stuff.

Not really. I suspect the whole reason KVM has different enums is to
handle PSE/Mode-B paging, where PG_LEVEL_2M would be inaccurate, e.g.:

if (PTTYPE == 32 && walker->level == PT_DIRECTORY_LEVEL && is_cpuid_PSE36())
gfn += pse36_gfn_delta(pte);

That being said, I absolute loathe PT_PAGE_TABLE_LEVEL, I can never
remember that it means 4k pages. I would be in favor of using the kernel's
enums with some KVM-specific abstraction of PG_LEVEL_2M, e.g.

/* KVM Hugepage definitions for x86 */
enum {
PG_LEVEL_2M_OR_4M = PG_LEVEL_2M,
/* set max level to the biggest one */
KVM_MAX_HUGEPAGE_LEVEL = PG_LEVEL_1G,
};

And ideally restrict usage of the ugly PG_LEVEL_2M_OR_4M to flows that can
actually encounter 4M pages, i.e. when walking guest page tables. On the
host side, KVM always uses PAE or 64-bit paging.

Personally, I'd pursue that in a separate patch/series, it'll touch a
massive amount of code and will probably get bikeshedded a fair amount.