Re: [RFC PATCH 0/7] mm: Get rid of vmalloc_sync_(un)mappings()

From: Joerg Roedel
Date: Sun May 10 2020 - 04:15:26 EST


On Sat, May 09, 2020 at 10:05:43PM -0700, Andy Lutomirski wrote:
> On Sat, May 9, 2020 at 2:57 PM Joerg Roedel <joro@xxxxxxxxxx> wrote:
> I spent some time looking at the code, and I'm guessing you're talking
> about the 3-level !SHARED_KERNEL_PMD case. I can't quite figure out
> what's going on.
>
> Can you explain what is actually going on that causes different
> mms/pgds to have top-level entries in the kernel range that point to
> different tables? Because I'm not seeing why this makes any sense.

There are three cases where the PMDs are not shared on x86-32:

1) With non-PAE the top level is already the PMD level, because
the page-table only has two levels. Since the top level can't
be shared, the PMDs are also not shared.

2) For some reason Xen-PV also does not share kernel PMDs on PAE
systems, but I haven't looked into why.

3) On 32-bit PAE with PTI enabled the kernel address space
contains the LDT mapping, which is also different per-PGD.
There is one PMD entry reserved for the LDT, giving it 2MB of
address space. I implemented it this way to keep the 32-bit
implementation of PTI mostly similar to the 64-bit one.
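
To make the 2MB figure concrete, here is a minimal userspace sketch of
the PAE arithmetic. It is not kernel code; the EX_* names and helper
functions are made up for illustration, and only the shift values are
taken from the PAE layout (4KB pages, 512 entries per PMD page).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values follow the 32-bit PAE layout. */
#define EX_PAE_PMD_SHIFT     21    /* one PMD entry maps 2^21 bytes */
#define EX_PAE_PTRS_PER_PMD  512u  /* entries in one PMD page */

/* Bytes of address space covered by a single PMD entry. */
uint64_t ex_pae_pmd_size(void)
{
	return 1ULL << EX_PAE_PMD_SHIFT;
}

/* Bytes covered by one top-level entry, i.e. by one full PMD page. */
uint64_t ex_pae_top_size(void)
{
	return ex_pae_pmd_size() * EX_PAE_PTRS_PER_PMD;
}

/* Slot inside its PMD page that a given address falls into. */
unsigned ex_pae_pmd_index(uint32_t addr)
{
	return (addr >> EX_PAE_PMD_SHIFT) & (EX_PAE_PTRS_PER_PMD - 1);
}

/* Small self-check that doubles as a usage example. */
void ex_check_sizes(void)
{
	assert(ex_pae_pmd_size() == 2u * 1024 * 1024);  /* 2MB per PMD entry */
	assert(ex_pae_top_size() == 1ULL << 30);        /* 1GB per top-level entry */
}
```

So reserving a single PMD entry hands the LDT exactly one 2MB slot out
of the 512 slots in that PMD page.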

> Why does it need to be partitioned at all? The only thing that comes
> to mind is that the LDT range is per-mm. So I can imagine that the
> PAE case with a 3G user / 1G kernel split has to have the vmalloc
> range and the LDT range in the same top-level entry. Yuck.

PAE with 3G user / 1G kernel has _all_ of the kernel mappings in one
top-level entry (direct-mapping, vmalloc, ldt, fixmap).
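
A rough userspace sketch of why that is, assuming the usual 0xC0000000
split point. The EX_* names and helpers are hypothetical and only show
the index arithmetic, not how the kernel implements it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed start of kernel space for a 3G user / 1G kernel split. */
#define EX_PAGE_OFFSET 0xC0000000u

/* PAE: 4 top-level entries, selected by address bits 31:30 (1GB each). */
unsigned ex_pae_top_index(uint32_t addr)
{
	return addr >> 30;
}

/* Non-PAE: 1024 top-level entries, selected by bits 31:22 (4MB each). */
unsigned ex_nonpae_top_index(uint32_t addr)
{
	return addr >> 22;
}

/* Small self-check that doubles as a usage example. */
void ex_check_split(void)
{
	/* PAE: first and last kernel address land in the same entry. */
	assert(ex_pae_top_index(EX_PAGE_OFFSET) == 3);
	assert(ex_pae_top_index(0xFFFFFFFFu) == 3);

	/* Non-PAE: the kernel range spans top-level entries 768..1023. */
	assert(ex_nonpae_top_index(EX_PAGE_OFFSET) == 768);
	assert(ex_nonpae_top_index(0xFFFFFFFFu) == 1023);
}
```

With PAE everything in the kernel range hangs off top-level entry 3, so
anything per-mm in that range forces a per-mm PMD page below it.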

> If it's *just* the LDT that's a problem, we could plausibly shrink the
> user address range a little bit and put the LDT in the user portion.
> I suppose this could end up creating its own set of problems involving
> tracking which code owns which page tables.

Yeah, for the PTI case it is only the LDT that causes the unshared
kernel PMDs, but even if we move the LDT somewhere else we still have
two-level paging and the Xen-PV case.


Joerg