Re: [PATCH] docs: x86: Remove obsolete information about x86_64 vmalloc() faulting

From: Joerg Roedel
Date: Mon Jul 19 2021 - 08:34:37 EST


Hi,

On Fri, Jul 16, 2021 at 02:09:58AM -0400, Peilin Ye wrote:
> This information is out-of-date, and it took me quite some time of
> ftrace'ing before I figured it out... I think it would be beneficial to
> update, or at least remove it.
>
> As a proof that I understand what I am talking about, on my x86_64 box:
>
> 1. I allocated a vmalloc() area containing linear address `addr`;
> 2. I manually pagewalked `addr` in different page tables, including
> `init_mm.pgd`;
> 3. The corresponding PGD entries for `addr` in different page tables,
> they all immediately pointed at the same PUD table (my box uses
> 4-level paging), at the same physical address;
> 4. No "lazy synchronization" via page fault handling happened at all,
> since it is the same PUD table pre-allocated by
> preallocate_vmalloc_pages() during boot time.

Yes, this is the story for x86-64, because all PUD/P4D pages for the vmalloc
area are pre-allocated at boot. So no faulting or synchronization needs
to happen.
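
As a rough illustration of why no synchronization is needed, here is a
userspace toy model, not kernel code. All names and index ranges in it
(toy_preallocate_vmalloc, toy_pgd_ctor, VMALLOC_PGD_FIRST, ...) are
made up for the sketch. It assumes 4-level paging and that each new
page table copies the kernel top-level entries from the reference
table when it is created. Once every top-level slot covering the
vmalloc area is populated at boot, a later mapping only writes into a
lower-level table that all page tables already share:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PTRS_PER_PGD      512
#define KERNEL_PGD_START  256  /* hypothetical user/kernel split */
#define VMALLOC_PGD_FIRST 400  /* hypothetical vmalloc slots for this toy */
#define VMALLOC_PGD_LAST  403

struct pud_table { unsigned long entries[512]; };
struct pgd { struct pud_table *entries[PTRS_PER_PGD]; };

/* Stands in for init_mm.pgd, the reference page table. */
static struct pgd init_pgd;

/* Toy stand-in for boot-time pre-allocation: populate every
 * top-level slot that covers the vmalloc area, once. */
static int toy_preallocate_vmalloc(void)
{
    for (int i = VMALLOC_PGD_FIRST; i <= VMALLOC_PGD_LAST; i++) {
        init_pgd.entries[i] = calloc(1, sizeof(struct pud_table));
        if (!init_pgd.entries[i])
            return -1;
    }
    return 0;
}

/* A new page table copies the kernel half of the reference table. */
static void toy_pgd_ctor(struct pgd *pgd)
{
    memset(pgd, 0, sizeof(*pgd));
    memcpy(&pgd->entries[KERNEL_PGD_START],
           &init_pgd.entries[KERNEL_PGD_START],
           (PTRS_PER_PGD - KERNEL_PGD_START) * sizeof(pgd->entries[0]));
}

/* A vmalloc-style mapping touches only the shared lower-level table;
 * no top-level entry changes, so there is nothing to synchronize. */
static void toy_vmalloc_map(int pgd_idx, int pud_idx, unsigned long val)
{
    assert(pgd_idx >= VMALLOC_PGD_FIRST && pgd_idx <= VMALLOC_PGD_LAST);
    assert(init_pgd.entries[pgd_idx] != NULL);
    init_pgd.entries[pgd_idx]->entries[pud_idx] = val;
}

/* Walk one page table; returns 0 if the top-level slot is empty. */
static unsigned long toy_lookup(const struct pgd *pgd, int pgd_idx, int pud_idx)
{
    const struct pud_table *pud = pgd->entries[pgd_idx];
    return pud ? pud->entries[pud_idx] : 0;
}
```

A page table created before the mapping sees it right away, because
its top-level entry points at the same table as the reference one.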

On x86-32 this is a bit different. Pre-allocation of PMD/PTE pages is
not an option there (even less so when 4MB large pages with 2-level
paging come into the picture).

So what happens there is that vmalloc-related changes to the init_mm.pgd
are synchronized to all page tables in the system. But this
synchronization is subject to race conditions: another CPU might
vmalloc an area below a PMD which is not fully synchronized yet.

When this happens there is a fault, which is handled as a vmalloc()
fault on x86-32 just as before. So vmalloc faults still exist on 32-bit,
they are just less likely than they used to be.
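
The fallback path can be pictured with another userspace toy model,
again not kernel code. The names (toy_init_pgd, toy_vmalloc_fault,
toy_access) are invented for the sketch, and it assumes the handler
simply copies the missing top-level entry from the reference table.
A page table that lost the race has an empty entry, takes one fault,
and is fixed up from the reference table:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ENTRIES 1024  /* 2-level paging: 1024 top-level entries */

struct toy_pgd { unsigned long entries[TOY_ENTRIES]; };

/* Stands in for init_mm.pgd, the reference page table. */
static struct toy_pgd toy_init_pgd;

/* vmalloc populates a top-level entry in the reference table. Other
 * page tables pick it up by synchronization, which may lag behind. */
static void toy_set_init_entry(int idx, unsigned long val)
{
    assert(idx >= 0 && idx < TOY_ENTRIES);
    toy_init_pgd.entries[idx] = val;
}

/* Toy fault handler: copy the missing entry from the reference
 * table. Returns false if the reference table has no entry either,
 * which would be a genuine bad access. */
static bool toy_vmalloc_fault(struct toy_pgd *pgd, int idx)
{
    unsigned long ref = toy_init_pgd.entries[idx];

    if (ref == 0)
        return false;
    pgd->entries[idx] = ref;
    return true;
}

/* Toy memory access: returns 0 if the entry was already present,
 * 1 if one fault fixed it up, -1 if the access is invalid. */
static int toy_access(struct toy_pgd *pgd, int idx)
{
    assert(idx >= 0 && idx < TOY_ENTRIES);
    if (pgd->entries[idx] != 0)
        return 0;
    return toy_vmalloc_fault(pgd, idx) ? 1 : -1;
}
```

In this model only the first access through a stale page table pays
for a fault; later accesses find the copied entry.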

Regards,

Joerg