Re: [PATCH 0/3] x86: Make 5-level paging support unconditional for x86-64

From: Kirill A. Shutemov
Date: Wed Jul 31 2024 - 07:37:03 EST


On Wed, Jul 31, 2024 at 11:15:05AM +0200, Thomas Gleixner wrote:
> On Wed, Jul 31 2024 at 14:27, Shivank Garg wrote:
> > lmbench:lat_pagefault: Metric - page-fault time (us) - Lower is better
> >             4-Level PT             5-Level PT             % Change
> > THP-never   Mean:   0.4068         Mean:   0.4294         5.56
> >             95% CI: 0.4057-0.4078  95% CI: 0.4287-0.4302
> >
> > THP-always  Mean:   0.4061         Mean:   0.4288         5.59
> >             95% CI: 0.4051-0.4071  95% CI: 0.4281-0.4295
> >
> > Inference:
> > 5-level page table shows increase in page-fault latency but it does
> > not significantly impact other benchmarks.
>
> 5% regression on lmbench is a NONO.

Yeah, that's a biggy.
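
As a sanity check on the quoted numbers, the "% Change" column follows
directly from the two means. A minimal sketch (pct_change() is a
hypothetical helper written for this mail, not part of lmbench):

```c
#include <assert.h>

/*
 * Relative change of 'measured' against 'baseline', in percent.
 * Hypothetical helper for illustration only.
 */
static double pct_change(double baseline, double measured)
{
	assert(baseline > 0.0);	/* a latency baseline must be positive */
	return (measured - baseline) / baseline * 100.0;
}
```

Feeding it the reported means (0.4068 vs 0.4294 for THP-never, 0.4061 vs
0.4288 for THP-always) gives roughly 5.56 and 5.59. The 95% confidence
intervals in the table do not overlap either, so this does not look like
noise.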

In our testing (on Intel HW) we didn't see any significant difference
between 4- and 5-level paging. But we were focused on TLB fill latency,
both on bare metal and in VMs. Maybe something is wrong in the fault path?

It requires a closer look.

Shivank, could you share how you run lat_pagefault? What file size? How
parallel do you run it?

It would also be nice to get perf traces. Maybe it is purely a SW issue.

> 5-level page tables add a cost in every hardware page table walk. That's
> a matter of fact and there is absolutely no reason to inflict this cost
> on everyone.
>
> The solution to this is to make the 5-level mechanics smarter by evaluating
> whether the machine has enough memory to require 5-level tables and
> select the depth at boot time.

Let's understand the reason first.

The risk with your proposal is that 5-level paging will not get any
testing and rot over time.

I would like to keep it on, if possible.
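
For concreteness, the boot-time selection suggested above might look
roughly like the sketch below. This is only an illustration under stated
assumptions, not proposed kernel code: select_paging_depth() and its
parameters are hypothetical names, the 46-bit (64 TiB) figure is the
physical address limit Linux has with 4-level paging on x86-64, and
cpu_has_la57 stands in for the CPUID LA57 feature check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4-level paging on Linux x86-64 covers 46 physical bits (64 TiB). */
#define PHYS_BITS_4LEVEL 46

static_assert(PHYS_BITS_4LEVEL < 64, "shift must stay within uint64_t");

enum paging_depth { PAGING_4LEVEL = 4, PAGING_5LEVEL = 5 };

/*
 * Hypothetical helper: pick the page table depth at boot.
 * max_phys_addr is the exclusive end of the highest usable physical address.
 */
static enum paging_depth select_paging_depth(uint64_t max_phys_addr,
					     bool cpu_has_la57)
{
	const uint64_t limit_4level = UINT64_C(1) << PHYS_BITS_4LEVEL;

	/* Without LA57 support there is nothing to choose. */
	if (!cpu_has_la57)
		return PAGING_4LEVEL;

	/* Only pay for the extra level when 4-level cannot map all memory. */
	return max_phys_addr > limit_4level ? PAGING_5LEVEL : PAGING_4LEVEL;
}
```

With a check like this, any machine with 64 TiB of memory or less ends up
on 4-level paging, which is exactly why the 5-level path would see very
little testing.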

--
Kiryl Shutsemau / Kirill A. Shutemov