Re: [PATCH 0/3] x86: Make 5-level paging support unconditional for x86-64

From: Shivank Garg
Date: Wed Jul 31 2024 - 04:58:01 EST


I ran some experiments to understand the impact of making 5-level page tables
the default.
Machine info: AMD Zen 4 EPYC server (2-socket system, 128 cores and 1 NUMA
node per socket, SMT enabled).
Each NUMA node has approximately 377 GB of memory.

For the experiments, I bound each benchmark to the CPUs and memory node of a
single socket for consistent results. 5-level page tables were enabled or
disabled using CONFIG_X86_5LEVEL.

% Change: (5L-4L)/4L*100
CoV (%): Coefficient of Variation (%)
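
For reference, a minimal sketch of how the two derived columns can be
computed. The helper names are hypothetical, and the use of the sample
standard deviation for CoV is an assumption; the exact script used for the
numbers below is not shown here.

```python
from statistics import mean, stdev


def pct_change(four_level, five_level):
    """% Change as defined above: (5L - 4L) / 4L * 100."""
    return (five_level - four_level) / four_level * 100.0


def cov_percent(samples):
    """Coefficient of variation in percent: stddev / mean * 100.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return stdev(samples) / mean(samples) * 100.0


# lat_pagefault, THP-never means from the table below
print(round(pct_change(0.4068, 0.4294), 2))  # 5.56
```

A positive % Change is a regression for "lower is better" metrics and an
improvement for "higher is better" metrics.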

Results:

lmbench:lat_pagefault: Metric - page-fault time (us) - Lower is better
             4-Level PT               5-Level PT               % Change
THP-Never    Mean:   0.4068           Mean:   0.4294           5.56
             95% CI: 0.4057-0.4078    95% CI: 0.4287-0.4302

THP-Always   Mean:   0.4061           Mean:   0.4288           5.59
             95% CI: 0.4051-0.4071    95% CI: 0.4281-0.4295


Btree (Thread:32): Metric - Time Taken (in seconds) - Lower is better
               4-Level                    5-Level
               Time Taken(s)  CoV (%)     Time Taken(s)  CoV (%)    % Change
THP Never      382.2          0.219       388.8          1.019       1.73
THP Madvise    383.0          0.261       384.8          0.809       0.47
THP Always     392.8          1.376       386.4          2.147      -1.63

Btree (Thread:256): Metric - Time Taken (in seconds) - Lower is better
               4-Level                    5-Level
               Time Taken(s)  CoV (%)     Time Taken(s)  CoV (%)    % Change
THP Never      56.6           2.014       55.2           0.810      -2.47
THP Madvise    56.6           2.014       56.4           2.022      -0.35
THP Always     56.6           0.968       56.2           1.489      -0.71


Ebizzy: Metric - records/s - Higher is better
               4-Level                    5-Level
Threads        records/s      CoV (%)     records/s      CoV (%)    % Change
1              844            0.302       837            0.196      -0.85
256            10160          0.315       10288          1.081       1.26


XSBench (Thread:256, THP:Never) - Higher is better
Metric              4-Level        5-Level        % Change
Lookups/s           13720556       13396288       -2.36
CoV (%)             1.726          1.317


Hashjoin (Thread:256, THP:Never) - Lower is better
Metric              4-Level        5-Level        % Change
Time taken(s)       424.4          427.4           0.707
CoV (%)             0.394          0.209


Graph500 (Thread:256, THP:Madvise) - Lower is better
Metric              4-Level        5-Level        % Change
Time taken(s)       0.1879         0.1873         -0.32
CoV (%)             0.165          0.213


GUPS (Thread:128, THP:Madvise) - Higher is better
Metric              4-Level        5-Level        % Change
GUPS                1.3265         1.3252         -0.10
CoV (%)             0.037          0.027


pagerank (Thread:256, THP:Madvise) - Lower is better
Metric              4-Level        5-Level        % Change
Time taken(s)       143.67         143.67          0.00
CoV (%)             0.402          0.402


Redis (Thread:256, THP:Madvise) - Higher is better
Metric              4-Level        5-Level        % Change
Throughput(Ops/s)   141030744      139586376      -1.02
CoV (%)             0.372          0.561


memcached (Thread:256, THP:Madvise) - Higher is better
Metric              4-Level        5-Level        % Change
Throughput(Ops/s)   19916313       19743637       -0.87
CoV (%)             0.051          0.095


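Inference:
5-level page tables show an increase of about 5.6% in page-fault latency
(lmbench lat_pagefault), but they do not significantly impact the other
benchmarks, which all stay within about 2.5% in either direction.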


Thanks,
Shivank