Re: [kernel-hardening] [PATCH v5 04/10] arm64: Add __flush_tlb_one()

From: Tycho Andersen
Date: Wed Aug 23 2017 - 12:58:48 EST


Hi Mark,

On Mon, Aug 14, 2017 at 05:50:47PM +0100, Mark Rutland wrote:
> That said, is there any reason not to use flush_tlb_kernel_range()
> directly?

So it turns out that there is a difference between __flush_tlb_one() and
flush_tlb_kernel_range() on x86: flush_tlb_kernel_range() flushes the TLB on
every CPU via on_each_cpu(), whereas __flush_tlb_one() only flushes the local
TLB (which I think is enough here).
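
To make the difference concrete, here is a rough user-space toy model. It is
not kernel code: the per-CPU "TLB" arrays and the toy_* names are all made up
for illustration, and the real on_each_cpu() path involves IPIs that this
sketch only counts as "CPUs touched".

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS     4   /* hypothetical CPU count for the toy model */
#define TLB_ENTRIES 8   /* hypothetical per-CPU TLB capacity */

/* One tiny software "TLB" per CPU: a set of cached virtual addresses. */
struct toy_tlb {
	unsigned long vaddr[TLB_ENTRIES];
	bool valid[TLB_ENTRIES];
};

static struct toy_tlb tlbs[NR_CPUS];

/* Cache a translation on one CPU; returns false if that TLB is full. */
static bool toy_tlb_fill(int cpu, unsigned long vaddr)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	for (int i = 0; i < TLB_ENTRIES; i++) {
		if (!tlbs[cpu].valid[i]) {
			tlbs[cpu].vaddr[i] = vaddr;
			tlbs[cpu].valid[i] = true;
			return true;
		}
	}
	return false;
}

/* Does this CPU still hold a (possibly stale) translation for vaddr? */
static bool toy_tlb_cached(int cpu, unsigned long vaddr)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	for (int i = 0; i < TLB_ENTRIES; i++) {
		if (tlbs[cpu].valid[i] && tlbs[cpu].vaddr[i] == vaddr)
			return true;
	}
	return false;
}

/*
 * Rough analogue of __flush_tlb_one(): drop vaddr from one CPU's TLB only.
 * Returns the number of CPUs touched, which is always 1.
 */
static int toy_flush_local(int cpu, unsigned long vaddr)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	for (int i = 0; i < TLB_ENTRIES; i++) {
		if (tlbs[cpu].vaddr[i] == vaddr)
			tlbs[cpu].valid[i] = false;
	}
	return 1;
}

/*
 * Rough analogue of flush_tlb_kernel_range() for a single page: visit every
 * CPU. Returns the number of CPUs touched, i.e. NR_CPUS.
 */
static int toy_flush_all(unsigned long vaddr)
{
	int touched = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		touched += toy_flush_local(cpu, vaddr);
	return touched;
}
```

In this model the local flush leaves other CPUs' entries for the same address
in place, so it is only sufficient when no other CPU can hold a stale
translation; the all-CPU flush removes that assumption at a cost that grows
with the number of CPUs.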

As you might expect, the cross-CPU flush is quite a performance hit (at least
under kvm). I ran a little kernbench:

# __flush_tlb_one
Wed Aug 23 15:47:33 UTC 2017
4.13.0-rc5+
Average Half load -j 2 Run (std deviation):
Elapsed Time 50.3233 (1.82716)
User Time 87.1233 (1.26871)
System Time 15.36 (0.500899)
Percent CPU 203.667 (4.04145)
Context Switches 7350.33 (1339.65)
Sleeps 16008.3 (980.362)

Average Optimal load -j 4 Run (std deviation):
Elapsed Time 27.4267 (0.215019)
User Time 88.6983 (1.91501)
System Time 13.1933 (2.39488)
Percent CPU 286.333 (90.6083)
Context Switches 11393 (4509.14)
Sleeps 15764.7 (698.048)

# flush_tlb_kernel_range()
Wed Aug 23 16:00:03 UTC 2017
4.13.0-rc5+
Average Half load -j 2 Run (std deviation):
Elapsed Time 86.57 (1.06099)
User Time 103.25 (1.85475)
System Time 75.4433 (0.415852)
Percent CPU 205.667 (3.21455)
Context Switches 9363.33 (1361.57)
Sleeps 14703.3 (1439.12)

Average Optimal load -j 4 Run (std deviation):
Elapsed Time 51.27 (0.615873)
User Time 110.328 (7.93884)
System Time 74.06 (1.55788)
Percent CPU 288 (90.2197)
Context Switches 16557.5 (7930.01)
Sleeps 14774.7 (921.746)
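
For a rough sense of scale, the ratios between the two runs can be worked out
from the means above. The snippet below is just that arithmetic; the struct
and function names are made up for illustration, and it ignores the standard
deviations.

```c
#include <assert.h>

/* Mean times in seconds, copied from the kernbench output above. */
struct kb_run {
	double elapsed;
	double system;
};

static const struct kb_run local_j2  = { 50.3233, 15.36   }; /* __flush_tlb_one, -j 2 */
static const struct kb_run local_j4  = { 27.4267, 13.1933 }; /* __flush_tlb_one, -j 4 */
static const struct kb_run global_j2 = { 86.57,   75.4433 }; /* flush_tlb_kernel_range(), -j 2 */
static const struct kb_run global_j4 = { 51.27,   74.06   }; /* flush_tlb_kernel_range(), -j 4 */

/* Ratio of a measurement to its baseline; a value above 1.0 means slower. */
static double slowdown(double baseline, double measured)
{
	assert(baseline > 0.0);
	return measured / baseline;
}
```

With these numbers, elapsed time comes out roughly 1.7x (-j 2) and 1.9x (-j 4)
slower with flush_tlb_kernel_range(), and system time roughly 4.9x and 5.6x
higher.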

So, I think we need to keep something like __flush_tlb_one() around.
I'll call it flush_one_local_tlb() for now, and will cc x86@ on the
next version to see if they have any insight.

Cheers,

Tycho