[PATCH v3 3/3] x86/mm: If INVPCID is available, use it to flush global mappings

From: Andy Lutomirski
Date: Fri Jan 29 2016 - 14:43:18 EST


On my Skylake laptop, INVPCID function 2 (flush absolutely
everything) takes about 376ns, whereas saving flags, twiddling
CR4.PGE to flush global mappings, and restoring flags takes about
539ns.

Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
---
arch/x86/include/asm/tlbflush.h | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 8b576832777e..985f1162c505 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -152,6 +152,15 @@ static inline void __native_flush_tlb_global(void)
{
unsigned long flags;

+ if (static_cpu_has_safe(X86_FEATURE_INVPCID)) {
+ /*
+ * Using INVPCID is considerably faster than a pair of writes
+ * to CR4 sandwiched inside an IRQ flag save/restore.
+ */
+ invpcid_flush_all();
+ return;
+ }
+
/*
* Read-modify-write to CR4 - protect it from preemption and
* from interrupts. (Use the raw variant because this code can
--
2.5.0