[RFC PATCH 0/4] Restore change_pte optimization to its former glory

From: jglisse
Date: Thu Jan 31 2019 - 13:37:18 EST


From: Jérôme Glisse <jglisse@xxxxxxxxxx>

This patchset is on top of my patchset adding context information to
the mmu notifier [1]; a branch with everything is available at [2]. I
have not tested it, but I wanted to get the discussion started. I
believe it is correct, but I am not sure what kind of kvm test I can
run to exercise this.

The idea is that since kvm invalidates the secondary MMUs within the
invalidate_range callback, the change_pte() optimization is lost.
With this patchset, every time core mm uses set_pte_at_notify(), and
thus change_pte() gets called, we can ignore the invalidate_range
callback altogether and rely only on the change_pte callback.
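
A minimal user-space sketch of the idea above, not the actual kvm code.
Everything prefixed toy_/TOY_ is hypothetical: TOY_USE_CHANGE_PTE is only
modeled on the MMU_NOTIFIER_USE_CHANGE_PTE flag from this series, and the
structs are stand-ins rather than the kernel's types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit modeled on the MMU_NOTIFIER_USE_CHANGE_PTE flag named in
 * this series; the real value and struct layout are not shown here. */
#define TOY_USE_CHANGE_PTE (1u << 0)

#define TOY_NPAGES 8

/* Toy secondary MMU entry and table (illustrative, not kvm's structures). */
struct toy_spte {
	bool present;
	bool writable;
	unsigned long pfn;
};

struct toy_secondary_mmu {
	struct toy_spte spte[TOY_NPAGES];
	unsigned long zapped;	/* entries dropped by invalidation */
};

/* Toy notifier range: page indexes [start, end) plus flags. */
struct toy_range {
	size_t start, end;
	unsigned flags;
};

/* Invalidation callback: skip the zap when change_pte() will follow. */
void toy_invalidate_range(struct toy_secondary_mmu *mmu,
			  const struct toy_range *range)
{
	if (range->flags & TOY_USE_CHANGE_PTE)
		return;	/* leave the entry for change_pte() to rewrite */

	for (size_t i = range->start; i < range->end && i < TOY_NPAGES; i++) {
		if (mmu->spte[i].present) {
			mmu->spte[i].present = false;
			mmu->zapped++;
		}
	}
}

/* change_pte callback: rewrite the secondary entry in place, no fault. */
void toy_change_pte(struct toy_secondary_mmu *mmu, size_t idx,
		    unsigned long pfn, bool writable)
{
	if (idx >= TOY_NPAGES)
		return;
	mmu->spte[idx] = (struct toy_spte){ .present = true,
					    .writable = writable,
					    .pfn = pfn };
}
```

With the flag set, the entry survives the invalidation and is updated by
toy_change_pte(); without it, the entry is dropped and the next access to
the page has to fault.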

Note that this is only valid when going either from a read and write
pte to a read only pte with the same pfn, or from a read only pte to a
read and write pte with a different pfn. The other side of the story
is that the primary mmu pte is cleared with ptep_clear_flush_notify()
before the call to change_pte().
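
The two valid transitions can be written down as a predicate. This is an
illustrative sketch only; struct toy_pte and toy_change_pte_only_ok() are
hypothetical names and do not exist in the kernel.

```c
#include <stdbool.h>

/* Toy view of a pte: just the page frame number and the write permission. */
struct toy_pte {
	unsigned long pfn;
	bool writable;
};

/* True when relying on change_pte() alone is valid for old -> next. */
bool toy_change_pte_only_ok(struct toy_pte old, struct toy_pte next)
{
	/* read and write -> read only, same pfn */
	bool downgrade_same_pfn = old.writable && !next.writable &&
				  old.pfn == next.pfn;
	/* read only -> read and write, different pfn */
	bool upgrade_new_pfn = !old.writable && next.writable &&
			       old.pfn != next.pfn;

	return downgrade_same_pfn || upgrade_new_pfn;
}
```

Any other transition, for example a writable pte moving to a different
pfn, falls outside the two cases and returns false here.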

Also, with the mmu notifier context information [1] you can further
optimize other cases like mprotect or write protect when forking. You
can use the new context information to infer that the invalidation is
for a read only update of the primary mmu, and update the secondary mmu
accordingly instead of clearing it and forcing a fault even for read
access. I do not know if that optimization would bear any fruit for
kvm. It does help for device drivers. You can also optimize the soft
dirty update.
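
A rough sketch of how a device driver might use the context information
for a read only update. The event names and types below are hypothetical
placeholders; the real ones come from the context patchset [1] and are
not reproduced in this letter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds standing in for the notifier context. */
enum toy_event {
	TOY_EVENT_UNMAP,		/* mapping goes away: must clear */
	TOY_EVENT_WRITE_PROTECT,	/* primary pte becomes read only */
};

/* Toy device page table entry (illustrative only). */
struct toy_dev_pte {
	bool present;
	bool writable;
};

/* Handle an invalidation of entries [start, end) in a table of n entries. */
void toy_dev_invalidate(struct toy_dev_pte *tbl, size_t n,
			size_t start, size_t end, enum toy_event event)
{
	for (size_t i = start; i < end && i < n; i++) {
		if (event == TOY_EVENT_WRITE_PROTECT) {
			/* Keep the mapping so reads do not fault; only
			 * writes will fault from now on. */
			tbl[i].writable = false;
		} else {
			/* No usable hint: clear the entry entirely. */
			tbl[i].present = false;
			tbl[i].writable = false;
		}
	}
}
```

The design point is that a write protect event only removes write
permission, so read access from the device keeps working without a fault.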

Cheers,
Jérôme


[1] https://lore.kernel.org/linux-fsdevel/20190123222315.1122-1-jglisse@xxxxxxxxxx/T/#m69e8f589240e18acbf196a1c8aa1d6fc97bd3565
[2] https://cgit.freedesktop.org/~glisse/linux/log/?h=kvm-restore-change_pte

Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: Peter Xu <peterx@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: kvm@xxxxxxxxxxxxxxx

Jérôme Glisse (4):
  uprobes: use set_pte_at() not set_pte_at_notify()
  mm/mmu_notifier: use unsigned for event field in range struct
  mm/mmu_notifier: set MMU_NOTIFIER_USE_CHANGE_PTE flag where
    appropriate
  kvm/mmu_notifier: re-enable the change_pte() optimization.

 include/linux/mmu_notifier.h | 21 +++++++++++++++++++--
 kernel/events/uprobes.c      |  3 +--
 mm/ksm.c                     |  6 ++++--
 mm/memory.c                  |  3 ++-
 virt/kvm/kvm_main.c          | 16 ++++++++++++++++
 5 files changed, 42 insertions(+), 7 deletions(-)

--
2.17.1