Re: [PATCH] mmu_notifier, kvm: Introduce dirty bit tracking in spteand mmu notifier to help KSM dirty bit tracking

From: Avi Kivity
Date: Wed Jun 22 2011 - 06:44:22 EST


On 06/21/2011 04:32 PM, Nai Xia wrote:
Introduced kvm_mmu_notifier_test_and_clear_dirty(), kvm_mmu_notifier_dirty_update()
and their mmu_notifier interfaces to support KSM dirty bit tracking, which brings
significant performance gain in volatile pages scanning in KSM.
Currently, kvm_mmu_notifier_dirty_update() returns 0 if and only if intel EPT is
enabled to indicate that the dirty bits of underlying sptes are not updated by
hardware.



Can you quantify the performance gains?

+int kvm_test_and_clear_dirty_rmapp(struct kvm *kvm, unsigned long *rmapp,
+ unsigned long data)
+{
+ u64 *spte;
+ int dirty = 0;
+
+ if (!shadow_dirty_mask) {
+ WARN(1, "KVM: do NOT try to test dirty bit in EPT\n");
+ goto out;
+ }
+
+ spte = rmap_next(kvm, rmapp, NULL);
+ while (spte) {
+ int _dirty;
+ u64 _spte = *spte;
+ BUG_ON(!(_spte& PT_PRESENT_MASK));
+ _dirty = _spte& PT_DIRTY_MASK;
+ if (_dirty) {
+ dirty = 1;
+ clear_bit(PT_DIRTY_SHIFT, (unsigned long *)spte);
+ }

Racy. Also, needs a tlb flush eventually.

+ spte = rmap_next(kvm, rmapp, spte);
+ }
+out:
+ return dirty;
+}
+
#define RMAP_RECYCLE_THRESHOLD 1000


struct mmu_notifier_ops {
+ int (*dirty_update)(struct mmu_notifier *mn,
+ struct mm_struct *mm);
+

I prefer to have test_and_clear_dirty() always return 1 in this case (if the spte is writeable), and drop this callback.
+int __mmu_notifier_dirty_update(struct mm_struct *mm)
+{
+ struct mmu_notifier *mn;
+ struct hlist_node *n;
+ int dirty_update = 0;
+
+ rcu_read_lock();
+ hlist_for_each_entry_rcu(mn, n,&mm->mmu_notifier_mm->list, hlist) {
+ if (mn->ops->dirty_update)
+ dirty_update |= mn->ops->dirty_update(mn, mm);
+ }
+ rcu_read_unlock();
+

Should it not be &= instead?

+ return dirty_update;
+}
+
/*
* This function can't run concurrently against mmu_notifier_register
* because mm->mm_users> 0 during mmu_notifier_register and exit_mmap

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/