To dampen the tradeoff, we could do this in shmem_fault() instead? But
then we would end up doing this in every kind of vma->vm_ops->fault,
and only once we discover yet another reference count race condition :)
Doing this in do_fault() should solve this once and for all. In fact,
do_pte_missing() may call do_anonymous_page() or do_fault(), and I just
noticed that the former already checks this using vmf_pte_changed().
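
To make the re-check idea concrete, here is a toy userspace model of the
pattern: sample the PTE, do the expensive work, then re-check the PTE
before installing anything and back off if someone else got there first.
This is only a sketch under assumptions: every toy_* name is made up for
illustration, the page table lock is represented by comments only, and
none of this is the actual mm/memory.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: all toy_* names are hypothetical, not kernel APIs. */
typedef uint64_t toy_pte_t;
#define TOY_PTE_NONE ((toy_pte_t)0)

enum toy_fault_result {
	TOY_FAULT_INSTALLED,	/* we won the race and wrote the new PTE */
	TOY_FAULT_RACED,	/* the PTE changed under us; nothing written */
};

struct toy_vmf {
	toy_pte_t *pte;		/* slot in the pretend page table */
	toy_pte_t orig_pte;	/* value sampled when the fault started */
};

/* True if the slot no longer holds the value sampled at fault time. */
static bool toy_pte_changed(const struct toy_vmf *vmf)
{
	return *vmf->pte != vmf->orig_pte;
}

/*
 * Install new_pte only if nobody modified the slot since it was sampled.
 * In a real fault handler the check and the store would both happen
 * with the page table lock held; the lock is omitted in this model.
 */
static enum toy_fault_result toy_finish_fault(struct toy_vmf *vmf,
					      toy_pte_t new_pte)
{
	assert(vmf && vmf->pte);

	/* (page table lock would be taken here) */
	if (toy_pte_changed(vmf))
		return TOY_FAULT_RACED;	/* caller drops its page reference */

	*vmf->pte = new_pte;
	/* (page table lock would be released here) */
	return TOY_FAULT_INSTALLED;
}
```

In this model, a caller that gets TOY_FAULT_RACED would release whatever
it looked up and return without touching the slot, which is the
behaviour being argued for in do_fault().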

> What I am still missing is why this is (a) arm64 only; and (b) if this
> is something we should really worry about. There are other reasons
> (e.g., speculative references) why migration could temporarily fail;
> does it happen so often that it is really something we have to worry
> about?

(a) See discussion at [1]; I guess it passes on x86, which is quite
strange since the race is clearly arch-independent.
(b) On my machine, move_pages() fails within 10 iterations on average,
which seems problematic to