Re: [PATCH 01/10] mm/hmm: use reference counting for HMM struct

From: John Hubbard
Date: Wed Feb 20 2019 - 19:07:04 EST


On 2/20/19 3:59 PM, Jerome Glisse wrote:
On Wed, Feb 20, 2019 at 03:47:50PM -0800, John Hubbard wrote:
On 1/29/19 8:54 AM, jglisse@xxxxxxxxxx wrote:
From: JÃrÃme Glisse <jglisse@xxxxxxxxxx>

Every time i read the code to check that the HMM structure does not
vanish before it should thanks to the many lock protecting its removal
i get a headache. Switch to reference counting instead it is much
easier to follow and harder to break. This also remove some code that
is no longer needed with refcounting.

Hi Jerome,

That is an excellent idea. Some review comments below:

[snip]

static int hmm_invalidate_range_start(struct mmu_notifier *mn,
const struct mmu_notifier_range *range)
{
struct hmm_update update;
- struct hmm *hmm = range->mm->hmm;
+ struct hmm *hmm = hmm_get(range->mm);
+ int ret;
VM_BUG_ON(!hmm);
+ /* Check if hmm_mm_destroy() was call. */
+ if (hmm->mm == NULL)
+ return 0;

Let's delete that NULL check. It can't provide true protection. If there
is a way for that to race, we need to take another look at refcounting.

I will do a patch to delete the NULL check so that it is easier for
Andrew. No need to respin.

(Did you miss my request to make hmm_get/hmm_put symmetric, though?)


Is there a need for mmgrab()/mmdrop(), to keep the mm around while HMM
is using it?

It is already the case. The hmm struct holds a reference on the mm struct
and the mirror struct holds a reference on the hmm struct hence the mirror
struct holds a reference on the mm through the hmm struct.



OK, good. Yes, I guess the __mmu_notifier_register() call in hmm_register()
should get an mm_struct reference for us.


/* FIXME support hugetlb fs */
if (is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL) ||
vma_is_dax(vma)) {
hmm_pfns_special(range);
+ hmm_put(hmm);
return -EINVAL;
}
@@ -910,6 +958,7 @@ int hmm_vma_fault(struct hmm_range *range, bool block)
* operations such has atomic access would not work.
*/
hmm_pfns_clear(range, range->pfns, range->start, range->end);
+ hmm_put(hmm);
return -EPERM;
}
@@ -945,7 +994,16 @@ int hmm_vma_fault(struct hmm_range *range, bool block)
hmm_pfns_clear(range, &range->pfns[i], hmm_vma_walk.last,
range->end);
hmm_vma_range_done(range);
+ hmm_put(hmm);
+ } else {
+ /*
+ * Transfer hmm reference to the range struct it will be drop
+ * inside the hmm_vma_range_done() function (which _must_ be
+ * call if this function return 0).
+ */
+ range->hmm = hmm;

Is that thread-safe? Is there anything preventing two or more threads from
changing range->hmm at the same time?

The range is provided by the driver and the driver should not change
the hmm field nor should it use the range struct in multiple threads.
If the driver do stupid things there is nothing i can do. Note that
this code is removed latter in the serie.

Cheers,
JÃrÃme


OK, I see. That sounds good.


thanks,
--
John Hubbard
NVIDIA