Re: [PATCH 04/15] mm/hmm: unregister mmu_notifier when last HMM client quit v2

From: John Hubbard
Date: Thu Mar 22 2018 - 20:13:29 EST


On 03/22/2018 04:37 PM, Jerome Glisse wrote:
> On Thu, Mar 22, 2018 at 03:47:16PM -0700, John Hubbard wrote:
>> On 03/21/2018 04:41 PM, Jerome Glisse wrote:
>>> On Wed, Mar 21, 2018 at 04:22:49PM -0700, John Hubbard wrote:
>>>> On 03/21/2018 11:16 AM, jglisse@xxxxxxxxxx wrote:
>>>>> From: JÃrÃme Glisse <jglisse@xxxxxxxxxx>

<snip>

>>>
>>> No this code is correct. hmm->mm is set after hmm struct is allocated
>>> and before it is public so no one can race with that. It is clear in
>>> hmm_mirror_unregister() under the write lock hence checking it here
>>> under that same lock is correct.
>>
>> Are you implying that code that calls hmm_mirror_register() should do
>> it's own locking, to prevent simultaneous calls to that function? Because
>> as things are right now, multiple threads can arrive at this point. The
>> fact that mirror->hmm is not "public" is irrelevant; what matters is that
>>> 1 thread can change it simultaneously.
>
> The content of struct hmm_mirror should not be modified by code outside
> HMM after hmm_mirror_register() and before hmm_mirror_unregister(). This
> is a private structure to HMM and the driver should not touch it, ie it
> should be considered as read only/const from driver code point of view.

Yes, that point is clear and obvious.

>
> It is also expected (which was obvious to me) that driver only call once
> and only once hmm_mirror_register(), and only once hmm_mirror_unregister()
> for any given hmm_mirror struct. Note that driver can register multiple
> _different_ mirror struct to same mm or differents mm.
>
> There is no need of locking on the driver side whatsoever as long as the
> above rules are respected. I am puzzle if they were not obvious :)

Those rules were not obvious. It's unusual to claim that register and unregister
can run concurrently, but regiser and register cannot. Let's please document
the rules a bit in the comments.

>
> Note that the above rule means that for any given struct hmm_mirror their
> can only be one and only one call to hmm_mirror_register() happening, no
> concurrent call. If you are doing the latter then something is seriously
> wrong in your design.
>
> So to be clear on what variable are you claiming race ?
> mirror->hmm ?
> mirror->hmm->mm which is really hmm->mm (mirror part does not matter) ?
>
> I will hold resending v4 until tomorrow morning (eastern time) so that
> you can convince yourself that this code is right or prove me wrong.

No need to wait. The documentation request above is a minor point, and
we're OK with you resending v4 whenever you're ready.

thanks,
--
John Hubbard
NVIDIA