Re: [PATCH 04/15] mm/hmm: unregister mmu_notifier when last HMM client quit v2

From: John Hubbard
Date: Thu Mar 22 2018 - 20:56:25 EST


On 03/22/2018 05:50 PM, Jerome Glisse wrote:
> On Thu, Mar 22, 2018 at 05:13:14PM -0700, John Hubbard wrote:
>> On 03/22/2018 04:37 PM, Jerome Glisse wrote:
>>> On Thu, Mar 22, 2018 at 03:47:16PM -0700, John Hubbard wrote:
>>>> On 03/21/2018 04:41 PM, Jerome Glisse wrote:
>>>>> On Wed, Mar 21, 2018 at 04:22:49PM -0700, John Hubbard wrote:
>>>>>> On 03/21/2018 11:16 AM, jglisse@xxxxxxxxxx wrote:
>>>>>>> From: JÃrÃme Glisse <jglisse@xxxxxxxxxx>
>>
>> <snip>
>>
>>>>>
>>>>> No this code is correct. hmm->mm is set after hmm struct is allocated
>>>>> and before it is public so no one can race with that. It is clear in
>>>>> hmm_mirror_unregister() under the write lock hence checking it here
>>>>> under that same lock is correct.
>>>>
>>>> Are you implying that code that calls hmm_mirror_register() should do
>>>> it's own locking, to prevent simultaneous calls to that function? Because
>>>> as things are right now, multiple threads can arrive at this point. The
>>>> fact that mirror->hmm is not "public" is irrelevant; what matters is that
>>>>> 1 thread can change it simultaneously.
>>>
>>> The content of struct hmm_mirror should not be modified by code outside
>>> HMM after hmm_mirror_register() and before hmm_mirror_unregister(). This
>>> is a private structure to HMM and the driver should not touch it, ie it
>>> should be considered as read only/const from driver code point of view.
>>
>> Yes, that point is clear and obvious.
>>
>>>
>>> It is also expected (which was obvious to me) that driver only call once
>>> and only once hmm_mirror_register(), and only once hmm_mirror_unregister()
>>> for any given hmm_mirror struct. Note that driver can register multiple
>>> _different_ mirror struct to same mm or differents mm.
>>>
>>> There is no need of locking on the driver side whatsoever as long as the
>>> above rules are respected. I am puzzle if they were not obvious :)
>>
>> Those rules were not obvious. It's unusual to claim that register and unregister
>> can run concurrently, but regiser and register cannot. Let's please document
>> the rules a bit in the comments.
>
> I am really surprise this was not obvious. All existing _register API
> in the kernel follow this. You register something once only and doing
> it twice for same structure (ie unique struct hmm_mirror *mirror pointer
> value) leads to serious bugs (doing so concurently or not).
>
> For instance if you call mmu_notifier_register() twice (concurrently
> or not) with same pointer value for struct mmu_notifier *mn then bad
> thing will happen. Same for driver_register() but this one actualy
> have sanity check and complain loudly if that happens. I doubt there
> is any single *_register/unregister() in the kernel that does not
> follow this.

OK, then I guess no need to document it. In any case we know what to
expect here, so no problem there.

thanks,
--
John Hubbard
NVIDIA

>
> Note that doing register/unregister concurrently for the same unique
> hmm_mirror struct is also illegal. However concurrent register and
> unregister of different hmm_mirror struct is legal and this is the
> reasons for races we were discussing.
>
> Cheers,
> JÃrÃme
>