Re: [PATCH 04/15] mm/hmm: unregister mmu_notifier when last HMM client quit v2
From: Jerome Glisse
Date: Thu Mar 22 2018 - 20:50:25 EST
On Thu, Mar 22, 2018 at 05:13:14PM -0700, John Hubbard wrote:
> On 03/22/2018 04:37 PM, Jerome Glisse wrote:
> > On Thu, Mar 22, 2018 at 03:47:16PM -0700, John Hubbard wrote:
> >> On 03/21/2018 04:41 PM, Jerome Glisse wrote:
> >>> On Wed, Mar 21, 2018 at 04:22:49PM -0700, John Hubbard wrote:
> >>>> On 03/21/2018 11:16 AM, jglisse@xxxxxxxxxx wrote:
> >>>>> From: Jérôme Glisse <jglisse@xxxxxxxxxx>
>
> <snip>
>
> >>>
> >>> No this code is correct. hmm->mm is set after hmm struct is allocated
> >>> and before it is public so no one can race with that. It is clear in
> >>> hmm_mirror_unregister() under the write lock hence checking it here
> >>> under that same lock is correct.
> >>
> >> Are you implying that code that calls hmm_mirror_register() should do
> >> it's own locking, to prevent simultaneous calls to that function? Because
> >> as things are right now, multiple threads can arrive at this point. The
> >> fact that mirror->hmm is not "public" is irrelevant; what matters is that
> >>> 1 thread can change it simultaneously.
> >
> > The content of struct hmm_mirror should not be modified by code outside
> > HMM after hmm_mirror_register() and before hmm_mirror_unregister(). This
> > is a private structure to HMM and the driver should not touch it, ie it
> > should be considered as read only/const from driver code point of view.
>
> Yes, that point is clear and obvious.
>
> >
> > It is also expected (which was obvious to me) that driver only call once
> > and only once hmm_mirror_register(), and only once hmm_mirror_unregister()
> > for any given hmm_mirror struct. Note that driver can register multiple
> > _different_ mirror struct to same mm or differents mm.
> >
> > There is no need of locking on the driver side whatsoever as long as the
> > above rules are respected. I am puzzle if they were not obvious :)
>
> Those rules were not obvious. It's unusual to claim that register and unregister
> can run concurrently, but regiser and register cannot. Let's please document
> the rules a bit in the comments.
I am really surprise this was not obvious. All existing _register API
in the kernel follow this. You register something once only and doing
it twice for same structure (ie unique struct hmm_mirror *mirror pointer
value) leads to serious bugs (doing so concurently or not).
For instance if you call mmu_notifier_register() twice (concurrently
or not) with same pointer value for struct mmu_notifier *mn then bad
thing will happen. Same for driver_register() but this one actualy
have sanity check and complain loudly if that happens. I doubt there
is any single *_register/unregister() in the kernel that does not
follow this.
Note that doing register/unregister concurrently for the same unique
hmm_mirror struct is also illegal. However concurrent register and
unregister of different hmm_mirror struct is legal and this is the
reasons for races we were discussing.
Cheers,
Jérôme