Re: [PATCH v7 5/8] mm: Device exclusive memory access

From: Alistair Popple
Date: Wed Mar 31 2021 - 22:21:36 EST


On Thursday, 1 April 2021 11:48:13 AM AEDT Jason Gunthorpe wrote:
> On Thu, Apr 01, 2021 at 11:45:57AM +1100, Alistair Popple wrote:
> > On Thursday, 1 April 2021 12:46:04 AM AEDT Jason Gunthorpe wrote:
> > > On Thu, Apr 01, 2021 at 12:27:52AM +1100, Alistair Popple wrote:
> > > > On Thursday, 1 April 2021 12:18:54 AM AEDT Jason Gunthorpe wrote:
> > > > > On Wed, Mar 31, 2021 at 11:59:28PM +1100, Alistair Popple wrote:
> > > > >
> > > > > > I guess that makes sense as the split could go either way at the
> > > > > > moment but I should add a check to make sure this isn't used with
> > > > > > pinned pages anyway.
> > > > >
> > > > > Is it possible to have a pinned page under one of these things? If I
> > > > > pin it before you migrate it then it remains pinned but hidden under
> > > > > the swap entry?
> > > >
> > > > At the moment yes. But I had planned (and this reminded me) to add a
> > > > check to prevent marking pinned pages for exclusive access.
> > >
> > > How do you even do that without races with GUP fast?
> >
> > Unless I've missed something I think I've convinced myself it should be
> > safe to do the pin check after make_device_exclusive() has replaced all
> > the PTEs with exclusive entries.
> >
> > GUP fast sequence:
> > 1. Read PTE
> > 2. Pin page
> > 3. Check PTE
> > 4. if PTE changed -> unpin and fallback
> >
> > If make_device_exclusive() runs after (1) it will either succeed or see
> > the pin from (2) and fail (as desired). GUP should always see the PTE
> > change and fallback which will revoke the exclusive access.
>
> AFAICT the user can trigger fork at that instant and fork will try to
> copy the deposited migration entry before it has been checked

In that case the child will get a read-only exclusive entry and eventually a
page copy via do_wp_page(). GUP will fall back (or fail in the fast-only case),
so the parent's exclusive entry will be removed before the page can be pinned,
and therefore the split shouldn't go the wrong way.

But that is sounding rather complex, and I am not convinced I haven't missed a
corner case. It also seems like it shouldn't be necessary to copy exclusive
entries anyway. I could just remove them and restore the original entry, which
would be far simpler.
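
To make the ordering argument for the non-fork case concrete, here is a rough
userspace model of the GUP fast sequence quoted above racing against a pin
check done after the exclusive entry is installed. This is only a sketch, not
kernel code: every name in it (model_pte, model_page, gup_fast_read(),
gup_fast_pin(), gup_fast_recheck(), model_make_exclusive()) is made up for
illustration. It also assumes a failed attempt simply restores the original
PTE, and it does not model fork or the slow-path fallback that revokes
exclusive access.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum pte_kind { PTE_PRESENT, PTE_EXCLUSIVE };

struct model_page {
	atomic_int pincount;		/* stands in for the page pin count */
};

struct model_pte {
	_Atomic int kind;		/* holds an enum pte_kind value */
	struct model_page *page;
};

static void model_init(struct model_pte *pte, struct model_page *page)
{
	atomic_init(&page->pincount, 0);
	atomic_init(&pte->kind, PTE_PRESENT);
	pte->page = page;
}

/* GUP fast step 1: read the PTE and remember what was seen. */
static int gup_fast_read(struct model_pte *pte)
{
	return atomic_load(&pte->kind);
}

/* GUP fast step 2: pin the page. */
static void gup_fast_pin(struct model_pte *pte)
{
	atomic_fetch_add(&pte->page->pincount, 1);
}

/*
 * GUP fast steps 3 and 4: re-check the PTE. If it changed since step 1,
 * drop the pin and return false so the caller falls back.
 */
static bool gup_fast_recheck(struct model_pte *pte, int snap)
{
	if (atomic_load(&pte->kind) == snap)
		return true;

	int old = atomic_fetch_sub(&pte->page->pincount, 1);
	assert(old > 0);	/* we must be dropping a pin we took */
	return false;
}

/*
 * Model of the proposed check: install the exclusive entry first, then
 * look for pins. Restoring the PTE on failure is an assumption of this
 * sketch, not something taken from the patch.
 */
static bool model_make_exclusive(struct model_pte *pte)
{
	atomic_store(&pte->kind, PTE_EXCLUSIVE);
	if (atomic_load(&pte->page->pincount) > 0) {
		atomic_store(&pte->kind, PTE_PRESENT);
		return false;
	}
	return true;
}
```

Running model_make_exclusive() between gup_fast_read() and gup_fast_pin() lets
the exclusive entry win and makes gup_fast_recheck() drop the pin. Running it
after gup_fast_pin() makes it see the pin and fail instead. In neither
interleaving does the model end up with both a kept pin and a live exclusive
entry.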

> Jason
>