Re: [PATCH v8 5/8] mm: Device exclusive memory access

From: Alistair Popple
Date: Wed May 19 2021 - 07:05:09 EST


On Wednesday, 19 May 2021 9:45:05 AM AEST Peter Xu wrote:
> On Tue, May 18, 2021 at 08:03:27PM -0300, Jason Gunthorpe wrote:
> > Logically during fork all these device exclusive pages should be
> > reverted back to their CPU pages, write protected and the CPU page PTE
> > copied to the fork.
> >
> > We should not copy the device exclusive page PTE to the fork. I think
> > I pointed to this on an earlier rev..
>
> Agreed. Though please see the question I posted in the other thread: now I
> am not very sure whether we'll be able to mark a page as device exclusive
> if that page has mapcount>1.
>
> > We can optimize this into the various variants above, but logically
> > device exclusive stop existing during fork.
>
> Makes sense, I think that's indeed what this patch did at least for the COW
> case, so I think Alistair did address that comment. It's just that I think
> we need to drop the other !COW case (imho that should correspond to the
> changes in copy_nonpresent_pte()) in this patch to guarantee it.

Right. The main change from v7 -> v8 was to remove device exclusive entries on
fork instead of copying them. The change in copy_nonpresent_pte() is for the
!COW case. I think what you are getting at is that, given exclusive entries are
(currently) only supported for PageAnon pages, is_cow_mapping() will always be
true and therefore the change to copy_nonpresent_pte() can be dropped. That
logic seems reasonable, so I will change the exclusive case in
copy_nonpresent_pte() to a VM_WARN_ON.
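
To make that reasoning concrete, here is a minimal userspace sketch, not the
actual patch. The flag values and the is_cow_mapping() test are copied from
include/linux/mm.h so that it builds outside the kernel.
exclusive_entry_unexpected() is a hypothetical helper standing in for the
condition I would expect the VM_WARN_ON to check, assuming exclusive entries
stay limited to private anonymous mappings.

```c
#include <stdbool.h>

/* Flag values as in include/linux/mm.h, redefined for a userspace build. */
#define VM_SHARED   0x00000008UL
#define VM_MAYWRITE 0x00000020UL

/* Same test as the kernel helper: private and potentially writable. */
static bool is_cow_mapping(unsigned long flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/*
 * Hypothetical helper (assumption, not from the patch): true when a
 * device exclusive entry turns up in a mapping that is not COW, which
 * is the case the warning would be expected to catch.
 */
static bool exclusive_entry_unexpected(unsigned long vm_flags)
{
	return !is_cow_mapping(vm_flags);
}
```

A private writable mapping (VM_MAYWRITE set, VM_SHARED clear) never trips the
check, while a shared one does, which is why the !COW branch should be
unreachable for exclusive entries today.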

> I also hope we don't make copy_pte_range() even more complicated just to do
> the lock_page() right, so we could fail the fork() if the lock is hard to
> take.

Failing fork() because we couldn't take a lock doesn't seem like the right
approach though, especially as there is already existing code that retries. I
get that this adds complexity, so I would be happy to take a look at cleaning
up copy_pte_range() in future.
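
As a rough illustration of the retry shape I mean, here is a toy userspace
model rather than the real copy_pte_range(). struct entry, try_copy_entry()
and copy_range() are all hypothetical names; the point is only that a
contended entry returns -EAGAIN and the loop resumes at the same entry
instead of failing the whole operation.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical entry: busy_attempts models how often its lock is contended. */
struct entry {
	int busy_attempts;
	int copied;
};

/* Returns -EAGAIN while the (modelled) lock is contended, 0 once copied. */
static int try_copy_entry(struct entry *e)
{
	if (e->busy_attempts > 0) {
		e->busy_attempts--;
		return -EAGAIN;
	}
	e->copied = 1;
	return 0;
}

/* Copies all entries, retrying contended ones; returns the retry count. */
static int copy_range(struct entry *entries, size_t n)
{
	size_t i = 0;
	int retries = 0;

again:
	for (; i < n; i++) {
		if (try_copy_entry(&entries[i]) == -EAGAIN) {
			/* The kernel would drop locks and reschedule here. */
			retries++;
			goto again; /* resume at the same entry */
		}
	}
	return retries;
}
```

The caller never sees a failure in this model; contention only costs extra
passes through the loop.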

> --
> Peter Xu