Re: [PATCH 1/5] mm: Introduce mm_struct.has_pinned

From: Peter Xu
Date: Thu Sep 24 2020 - 14:34:28 EST


On Thu, Sep 24, 2020 at 03:15:01PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 24, 2020 at 01:55:31PM -0400, Peter Xu wrote:
> > On Thu, Sep 24, 2020 at 01:51:52PM -0300, Jason Gunthorpe wrote:
> > > > Regarding the solution here, I think we can also cover read-only fast-gup too
> > > > in the future - IIUC what we need to do is to make it pte_protnone() instead of
> > > > pte_wrprotect(), then in the fault handler we should identify this special
> > > > pte_protnone() against numa balancing (change_prot_numa()). I think it should
> > > > work fine too, iiuc, because I don't think we should migrate a page at all if
> > > > it's pinned for any reason...
> >
> > [1]
> >
> > >
> > > With your COW breaking patch the read only fast-gup should break the
> > > COW because of the write protect, just like for the write side. Not
> > > seeing why we need to do something more?
> >
> > Consider this sequence, in which a parent process fork()s a child:
> >
> > buf = malloc();

Sorry! I think I missed a line here, something like:

mprotect(buf, !WRITE);

> > // RDONLY gup
> > pin_user_pages(buf, !WRITE);
> > // pte of buf duplicated on both sides
> > fork();
> > mprotect(buf, WRITE);
> > *buf = 1;
> > // buf page replaced as cow triggered
> >
> > Currently, on fork() we'll happily share a pinned read-only page with the
> > child by copying the pte directly.
>
> Why? This series prevents that, the page will be maybe_dma_pinned, so
> fork() will copy it.

With the extra mprotect(buf, !WRITE), I think we'll see a !pte_write() entry.
Then fork() won't reach the maybe_dma_pinned() check at all, since cow==false.
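
A toy C model of the check described above may make the concern easier to see.
It is only a sketch built from the wording of this mail, not the patch code:
every name in it (toy_pte, toy_fork_one_pte, the TOY_* values) is hypothetical,
and it assumes the pin check is only consulted when the entry is both in a COW
mapping and writable.

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's real helpers. */
struct toy_pte {
    bool writable;         /* stands in for pte_write() */
    bool maybe_dma_pinned; /* stands in for the pin check on the page */
};

enum toy_fork_action {
    TOY_SHARE_PTE,       /* copy the pte; parent and child share the page */
    TOY_COPY_PAGE_EARLY  /* give the child its own copy at fork() time */
};

/* Decide what fork() does with one pte in this simplified model. */
static enum toy_fork_action toy_fork_one_pte(bool is_cow_mapping,
                                             struct toy_pte pte)
{
    /* Assumption: the pin check sits behind the cow condition. */
    bool cow = is_cow_mapping && pte.writable;

    if (cow && pte.maybe_dma_pinned)
        return TOY_COPY_PAGE_EARLY;

    /* A write-protected entry lands here even if the page is pinned. */
    return TOY_SHARE_PTE;
}
```

In this model a pinned page behind a write-protected entry is still shared with
the child, which is the case the extra mprotect(buf, !WRITE) sets up.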

>
> > As a summary: imho the important thing is we should not allow any kind of
> > sharing of any dma page, even if it's pinned for read.
>
> Any sharing that results in COW. MAP_SHARED is fine, for instance

Oh right, MAP_SHARED is definitely special.

Thanks,

--
Peter Xu