Re: [PATCH v3 2/3] userfaultfd: UFFDIO_MOVE uABI

From: Peter Xu
Date: Thu Oct 19 2023 - 16:45:06 EST


On Thu, Oct 19, 2023 at 01:02:39PM -0700, Suren Baghdasaryan wrote:
> Hi Folks,
> Sorry, I'm just catching up on all the comments in this thread after a

Not a problem.

> week-long absence. Will be addressing other questions separately but
> for cross-mm one, I think the best way forward would be for me to
> split this patch into two with the second one adding cross-mm support.
> That will clearly show how much additional code that requires and will
> make it easier for us to decide whether to support it or not.

Sounds good, thanks for that extra work.

> TBH, I don't see the need for an additional flag even if the initial
> version will be merged without cross-mm support. Once it's added the
> manpage can mention that starting with a specific Linux version
> cross-mm is supported, no?

It's about how a user app knows what the kernel supports.

On kernels that only support single-mm, UFFDIO_MOVE should fail if it finds
ctx->mm != current->mm.

I think the clearest way to tell the user app what happened is a new
feature bit, if cross-mm is going to be supported separately. Otherwise the
user app has to rely on a specific failure code from UFFDIO_MOVE, and it
only finds out when the first MOVE is triggered. Not as clear, IMHO.
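
To illustrate the feature-bit variant, the userspace check could look
something like the below. Untested sketch: the feature bit name and its
value are made up for illustration only, and "features" stands for
uffdio_api.features as reported back by the UFFDIO_API handshake.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit: name and value invented for illustration. */
#define UFFD_FEATURE_MOVE_CROSS_MM_HYPOTHETICAL (1ULL << 62)

/*
 * 'features' is the mask the kernel reported in uffdio_api.features
 * after the UFFDIO_API ioctl.  The app learns at setup time whether
 * cross-mm MOVE is available, before issuing any UFFDIO_MOVE.
 */
static bool uffd_move_cross_mm_supported(uint64_t features)
{
	return (features & UFFD_FEATURE_MOVE_CROSS_MM_HYPOTHETICAL) != 0;
}
```

Without such a bit, the app would have to issue a real UFFDIO_MOVE and
interpret its errno to learn the same thing.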

> Also from my quick read, it sounds like we want to prevent movements
> of pinned pages regardless of cross-mm support. Is my understanding
> correct?

I prefer that, but that's only my 2 cents. I just don't see how remap can
work with pin. IIUC a pin is about keeping the processor view and the DMA
view of a page coherent. If so, the VA is the only identifier of a "page"
for a user app, because the real pfn is hidden, and remap changes that VA.
So it doesn't make sense to me to remap a pinned page in whatever form.

On checking for pins: I think I mentioned earlier that it may again require
proper locking over mm.write_protect_seq like the fork() paths. Thinking
about it again, I was wrong: write_protect_seq requires the mmap write
lock, definitely not good.

We can do what David mentioned before: after ptep_clear_flush() (so the pte
is cleared) we recheck page pinning, and if the page is pinned we fail the
MOVE and put the page back. Note that we can't do that check after
installing the page into the dest pgtables, because by then someone can
already start to pin it from the dest mm.
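
The ordering above can be sketched with a toy userspace model. This is not
kernel code: the toy_* types and helper are made up, pincount stands in for
whatever pin check the kernel would use (something like
folio_maybe_dma_pinned()), and the -EBUSY / -EINVAL return values are my
assumption, not something decided in this thread.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for illustration, not kernel types. */
struct toy_page { int pincount; };
struct toy_pte  { struct toy_page *page; };	/* NULL models pte_none */

/* Move one mapping from src to dst; returns 0 or a negative errno. */
static int toy_move_pte(struct toy_pte *src, struct toy_pte *dst)
{
	struct toy_page *page = src->page;

	if (!page || dst->page)
		return -EINVAL;		/* assumed error for a bad src/dst */

	/* Models ptep_clear_flush(): the page is unreachable via src now. */
	src->page = NULL;

	/* Recheck pinning only after the src pte has been cleared. */
	if (page->pincount > 0) {
		src->page = page;	/* put the page back */
		return -EBUSY;		/* assumed error for a pinned page */
	}

	/* Install into dest last; from here on the dest mm may pin it. */
	dst->page = page;
	return 0;
}
```

The point is only the order of the three steps: clear, recheck, install.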

Thanks,

--
Peter Xu