On Wed, Apr 19, 2023 at 03:24:55PM -0300, Jason Gunthorpe wrote:
On Wed, Apr 19, 2023 at 07:23:00PM +0100, Matthew Wilcox wrote:
On Wed, Apr 19, 2023 at 07:18:26PM +0100, Lorenzo Stoakes wrote:
So even if I did the FOLL_ALLOW_BROKEN_FILE_MAPPING patch series first, I
would still need to come along and delete a bunch of your code
afterwards. And unfortunately Pavel's recent change which insists on not
having different vm_file's across VMAs for the buffer would have to be
reverted so I expect it might not be entirely without discussion.
I don't even understand why Pavel wanted to make this change. The
commit log really doesn't say.
commit edd478269640
Author: Pavel Begunkov <asml.silence@xxxxxxxxx>
Date: Wed Feb 22 14:36:48 2023 +0000
io_uring/rsrc: disallow multi-source reg buffers
If two or more mappings go back to back to each other they can be passed
into io_uring to be registered as a single registered buffer. That would
even work if mappings came from different sources, e.g. it's possible to
mix in this way anon pages and pages from shmem or hugetlb. That is not
a problem but it'd rather be less prone if we forbid such mixing.
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
It even says "That is not a problem"! So why was this patch merged
if it's not fixing a problem?
It's now standing in the way of an actual cleanup. So why don't we
revert it? There must be more to it than this ...
https://lore.kernel.org/all/61ded378-51a8-1dcb-b631-fda1903248a9@xxxxxxxxx/
So um, it's disallowed because Pavel couldn't understand why it
should be allowed? This gets less and less convincing.
FWIW, what I was suggesting was that we should have a FOLL_SINGLE_VMA
flag, which would use our shiny new VMA lock infrastructure to look
up and lock _one_ VMA instead of having the caller take the mmap_lock.
Passing that flag would be a tighter restriction that Pavel implemented,
but would certainly relieve some of his mental load.
By the way, even if all pages are from the same VMA, they may still be a
mixture of anon and file pages; think a MAP_PRIVATE of a file when
only some pages have been written to. Or an anon MAP_SHARED which is
accessible by a child process.