Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment

From: John Hubbard
Date: Sat Oct 29 2022 - 16:42:36 EST


On 10/29/22 13:30, Linus Torvalds wrote:
>> I can think of three options:
>>
>> (a) filesystems just deal with it
>>
>> (b) we could move the "page_remove_rmap()" into the "flush-and-free" path too
>>
>> (c) we could actually add a spinlock (hashed on the page?) for this
>>
>> I think (a) is basically our current expectation.
>
> Side note: anybody doing gup + set_page_dirty() won't be fixed by b/c
> anyway, so I think (a) is basically the only thing.
>
> And that's true even if you do a page pinning gup, since the source of
> the gup may be actively unmapped after the gup.

I was just now writing a response that favored (c) over (b), precisely
because of that, yes. :)

>
> So a filesystem that thinks that only write, or a rmap-accessible mmap
> can turn the page dirty really seems to be fundamentally broken.
>
> And I think that has always been the case, it's just that filesystem
> writers may not have been happy with it, and may not have had
> test-cases for it.
>
> It's not surprising that the filesystem people then try to blame users.
>
> Linus

Yes, lots of unhappy debates about this over the years.

However, I remain intrigued by (c), because if we had a "dirty page lock"
that is looked up by page (much like looking up the ptl), it seems like
a building block that would potentially help solve the whole thing.

The points above, that the file system needs to coordinate with mm about
what's allowed to be dirtied, including the gup/dma cases, are still
true and not yet solved, yes. But having a solid point of synchronization
for this definitely looks interesting.
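
To make the shape of (c) a bit more concrete, here is a rough userspace
sketch of a lock that is looked up by hashing the page pointer. Nothing
here is existing kernel API: struct page is a two-field stand-in, the
names dirty_lock_for(), page_mark_dirty() and page_unmap() are made up
for illustration, the table size and hash constant are arbitrary, and a
pthread mutex stands in for a spinlock. It only shows that "unmap" and
"dirty" can be ordered against each other through one per-page lookup,
not how a real implementation would have to look.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table size: 2^6 = 64 locks shared by all pages. */
#define DIRTY_LOCK_BITS  6
#define DIRTY_LOCK_COUNT (1u << DIRTY_LOCK_BITS)

static_assert(DIRTY_LOCK_BITS > 0 && DIRTY_LOCK_BITS < 64,
	      "shift in dirty_lock_index() must stay in range");

/* Toy stand-in for the kernel's struct page. */
struct page {
	bool dirty;
	bool mapped;
};

static pthread_mutex_t dirty_locks[DIRTY_LOCK_COUNT];
static pthread_once_t dirty_locks_once = PTHREAD_ONCE_INIT;

static void dirty_locks_init(void)
{
	for (size_t i = 0; i < DIRTY_LOCK_COUNT; i++)
		pthread_mutex_init(&dirty_locks[i], NULL);
}

/* Multiplicative hash of the page address; the top bits pick a slot. */
static size_t dirty_lock_index(const struct page *page)
{
	uint64_t h = (uint64_t)(uintptr_t)page * UINT64_C(0x9E3779B97F4A7C15);

	return (size_t)(h >> (64 - DIRTY_LOCK_BITS));
}

/* Look up the lock by page; the same page always yields the same lock. */
static pthread_mutex_t *dirty_lock_for(const struct page *page)
{
	pthread_once(&dirty_locks_once, dirty_locks_init);
	return &dirty_locks[dirty_lock_index(page)];
}

/* Dirty the page only if it is still mapped; returns true if it did. */
static bool page_mark_dirty(struct page *page)
{
	pthread_mutex_t *lock = dirty_lock_for(page);
	bool done = false;

	pthread_mutex_lock(lock);
	if (page->mapped) {
		page->dirty = true;
		done = true;
	}
	pthread_mutex_unlock(lock);
	return done;
}

/* Drop the mapping under the same lock, so it cannot race with dirtying. */
static void page_unmap(struct page *page)
{
	pthread_mutex_t *lock = dirty_lock_for(page);

	pthread_mutex_lock(lock);
	page->mapped = false;
	pthread_mutex_unlock(lock);
}
```

Hashing keeps the cost at a fixed-size table rather than a lock in every
struct page, at the price of unrelated pages occasionally contending on
the same slot.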

Of course, without working through this more thoroughly, it's not fair
to impose this constraint on the current discussion, understood. :)

thanks,
--
John Hubbard
NVIDIA