Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment

From: Linus Torvalds
Date: Sat Oct 29 2022 - 16:16:41 EST


On Sat, Oct 29, 2022 at 12:39 PM John Hubbard <jhubbard@xxxxxxxxxx> wrote:
>
> ext4 has since papered over the problem, by soldiering on if it finds a
> page without writeback buffers when it expected to be able to writeback
> a dirty page. But you get the idea.

I suspect that "soldiering on" is the right thing to do, but yes, our
'mkdirty' vs 'mkclean' thing has always been problematic.

I think we always needed a page lock for it, but PG_lock itself
doesn't work (as mentioned earlier) because the VM can't serialize
with IO, and needs the lock to basically be a spinlock.

The page table lock kind of took its place, and then the rmap removal
makes for problems (since it is what basically ends up being the
shared place to look it up).

I can think of three options:

(a) filesystems just deal with it

(b) we could move the "page_remove_rmap()" into the "flush-and-free" path too

(c) we could actually add a spinlock (hashed on the page?) for this

I think (a) is basically our current expectation.

And (b) would be fairly easy - same model as that dirty bit patch,
just a 'do page_remove_rmap too' - except page_remove_rmap() wants the
vma as well (and we delay the TLB flush over multiple vma's, so it's
not just a "save vma in mmu_gather").
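
To make the shape of (b) concrete, here is a rough userspace toy model,
not kernel code. Every name in it (model_gather, model_gather_add, and
so on) is made up for illustration, and it assumes the simplest possible
fix for the vma problem: remember the vma per batched page rather than
once per gather. Whether that is acceptable in the real mmu_gather is
exactly the open question.

```c
#include <stddef.h>

#define MODEL_BATCH_MAX 8

/* Hypothetical stand-ins for the kernel structures; only what the model needs. */
struct model_vma  { int nr_rmap_removed; };
struct model_page { int mapcount; int freed; };

/* One batched page, together with the vma it was unmapped from. */
struct model_entry {
    struct model_page *page;
    struct model_vma  *vma;
};

struct model_gather {
    size_t nr;
    int tlb_flushes;
    struct model_entry entries[MODEL_BATCH_MAX];
};

/* Stand-in for the rmap removal: it needs both the page and the vma. */
static void model_remove_rmap(struct model_page *page, struct model_vma *vma)
{
    page->mapcount--;
    vma->nr_rmap_removed++;
}

/* Flush first, then drop the rmap and free, so the rmap outlives stale TLB entries. */
static void model_gather_flush(struct model_gather *g)
{
    g->tlb_flushes++; /* stand-in for the real TLB flush */
    for (size_t i = 0; i < g->nr; i++) {
        struct model_entry *e = &g->entries[i];

        model_remove_rmap(e->page, e->vma);
        if (e->page->mapcount == 0)
            e->page->freed = 1;
    }
    g->nr = 0;
}

/* Called at unmap time: only record the pair, do not touch the rmap yet. */
static void model_gather_add(struct model_gather *g,
                             struct model_page *page, struct model_vma *vma)
{
    if (g->nr == MODEL_BATCH_MAX)
        model_gather_flush(g);
    g->entries[g->nr].page = page;
    g->entries[g->nr].vma = vma;
    g->nr++;
}
```

The point of the model is only the ordering: pages from several vmas can
sit in one batch, and each one still has its own vma available when the
deferred rmap removal finally runs after the flush.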

Doing (c) doesn't look hard, except for the "new lock" thing, which is
always a potential huge disaster. If it's only across set_page_dirty()
and page_mkclean(), though, and uses some simple page-based hash, it
sounds fairly benign.
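
For illustration only, a minimal userspace sketch of what (c) could look
like, using C11 atomics instead of the kernel's spinlock_t. The table
size, the hash, and all the toy_* / dirty_* names are assumptions made
up for this sketch; the real set_page_dirty() and page_mkclean() do far
more than flip a flag.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DIRTY_LOCK_BITS  6
#define DIRTY_LOCK_COUNT (1u << DIRTY_LOCK_BITS)

/* Toy stand-in for struct page; only the dirty state matters here. */
struct toy_page {
    bool dirty;
};

/* Static storage is zero-initialized, which means "unlocked". */
static atomic_bool dirty_locks[DIRTY_LOCK_COUNT];

/* Simple page-based hash: multiplicative hash of the page address. */
static unsigned int dirty_lock_index(const struct toy_page *page)
{
    uint64_t h = (uint64_t)((uintptr_t)page >> 4) * 0x9E3779B97F4A7C15ull;

    return (unsigned int)(h >> (64 - DIRTY_LOCK_BITS));
}

static void dirty_lock(const struct toy_page *page)
{
    atomic_bool *l = &dirty_locks[dirty_lock_index(page)];

    /* Spin until we are the one who changed the flag from false to true. */
    while (atomic_exchange_explicit(l, true, memory_order_acquire))
        ;
}

static void dirty_unlock(const struct toy_page *page)
{
    atomic_store_explicit(&dirty_locks[dirty_lock_index(page)], false,
                          memory_order_release);
}

/* Toy counterpart of set_page_dirty(): mark dirty under the hashed lock. */
static void toy_set_page_dirty(struct toy_page *page)
{
    dirty_lock(page);
    page->dirty = true;
    dirty_unlock(page);
}

/* Toy counterpart of page_mkclean(): returns whether the page was dirty. */
static bool toy_page_mkclean(struct toy_page *page)
{
    bool was_dirty;

    dirty_lock(page);
    was_dirty = page->dirty;
    page->dirty = false;
    dirty_unlock(page);
    return was_dirty;
}
```

The lock is held only across the two short critical sections, and
unrelated pages that hash to the same slot merely contend briefly, which
is the property that makes the "new lock" sound benign here.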

Linus