Re: [RFC] mm/migrate: make sure folio_unlock() before folio_wait_writeback()

From: Pedro Falcato
Date: Fri Oct 03 2025 - 10:05:49 EST


(Adding ext4 list to CC)

On Thu, Oct 02, 2025 at 01:38:59PM +0200, David Hildenbrand wrote:
> > To simplify the scenario:
> >
>
> Just curious, where is the __folio_start_writeback() to complete the
> picture?
>
> > context X (wq worker)                       context Y (process context)
> >
> >                                             migrate_pages_batch()
> > ext4_end_io_end()                           ...
> > ...                                         migrate_folio_unmap()
> > ext4_get_inode_loc()                        ...
> > ...                                         folio_lock() // hold the folio lock
> > bdev_getblk()                               ...
> > ...                                         folio_wait_writeback() // wait forever
> > __find_get_block_slow()
> > ...                                         ...
> > folio_lock() // wait forever
> > folio_unlock()                              migrate_folio_undo_src()
> >                                             ...
> > ...                                         folio_unlock() // never reachable
> > ext4_finish_bio()
> > ...
> > folio_end_writeback() // never reachable
> >
>
> But aren't you implying that it should from this point on be disallowed to
> call folio_wait_writeback() with the folio lock held? That sounds ... a bit
> wrong.
>
> Note that it is currently explicitly allowed: folio_wait_writeback()
> documents "If the folio is not locked, writeback may start again after
> writeback has finished.". So there is no way to prevent writeback from
> immediately starting again.
>
> In particular, wouldn't we have to fixup other callsites to make this
> consistent and then VM_WARN_ON_ONCE() assert that in folio_wait_writeback()?
>
> Of course, as we've never seen this deadlock before in practice, I do wonder
> if something else prevents it?

As far as I can tell, the folio under writeback and the folio that
__find_get_block() finds will _never_ be the same. ext4_end_io_end() is
called for folios in an inode's address_space, and bdev_getblk() is called for
metadata blocks in the block cache, i.e. the block device's own
address_space. For an actual deadlock here, the same folio would somehow have
to be in both an inode's address_space and the block cache, I think? Also,
AFAIK there is no way a folio can be removed from the page cache while under
writeback.
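
To spell that argument out, here is a toy userspace model in C. It is only a
sketch, not kernel code: every toy_* name is made up for illustration, and it
assumes the simplified two-context scenario from the diagram above. The
wait-for cycle can only close if the folio that context Y holds locked is both
the one context X still has to end writeback on and the one context X wants to
lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, NOT the kernel's struct address_space / struct folio. */
struct toy_mapping {
	const char *name;
};

struct toy_folio {
	const struct toy_mapping *mapping;	/* a folio sits in exactly one mapping */
	unsigned long index;
};

/*
 * Context Y (migration) holds the lock on @locked_by_y and waits for its
 * writeback to finish. Context X (end_io worker) is the one that will end
 * writeback on @writeback_owned_by_x, but first has to lock @wanted_by_x.
 * Both waits depend on each other only if all three are the same folio.
 */
bool toy_would_deadlock(const struct toy_folio *locked_by_y,
			const struct toy_folio *writeback_owned_by_x,
			const struct toy_folio *wanted_by_x)
{
	bool y_waits_on_x = (locked_by_y == writeback_owned_by_x);
	bool x_waits_on_y = (wanted_by_x == locked_by_y);

	return y_waits_on_x && x_waits_on_y;
}

/* The two cases under discussion. */
void toy_demo(void)
{
	struct toy_mapping inode_mapping = { "inode data" };
	struct toy_mapping bdev_mapping = { "block device" };
	struct toy_folio data = { &inode_mapping, 0 };
	struct toy_folio meta = { &bdev_mapping, 0 };

	/* Data folio under writeback, metadata folio looked up: no cycle. */
	assert(!toy_would_deadlock(&data, &data, &meta));

	/* Hypothetical: one folio in both roles, which is what the report implies. */
	assert(toy_would_deadlock(&data, &data, &data));
}
```

In this model the reported hang corresponds to the second assert only, i.e. to
a folio that plays the data and the metadata role at the same time.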

In any case, I added linux-ext4 so they can tell me how right/wrong I am.

--
Pedro