Re: [PATCH] mm/migrate: fix deadlock in migrate_pages_batch() on large folios

From: Andrew Morton
Date: Sun Jul 28 2024 - 15:50:14 EST


On Sun, 28 Jul 2024 23:49:13 +0800 Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:

> Currently, migrate_pages_batch() can lock multiple folios in an
> arbitrary order. Although folio_trylock() is used to avoid deadlock,
> as commit 2ef7dbb26990 ("migrate_pages: try migrate in batch
> asynchronously firstly") mentioned, try_split_folio() still seems to
> be missing that treatment.

Am I correct in believing that folio_lock() doesn't have lockdep coverage?

> It was found by a compaction stress test when I explicitly enabled EROFS
> compressed files to use large folios; I cannot reproduce it with
> the same workload if large folio support is off (current mainline).
> Typically, filesystem reads (with locked file-backed folios) could use
> another bdev/meta inode to load some other I/Os (e.g. inode extent
> metadata or caching compressed data), so the locking order will be:

Which kernels need fixing?  Do we expect that any code paths in 6.10 or
earlier are vulnerable to this?