Re: [RFC PATCH 01/26] block: bdev: blockdev page cache is movable
From: Matthew Wilcox
Date: Wed Apr 19 2023 - 00:08:02 EST
On Tue, Apr 18, 2023 at 03:12:48PM -0400, Johannes Weiner wrote:
> While inspecting page blocks for the type of pages in them, I noticed
> a large number of blockdev cache in unmovable blocks. However, these
> pages are actually on the LRU, and their mapping has a .migrate_folio
> callback; they can be reclaimed and compacted as necessary.
Wise to split this out into a separate patch. Perhaps we can get it
into -next for a while to shake out any problems with it. I don't have
any specific code that I think is broken, but code like this is in ext2:

	bh = sb_bread(sb, logic_sb_block);
	es = (struct ext2_super_block *) (((char *)bh->b_data) + offset);
	sbi->s_es = es;

i.e. it reads the block into the page cache and then keeps a pointer
into it for the lifetime of the mount.
This specific example is, I believe, safe. There's a refcount on
the buffer (released by brelse() at unmount) and so the page cannot
be migrated.
But that speaks to a different problem: sometimes buffers are held
pinned for short periods of time (e.g. reading a directory, modifying a
bitmap) and other times they're held pinned for a long period of time
(a superblock). We notice when pages are being long-term pinned (e.g.
GUP) and migrate them out of the MOVABLE zone when that happens.
Perhaps we need something similar for buffer heads where the filesystem
can specify if it's just having a quick look or if it intends for this
buffer to be pinned over, let's say, a return to userspace.
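
To make the idea concrete, here is a rough userspace toy model of what
such a distinction might look like. It is only a sketch: every name in
it (toy_buffer, toy_get, TOY_PIN_LONGTERM and so on) is hypothetical
and none of it is an existing kernel interface. It assumes a held
reference blocks migration, and that a long-term pin first moves the
page out of the movable area, loosely in the spirit of what is done
for long-term GUP pins.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model. None of these names exist in the kernel. */
enum toy_zone { TOY_ZONE_MOVABLE, TOY_ZONE_UNMOVABLE };

/* Hypothetical hint the filesystem would pass when taking a buffer. */
enum toy_pin { TOY_PIN_SHORT, TOY_PIN_LONGTERM };

struct toy_buffer {
	int refcount;        /* elevated while someone holds the buffer */
	enum toy_zone zone;  /* where the backing page currently lives */
	int migrations;      /* how many times the page has been moved */
};

/* Migration only succeeds while nobody holds a reference. */
static bool toy_migrate(struct toy_buffer *b, enum toy_zone dst)
{
	if (b->refcount > 0)
		return false;
	if (b->zone != dst) {
		b->zone = dst;
		b->migrations++;
	}
	return true;
}

/*
 * Take a reference. A long-term pin first tries to move the page out
 * of the movable area; if that fails (the buffer is already held),
 * the pin is refused. A real implementation might retry instead.
 */
static bool toy_get(struct toy_buffer *b, enum toy_pin how)
{
	if (how == TOY_PIN_LONGTERM && b->zone == TOY_ZONE_MOVABLE) {
		if (!toy_migrate(b, TOY_ZONE_UNMOVABLE))
			return false;
	}
	b->refcount++;
	return true;
}

/* Drop a reference, as brelse() would at the end of the access. */
static void toy_put(struct toy_buffer *b)
{
	assert(b->refcount > 0);
	b->refcount--;
}
```

In this model a superblock-style user would call
toy_get(b, TOY_PIN_LONGTERM) once at mount and toy_put(b) at unmount,
so the page leaves the movable area before it becomes unmovable in
practice. A directory read would use TOY_PIN_SHORT and leave the page
where it is, blocking migration only for the duration of the access.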