Re: Spooling large metadata updates / Proposal for a new API/feature in the Linux Kernel (VFS/Filesystems)

From: Matthew Wilcox
Date: Sun Jan 12 2025 - 06:59:06 EST


On Sun, Jan 12, 2025 at 12:27:43AM -0500, Theodore Ts'o wrote:
> So yes, it basically exists, although in practice, it doesn't work as
> well as you might think, because of the need to read potentially a
> large number of the metadata blocks. But for example, if you make sure
> that all of the inode information is already cached, e.g.:
>
> ls -lR /path/to/large/tree > /dev/null
>
> Then the operation to do a bulk update will be fast:
>
> time chown -R root:root /path/to/large/tree
>
> This demonstrates that the bottleneck tends to be *reading* the
> metadata blocks, not *writing* the metadata blocks.

So if we presented more of the operations to the kernel at once, it
could pipeline the reading of the metadata, providing a user-visible
win.

However, I don't know that we need a new user API to do it. This is
something that could be done in the "rm" tool itself; it already has
the information it needs, and heuristics like "how far to read ahead"
are better placed in userspace than in the kernel.