Re: [PATCH 1/3] fs, mm: add ->cachestat() file operation
From: Amir Goldstein
Date: Tue Jun 23 2026 - 11:35:12 EST
On Tue, Jun 23, 2026 at 4:55 PM Pavel Tikhomirov
<ptikhomirov@xxxxxxxxxxxxx> wrote:
>
>
>
> On 6/23/26 15:48, Johannes Weiner wrote:
> > On Tue, Jun 23, 2026 at 01:14:48PM +0200, Pavel Tikhomirov wrote:
> >> The cachestat() syscall reads page cache statistics straight from the
> >> file's f_mapping. Stackable filesystems such as overlayfs keep the data
> >> pages in an underlying inode's mapping rather than in the overlay
> >> inode's, so cachestat() reports all zeroes for them.
> >>
> >> Add a ->cachestat() file operation and route the syscall through a new
> >> vfs_cachestat() helper that calls it when present, falling back to
> >> file's f_mapping otherwise. This lets stackable filesystems forward the
> >> query to the file that actually owns the page cache. No behaviour change
> >> for regular files.
> >>
> >> Signed-off-by: Pavel Tikhomirov <ptikhomirov@xxxxxxxxxxxxx>
> >> ---
> >> Note: Memset change might be a bit tricky, I moved it to no
> >> ->cachestat() path to avoid multiple memset on nested overlayfs, that
> >> means that ->cachestat() is expected to be able to handle unitialized
> >> cs.
> >> ---
> >> include/linux/fs.h | 10 ++++++++++
> >> mm/filemap.c | 43 +++++++++++++++++++++++++++++++++++--------
> >> 2 files changed, 45 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/include/linux/fs.h b/include/linux/fs.h
> >> index 6da44573ce450..966b6564707e4 100644
> >> --- a/include/linux/fs.h
> >> +++ b/include/linux/fs.h
> >> @@ -53,6 +53,8 @@
> >>
> >> struct bdi_writeback;
> >> struct bio;
> >> +struct cachestat_range;
> >> +struct cachestat;
> >> struct io_comp_batch;
> >> struct fiemap_extent_info;
> >> struct kiocb;
> >> @@ -1963,6 +1965,8 @@ struct file_operations {
> >> struct file *file_out, loff_t pos_out,
> >> loff_t len, unsigned int remap_flags);
> >> int (*fadvise)(struct file *, loff_t, loff_t, int);
> >> + int (*cachestat)(struct file *file, struct cachestat_range *csr,
> >> + struct cachestat *cs);
> >
> > I suppose you can't just have it return the real file because of the
> > with_ovl_creds() scope you need during access?
>
> Yes, AFAIU in overlay when we use realfile we should always use it
> with_ovl_creds(), even though I don't think there is anything cred related
> in filemap_cachestat(), I still think we should follow the common pattern
> other overlay helpers use (similar to ovl_fadvise() and ovl_flush()).
>
> note: Actually some places get ovl_real_file() and use it without
> with_ovl_creds(), e.g.: ovl_read_iter, ovl_write_iter, ovl_splice_read,
> ovl_splice_write. But those look more of an exception than the general
> rule. All other instances use with_ovl_creds().
Use with_ovl_creds() is a good practice to keep the mental security model,
but it is useless if the security check (can_do_cachestat) is not in the
vfs helper (vfs_cachestat), so please move it there.
Also it kind of makes more sense to check (flags != 0) in sys_cachestats
before checking permissions.
>
> Also there are simingly no other file_operations which return "realfile"
> for further processing, mostly the operation from fsops simply replaces
> general operation with its own logic completely.
>
> Thanks for your review!
>
> ps: Hope overlay maintainers will correctly if I'm getting this wrong.
>
I don't think this is wrong per-se, except for can_do_cachestat().
Just be aware that the real file could change from one cachestat
call to the next (i.e. due to copy up).
Thanks,
Amir.