Re: [PATCH v10 2/3] cachestat: implement cachestat syscall

From: Nhat Pham
Date: Fri Mar 03 2023 - 01:56:09 EST


On Sun, Feb 19, 2023 at 4:21 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Sat, Feb 18, 2023 at 11:33:17PM -0800, Nhat Pham wrote:
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index e654435f1651..83300f1491e7 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -75,6 +75,7 @@ struct fs_context;
> > struct fs_parameter_spec;
> > struct fileattr;
> > struct iomap_ops;
> > +struct cachestat;
> >
> > extern void __init inode_init(void);
> > extern void __init inode_init_early(void);
> > @@ -830,6 +831,8 @@ void filemap_invalidate_lock_two(struct address_space *mapping1,
> > struct address_space *mapping2);
> > void filemap_invalidate_unlock_two(struct address_space *mapping1,
> > struct address_space *mapping2);
> > +void filemap_cachestat(struct address_space *mapping, pgoff_t first_index,
> > + pgoff_t last_index, struct cachestat *cs);
>
> 1. Why is this in fs.h instead of pagemap.h?
>
> 2. Why is it not static, since it's only used by the syscall,
> which is also in filemap.c?
>
> > @@ -55,6 +56,9 @@
> > #include <linux/buffer_head.h> /* for try_to_free_buffers */
> >
> > #include <asm/mman.h>
> > +#include <uapi/linux/mman.h>
>
> I think this hunk should be:
>
> -#include <asm/mman.h>
> +#include <linux/mman.h>
>
> (linux/mman.h includes uapi/linux/mman.h, which includes asm/mman.h)
>
> > +/**
> > + * filemap_cachestat() - compute the page cache statistics of a mapping
> > + * @mapping: The mapping to compute the statistics for.
> > + * @first_index: The starting page cache index.
> > + * @last_index: The final page index (inclusive).
> > + * @cs: the cachestat struct to write the result to.
> > + *
> > + * This will query the page cache statistics of a mapping in the
> > + * page range of [first_index, last_index] (inclusive). The statistics
> > + * queried include: number of dirty pages, number of pages marked for
> > + * writeback, and the number of (recently) evicted pages.
> > + */
>
> Do we care that this isn't going to work for hugetlbfs?

I ran a quick test using hugetlbfs. It looks like the current
implementation is treating it in accordance to the multi-page
folio case we discussed earlier, i.e:

- Returned number of "pages" is in terms of the number of
base/small pages (i.e 512 dirty pages instead of 1 dirty
huge page etc.)
- If we touch one byte in the huge page, it would report the
entire huge page as dirty, but again in terms of the underlying
pages.

Is this what you have in mind, or is there another edge
case that I'm missing...?

Thanks for the comments and suggestions, Matthew!


>
> > + rcu_read_lock();
> > + xas_for_each(&xas, folio, last_index) {
> > + unsigned long nr_pages;
> > + pgoff_t folio_first_index, folio_last_index;
> > +
> > + if (xas_retry(&xas, folio))
> > + continue;
> > +
> > + nr_pages = folio_nr_pages(folio);
> > + folio_first_index = folio_pgoff(folio);
> > + folio_last_index = folio_first_index + nr_pages - 1;
> > +
> > + /* Folios might straddle the range boundaries, only count covered subpages */
>
> s/subpages/pages/
>