Re: [PATCH v7 04/11] readahead: allocate folios with mapping_min_order in readahead
From: Pankaj Raghav (Samsung)
Date: Mon Jun 17 2024 - 12:05:05 EST
On Mon, Jun 17, 2024 at 01:32:42PM +0100, Matthew Wilcox wrote:
> On Fri, Jun 14, 2024 at 09:26:02AM +0000, Pankaj Raghav (Samsung) wrote:
> > > Hm, but we don't have a reference on this folio. So this isn't safe.
> >
> > That is why I added a check for mapping after read_pages(). You are
> > right, we can make it better.
>
> That's not enough.
>
> > > > + if (mapping != folio->mapping)
> > > > + nr_pages = min_nrpages;
> > > > +
> > > > + VM_BUG_ON_FOLIO(nr_pages < min_nrpages, folio);
> > > > + ractl->_index += nr_pages;
> > >
> > > Why not just:
> > > ractl->_index += min_nrpages;
> >
> > Then we will only move by min_nrpages even if the folio we found had a
> > bigger order. Hannes' patch (the first patch) made sure we move
> > ractl->_index by folio_nr_pages() instead of 1, and making this change
> > would defeat the purpose because, without a mapping order set,
> > min_nrpages will be 1.
>
> Hannes' patch is wrong. It's not safe to call folio_nr_pages() unless
> you have a reference to the folio.
>
> > @@ -266,10 +266,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> > * alignment constraint in the page cache.
> > *
> > */
> > - if (mapping != folio->mapping)
> > - nr_pages = min_nrpages;
> > + nr_pages = max(folio_nr_pages(folio), (long)min_nrpages);
>
> No.
>
> > Now we will still move respecting the min order constraint but if we had
> > a bigger folio and we do have a reference, then we move folio_nr_pages.
>
> You don't have a reference, so it's never safe.
I am hitting my head now because you have literally mentioned that in
the comment:
* next batch. This page may be the one we would
* have intended to mark as Readahead, but we don't
* have a stable reference to this page, and it's
* not worth getting one just for that.
I will move it by min_nrpages as follows:
> ractl->_index += min_nrpages;
So the following from Hannes' patch can stay, as we have a stable
reference there:
ractl->_workingset |= folio_test_workingset(folio);
- ractl->_nr_pages++;
+ ractl->_nr_pages += folio_nr_pages(folio);
+ i += folio_nr_pages(folio);
}
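
To make the distinction concrete, here is a minimal userspace sketch (not
kernel code) of the two cases discussed above. The struct and both helper
names are hypothetical stand-ins invented for illustration; only the
arithmetic mirrors the thread. When a folio is found already present and we
hold no reference on it, the index advances by min_nrpages and nothing is read
from the folio. When the caller does hold a reference, the folio's page count
is stable and can be used for accounting.

```c
#include <assert.h>

/* Hypothetical stand-in for the readahead_control fields used here. */
struct ra_model {
	unsigned long index;	/* next page index to consider */
	unsigned int nr_pages;	/* pages accounted to the current batch */
};

/*
 * Slot occupied by a folio we hold no reference on: step over it by the
 * minimum folio size only. The folio itself is never dereferenced.
 */
static void ra_skip_unreferenced(struct ra_model *ra, unsigned int min_nrpages)
{
	assert(min_nrpages >= 1);
	ra->index += min_nrpages;
}

/*
 * Folio the caller holds a reference on: its size cannot change under us,
 * so it is used for accounting. Returns the advanced loop counter.
 */
static unsigned long ra_account_referenced(struct ra_model *ra,
					   unsigned long i,
					   unsigned int folio_pages)
{
	ra->nr_pages += folio_pages;
	return i + folio_pages;
}
```

With min_nrpages of 4, skipping an unreferenced slot at index 16 moves the
index to 20 regardless of how large the folio found there actually is.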
Thanks for the clarification.
--
Pankaj