Re: [PATCH v3 0/6] add mTHP support for anonymous shmem
From: Daniel Gomez
Date: Tue Jun 04 2024 - 04:19:06 EST
On Fri, May 31, 2024 at 01:13:48PM +0200, David Hildenbrand wrote:
> > >
> > > As a default, we should not be using large folios / mTHP for any shmem,
> > > just like we did with THP via shmem_enabled. This is what this series
> > > currently does, and is aprt of the whole mTHP user-space interface design.
> > >
> > > Further, the mTHP controls should control all of shmem, not only
> > > "anonymous shmem".
> >
> > Yes, that's what I thought and in my TODO list.
>
> Good, it would be helpful to coordinate with Daniel and Pankaj.
I've integrated patches 11 and 12 from the lsf RFC thread [1] on top of Baolin's
v3 patches. You may find a version in my integration branch here [2]. I can
attach them here if it's preferred.
[1] https://lore.kernel.org/all/20240515055719.32577-1-da.gomez@xxxxxxxxxxx/
[2] https://gitlab.com/dkruces/linux-next/-/commits/next-20240604-shmem-mthp
The point here is to combine the large folios strategy I proposed with mTHP
user controls. Would it make sense to limit the orders to the mapping order
calculated based on the size and index?
@@ -1765,6 +1798,10 @@ static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
order = highest_order(suitable_orders);
while (suitable_orders) {
+ if (order > mapping_order) {
+ order = next_order(&suitable_orders, order);
+ continue;
+ }
pages = 1UL << order;
index = round_down(index, pages);
folio = shmem_alloc_folio(gfp, order, info, index);
Note: The branch still need to be adapted to include !anon mm.
>
> >
> > >
> > > Also, we should properly fallback within the configured sizes, and not
> > > jump "over" configured sizes. Unless there is a good reason.
> > >
> > > (3) khugepaged
> > >
> > > khugepaged needs to handle larger folios properly as well. Until fixed,
> > > using smaller THP sizes as fallback might prohibit collapsing a
> > > PMD-sized THP later. But really, khugepaged needs to be fixed to handle
> > > that. >
> > > (4) force/disable
> > >
> > > These settings are rather testing artifacts from the old ages. We should
> > > not add them to the per-size toggles. We might "inherit" it from the
> > > global one, though.
> >
> > Sorry, I missed this. So I thould remove the 'force' and 'deny' option
> > for each mTHP, right?
>
> Yes, that's my understanding. But we have to keep them on the top level for
> any possible user out there.
>
> >
> > >
> > > "within_size" might have value, and especially for consistency, we
> > > should have them per size.
> > >
> > >
> > >
> > > So, this series only tackles anonymous shmem, which is a good starting
> > > point. Ideally, we'd get support for other shmem (especially during
> > > fault time) soon afterwards, because we won't be adding separate toggles
> > > for that from the interface POV, and having inconsistent behavior
> > > between kernel versions would be a bit unfortunate.
> > >
> > >
> > > @Baolin, this series likely does not consider (4) yet. And I suggest we
> > > have to take a lot of the "anonymous thp" terminology out of this
> > > series, especially when it comes to documentation.
> >
> > Sure. I will remove the "anonymous thp" terminology from the
> > documentation, but want to still keep it in the commit message, cause I
> > want to start from the anonymous shmem.
>
> For commit message and friends makes sense. The story should be "controls
> all of shmem/tmpfs, but support will be added iteratively. The first step is
> anonymous shmem."
>
> --
> Cheers,
>
> David / dhildenb
>