Re: [PATCH v3 7/7] mm: switch deferred split shrinker to list_lru

From: Johannes Weiner

Date: Mon Mar 30 2026 - 12:52:25 EST


On Fri, Mar 27, 2026 at 03:51:07PM +0800, Kairui Song wrote:
> On Thu, Mar 19, 2026 at 4:05 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > @@ -4651,13 +4651,19 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf)
> > 	while (orders) {
> > 		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
> > 		folio = vma_alloc_folio(gfp, order, vma, addr);
> > -		if (folio) {
> > -			if (!mem_cgroup_swapin_charge_folio(folio, vma->vm_mm,
> > -							    gfp, entry))
> > -				return folio;
> > +		if (!folio)
> > +			goto next;
> > +		if (mem_cgroup_swapin_charge_folio(folio, vma->vm_mm, gfp, entry)) {
> > 			count_mthp_stat(order, MTHP_STAT_SWPIN_FALLBACK_CHARGE);
> > 			folio_put(folio);
> > +			goto next;
> > 		}
> > +		if (folio_memcg_list_lru_alloc(folio, &deferred_split_lru, gfp)) {
> > +			folio_put(folio);
> > +			goto fallback;
> > +		}
>
> Hi Johannes,
>
> I haven't checked every detail yet, but one question here, which might
> be trivial: would it be better to fall back to the next order instead
> of falling back to order 0 directly? Suppose this is a 2M allocation
> and 1M fallback is allowed; releasing that folio and falling back to
> 1M would free 1M of memory, which could well be enough for the
> list_lru metadata allocation to succeed.

I would be surprised if that mattered. If we can get a 2M folio but
then fail a couple of small slab requests, there are probably such
extreme levels of concurrency and pressure on the freelists that the
fault has a good chance of failing altogether and OOMing.

And if it doesn't matter, then let's consider it from a code clarity
point of view. For folio allocation and charging, we reduce the size
and try again. But the list_lru allocation is always the same size -
it would look weird to just retry it on failure. And if we did so
based on the logic you lay out above, it would need a comment too...