Re: [PATCH v4 2/5] mm/zswap: Factor writeback loop out of shrink_worker()

From: Hao Jia

Date: Mon Jun 29 2026 - 07:54:30 EST




On 2026/6/27 01:09, Yosry Ahmed wrote:
/*
* Take one step of a memcg-tree writeback walk driven by the caller's
* iterator, and fold the result into @s, the retry bookkeeping shared
* across steps. @memcg is the iterator's current memcg, or NULL once
* it has wrapped around after a full pass over the tree.
*
* The function returns -EAGAIN to signal the caller to abort the walk
* after encountering the following conditions MAX_RECLAIM_RETRIES times:
* - No writeback-candidate memcgs were found in a memcg tree walk.
* - Shrinking a writeback-candidate memcg failed.

Orthogonal to this patch, but I wonder if this can be simplified. I
wonder if these two conditions can be replaced with "shrinking a memcg
that has zswap entries failed". The "no writeback-candidate memcgs in
the tree" case seems like we should abort right away instead of
retrying?

Nhat, WDYT?


Perhaps something like the following is what you had in mind? I've
drafted the implementation below to make it easier for Nhat to compare
with the previous behavior.

Hmm I think if we pursue this it should be in a separate patch or even
outside of this series, ideally with numbers/proof that it's not
introducing regressions to the scenario that lead to its introduction.


Got it. I will temporarily leave this part out of the current patch series.



*
* Return: The number of compressed bytes written back (>= 0), or -EAGAIN
* once the retry budget is exhausted and the caller should abort the walk.
*/
static long zswap_shrink_one(struct mem_cgroup *memcg,

Nit: zswap_shrink_one_memcg()

BTW, the existing writeback logic has been broken for a while now when
memcg is disabled. I think we constantly hit the !memcg case and run
out of retries. Not sure if your patch changes this in any way, or if
you want to fix that while you're at it :)

Yes, I'd be happy to do that. However, would it be better to submit a
separate fix patch or combine it with this one?

A separate patch. Feel free to send it with this series to avoid
conflicts, but probably as patch 1 as we'll want to CC stable on it.


Done, and I've just submitted the v5 patch.

[v5] https://lore.kernel.org/all/20260629112032.20423-1-jiahao.kernel@xxxxxxxxx

Please take a look when you have a chance. Thank you very much for your review!

[..]

/* Track progress of a memcg-tree writeback walk. */
struct zswap_shrink_state {
int scans;
int failures;
};

/*
* Take one step of a memcg-tree writeback walk driven by the caller's
* iterator, and fold the result into @s, the retry bookkeeping shared
* across steps. @memcg is the iterator's current memcg, or NULL once
* it has wrapped around after a full pass over the tree.
*
* The function returns -EBUSY to signal the caller to abort the walk when
* either of the following occurs:
* - A full pass over the tree found no writeback-candidate memcg.
* - Shrinking a writeback-candidate memcg failed MAX_RECLAIM_RETRIES
times.
*
* When memory cgroup is disabled, the iterator always yields NULL. All
* zswap entries then live on the root list_lru, so NULL is treated as the
* root memcg and shrunk directly rather than as a completed tree pass.

I think this chunk should be moved above the code returning -EBUSY
when mem_cgroup_disabled() is true, and probably made more succinct as
it should be obvious.

Done in v5.


*
* Return: The number of compressed bytes written back (>= 0), or -EBUSY
* when the caller should abort the walk.
*/
static long zswap_shrink_one_memcg(struct mem_cgroup *memcg,
struct zswap_shrink_state *s)
{
bool disabled = mem_cgroup_disabled();

No need to store this in a variable AFAICT, it's a static branch and
it's clearer to just call it directly in both call sites imo.

Done, and I've just submitted the v5 patch.

Thanks,
Hao

long shrunk;

/*
* If the iterator has completed a full pass, update the shrink state
* and check whether we should keep going.
* With memcg disabled the iterator always yields NULL, so fall through
* and shrink the root memcg directly instead.
*/
if (!memcg && !disabled) {
/*
* Abort if no writeback-candidate memcgs in the last tree walk.
* Otherwise reset the scans count and continue.
*/
if (!s->scans)
return -EBUSY;
s->scans = 0;
return 0;
}

shrunk = shrink_memcg(memcg, NR_ZSWAP_WB_BATCH);

/*
* There are no writeback-candidate pages in the memcg. With memcg
* enabled this is not an issue as long as we can find another memcg
* with pages in zswap, so skip without counting it as a candidate.
* With memcg disabled the root LRU is the only target, so we should
* abort if it has no writeback-candidate pages.
*/
if (shrunk == -ENOENT)
return disabled ? -EBUSY : 0;
s->scans++;

if (shrunk <= 0 && ++s->failures == MAX_RECLAIM_RETRIES)
return -EBUSY;

return shrunk;
}