Re: [PATCH v9 3/6] mm: memcontrol: add interface for swap tier selection
From: Yosry Ahmed
Date: Tue Jun 23 2026 - 14:11:10 EST
On Mon, Jun 22, 2026 at 5:40 PM Joshua Hahn <joshua.hahnjy@xxxxxxxxx> wrote:
>
> On Mon, 22 Jun 2026 23:46:31 +0000 Yosry Ahmed <yosry@xxxxxxxxxx> wrote:
>
> > > > > If that is the case, I think auto-scaling makes sense but can be a bit
> > > > > tricky, since there is no universal tiered ratio; each workload will
> > > > > have different tiers it can swap to, so they will all have to calculate
> > > > > their own ratios. Tiered memory limits escape this difficulty since we
> > > > > assume all memory can be placed on all tiers, so we have a system-wide
> > > > > ratio :-)
> > > >
> > > > Hmm I don't follow. It's also possible (maybe not initially) that a
> > > > memcg cannot use specific memory tiers, right? I am not sure what the
> > > > difference is.
> > >
> > > You're right, I was speaking more to the current state of memory tiers.
> > > The majority of the feedback I received was that we already have too
> > > many memcg knobs, so I just opted to make tiered memcg limits a
> > > cgroup mount, with no ability for individual memcgs to tune their
> > > limits or opt-in/out.
> >
> > Right, I think this is similar to the approach taken here. We have a
> > single interface for per-tier limits. The main difference is that we're
> > allowing 0/max values to disable/enable different swap tiers per-memcg,
> > as there's a use case for that.
> >
> > Seems like for memory tiering there's no use case for that yet.
>
> Yes, I would agree with that.
>
> > > What do you think Yosry? Would it make sense for us to be able to
> > > tune these values? Personally I think it makes sense but just wanted to
> > > make the basic features merged before I went to push for making those
> > > knobs tunable.
> >
> > Right now we're not proposing to allow tuning swap tier limits either,
> > just enable or disable a tier. My main question is about the default
> > values.
> >
> > IIUC, for memory tiering, if you set memory.max, then the limits for
> > tiers are auto-scaled. I think it makes sense to do the same for swap
> > tiers for consistency. Or am I wrong about the memory tiering limits
> > behavior?
>
> No, you're right about that. Sorry for steering the thread to my
> series ;-)
>
> To get back to the question of how the auto-tuning should work, the
> main question is which ratio we scale the swap limits to.
> Do we set the swap limits proportional to how much swap is present
> in the system, or how much swap is available to the cgroup?
>
> So if we have 3 swap tiers A, B, C, with 50G, 30G, and 20G capacity
> respectively, how much should a cgroup with swap.max = 10G have if
> it is limited to tiers A and B?
>
> This is what I was getting at earlier when I said we have to calculate
> different ratios for different cgroups, based on what tiers they have
> access to.

That's a good question. I think the case that is particularly
interesting is whether or not the limits of other tiers should change
when another tier is disabled/enabled.

So basically in your example, assuming everything starts as "max",
when swap.max is set to 10G, the autoscaled limits would be: (tier A,
5G), (tier B, 3G), (tier C, 2G). Now the question becomes, if
userspace sets the limit of tier C to 0, should the limits for tiers A
and B change?

On one hand, it's simpler to just keep the autoscaled limits unchanged
in this case. However, this means that the effective swap limit is now
8G (5G + 3G) even though swap.max is still 10G, which is not great :/
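
To make the arithmetic concrete, here is a rough Python sketch of the
"keep the other limits unchanged" option. This is not kernel code; the
autoscale() helper and the tier names are made up for illustration, and
it assumes limits are scaled by each tier's share of total swap capacity.

```python
GIB = 1 << 30

def autoscale(swap_max, capacities):
    """Split swap_max across tiers in proportion to tier capacity.

    Hypothetical helper: assumes the ratio is taken over all tiers
    present in the system, as in the 50G/30G/20G example.
    """
    total = sum(capacities.values())
    return {tier: swap_max * cap // total for tier, cap in capacities.items()}

capacities = {"A": 50 * GIB, "B": 30 * GIB, "C": 20 * GIB}

# Everything starts as "max", then swap.max is set to 10G.
limits = autoscale(10 * GIB, capacities)

# Userspace disables tier C; the limits of A and B are left untouched.
limits["C"] = 0

# The cgroup can now only ever use the sum of the remaining limits.
effective = sum(limits.values())
```

With these inputs, A and B stay at 5G and 3G, so the sum of the
per-tier limits is 8G while swap.max still says 10G.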

The alternative is to recalculate all the limits when one of them
changes, in which case the limits of A and B would change to 6.25G and
3.75G. But I don't know if this will work well if we allow custom
limits. What happens if the limit of tier C is written as 1 (or 4096)
instead of 0? It's effectively the same scenario, but the tier is
technically allowed.
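
As a sketch of why this gets awkward, here is one possible recalculation
policy in Python. Again, this is not kernel code: limits_with_overrides()
is a made-up name, and the rule it implements (only a limit of exactly 0
removes a tier from the ratio, any other written value is kept as-is) is
just an assumption to illustrate the problem.

```python
GIB = 1 << 30

def limits_with_overrides(swap_max, capacities, overrides):
    """Recompute per-tier limits after userspace writes some of them.

    Assumed policy (hypothetical): a tier written as 0 is dropped from
    the ratio, a tier written as anything else keeps that value, and
    untouched tiers are scaled by capacity over the remaining tiers.
    """
    enabled = {t: c for t, c in capacities.items() if overrides.get(t) != 0}
    total = sum(enabled.values())
    limits = {}
    for tier, cap in capacities.items():
        if tier not in enabled:
            limits[tier] = 0
        elif tier in overrides:
            limits[tier] = overrides[tier]
        else:
            limits[tier] = swap_max * cap // total
    return limits

capacities = {"A": 50 * GIB, "B": 30 * GIB, "C": 20 * GIB}

# Tier C written as 0: A and B are rescaled over 80G of capacity.
disabled = limits_with_overrides(10 * GIB, capacities, {"C": 0})

# Tier C written as 4096: C still counts as enabled, nothing is rescaled.
tiny = limits_with_overrides(10 * GIB, capacities, {"C": 4096})
```

Under this policy, writing 0 gives A and B 6.25G and 3.75G (10G in
total), while writing 4096 leaves them at 5G and 3G, so the cgroup ends
up with roughly 8G of usable swap in a scenario that is effectively the
same.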

The more I think about it, the more I realize it may be best to drop
the autoscaling thing. I imagine memory tiering might run into similar
issues too :/