Re: [PATCH v2 2/5] mm: zswap: calculate limits only when updated

From: David Hildenbrand
Date: Mon Apr 15 2024 - 15:15:38 EST


On 15.04.24 20:30, Yosry Ahmed wrote:
On Mon, Apr 15, 2024 at 8:10 AM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 13.04.24 03:05, Yosry Ahmed wrote:
On Fri, Apr 12, 2024 at 12:48 PM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 10.04.24 02:52, Yosry Ahmed wrote:
[..]
Do we need a separate notifier chain for totalram_pages() updates?

Good question. I actually might have the requirement to notify some arch
code (s390x) from virtio-mem when fake adding/removing memory, and
already wondered how to best wire that up.

Maybe we can squeeze that into the existing notifier chain, but needs a
bit of thought.


Sorry for the late reply, I had to think about this a bit.

Do you mean by adding new actions (e.g. MEM_FAKE_ONLINE,
MEM_FAKE_OFFLINE), or by reusing the existing actions (MEM_ONLINE,
MEM_OFFLINE, etc.)?

At least for virtio-mem, I think we could have a MEM_ONLINE/MEM_OFFLINE
that prepares the whole range belonging to the Linux memory block
(/sys/devices/system/memory/memory...) to go online, and then have
something like MEM_SOFT_ONLINE/MEM_SOFT_OFFLINE or
ENABLE_PAGES/DISABLE_PAGES ... notifications when parts become usable
(!PageOffline, handed to the buddy) or unusable (PageOffline, removed
from the buddy).

There are some details to be figured out, but it could work.

And as virtio-mem currently operates in pageblock granularity (e.g., 2
MiB), but frequently handles multiple contiguous pageblocks within a
Linux memory block, it's not that bad.
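
To illustrate the range-based notifications described above, here is a
minimal userspace sketch in C. It is not kernel code and does not use the
real memory notifier API: the struct and function names (soft_mem_chain,
soft_mem_register, soft_mem_notify_range, count_usable) are hypothetical,
and MEM_SOFT_ONLINE/MEM_SOFT_OFFLINE are only the names floated in this
thread, not existing actions. Each notification carries an exact PFN range,
so a listener could reinitialize state for just that range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical actions, named after the proposal in this thread. */
enum soft_mem_action {
	MEM_SOFT_ONLINE,	/* range became usable (handed to the buddy) */
	MEM_SOFT_OFFLINE,	/* range became unusable (removed from the buddy) */
};

/* Payload: the exact range that changed state. */
struct soft_mem_notify {
	unsigned long start_pfn;
	unsigned long nr_pages;
};

typedef void (*soft_mem_cb)(enum soft_mem_action action,
			    const struct soft_mem_notify *arg, void *priv);

#define MAX_LISTENERS 8

/* A trivial fixed-size listener list standing in for a notifier chain. */
struct soft_mem_chain {
	soft_mem_cb cb[MAX_LISTENERS];
	void *priv[MAX_LISTENERS];
	size_t nr;
};

/* Returns 0 on success, -1 if the chain is full. */
int soft_mem_register(struct soft_mem_chain *chain, soft_mem_cb cb, void *priv)
{
	if (chain->nr == MAX_LISTENERS)
		return -1;
	chain->cb[chain->nr] = cb;
	chain->priv[chain->nr] = priv;
	chain->nr++;
	return 0;
}

/* One call per contiguous range, e.g. several pageblocks at once. */
void soft_mem_notify_range(struct soft_mem_chain *chain,
			   enum soft_mem_action action,
			   unsigned long start_pfn, unsigned long nr_pages)
{
	struct soft_mem_notify arg = { start_pfn, nr_pages };

	assert(nr_pages > 0);
	for (size_t i = 0; i < chain->nr; i++)
		chain->cb[i](action, &arg, chain->priv[i]);
}

/* Example listener: tracks how many pages are currently usable. */
struct usable_counter {
	unsigned long usable_pages;
	unsigned long calls;
};

void count_usable(enum soft_mem_action action,
		  const struct soft_mem_notify *arg, void *priv)
{
	struct usable_counter *c = priv;

	if (action == MEM_SOFT_ONLINE)
		c->usable_pages += arg->nr_pages;
	else
		c->usable_pages -= arg->nr_pages;
	c->calls++;
}
```

Assuming 4 KiB pages, a 2 MiB pageblock is 512 pages, so onlining four
contiguous pageblocks costs a single callback per listener instead of 2048.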


But the issue I see with ballooning is that we often operate at page
granularity there. While we could optimize some cases, we might get quite
a bit of overhead from all the notifications. Alternatively, we could send
a list of pages, but it won't win a beauty contest.

I think the main issue is that, for my purpose (virtio-mem on s390x), I
need to notify about the exact memory ranges (so I can reinitialize
stuff in s390x code when memory gets effectively re-enabled). For other
cases (total pages changing), we don't need the memory ranges, but only
the "summary" -- or a notification afterwards that the total pages were
just changed quite a bit.


Thanks for shedding some light on this. Although I am not familiar
with ballooning, sending notifications on page granularity updates
sounds terrible. It seems like this is not as straightforward as I had
anticipated.

I was going to take a stab at this, but given that the motivation is a
minor optimization on the zswap side, I will probably just give up :)

Oh no, so I have to do the work! ;)


For now, I will drop this optimization from the series, and I
can revisit it if/when notifications for totalram_pages() are
implemented at some point. Please CC me if you do so for the s390x use
case :)

I primarily care about virtio-mem resizing VM memory and adjusting
totalram_pages(); memory ballooning is rather a hack for that
use case ... so we're in agreement :)

Likely we'd want two notification mechanisms, but no matter how I look
at it, it's all a bit ugly.

I am assuming you mean one with exact memory ranges for your s390x use
case, and one high-level mechanism for totalram_pages() updates -- or
did I miss the point?

No, that's it.


I am interested to see how page granularity updates would be handled
in this case. Perhaps they are only relevant for the high-level
mechanism? In that case, I suppose we can batch updates and notify
once when a threshold is crossed or something.

Yes, we'd batch updates.
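
A minimal userspace sketch of such batching, in C, under stated
assumptions: the names (totalram_batcher, totalram_adjust, limit_cache,
recalc_limit) and the threshold value are hypothetical, and nothing here is
an existing kernel interface. Page-granularity changes only update a
counter; listeners are called once the total has drifted by at least
`threshold` pages since the last notification. The example listener caches
a limit derived from the total, in the spirit of the zswap optimization
this series wanted.

```c
#include <assert.h>
#include <stdlib.h>

struct totalram_batcher {
	long total_pages;	/* current total, updated on every change */
	long notified_pages;	/* total last reported to the listener */
	long threshold;		/* assumed tunable: minimum drift before notifying */
	void (*notify)(long new_total, void *priv);
	void *priv;
};

/* Apply a (possibly single-page) change; notify only on enough drift. */
void totalram_adjust(struct totalram_batcher *b, long delta_pages)
{
	assert(b->threshold > 0);

	b->total_pages += delta_pages;
	if (labs(b->total_pages - b->notified_pages) >= b->threshold) {
		b->notified_pages = b->total_pages;
		if (b->notify)
			b->notify(b->total_pages, b->priv);
	}
}

/* Example listener: a cached limit expressed as a percentage of the total. */
struct limit_cache {
	long max_pages;
	long percent;
	unsigned int recalcs;
};

void recalc_limit(long new_total, void *priv)
{
	struct limit_cache *c = priv;

	c->max_pages = new_total * c->percent / 100;
	c->recalcs++;
}
```

The trade-off is that a listener's cached value can be stale by up to
threshold - 1 pages, which seems acceptable for a "summary" consumer but
not for one that needs exact ranges.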

--
Cheers,

David / dhildenb