Re: [PATCH v2 2/5] mm: zswap: calculate limits only when updated
From: David Hildenbrand
Date: Fri Apr 12 2024 - 15:49:02 EST
On 10.04.24 02:52, Yosry Ahmed wrote:
[..]
Do we need a separate notifier chain for totalram_pages() updates?
Good question. I actually might have the requirement to notify some arch
code (s390x) from virtio-mem when fake adding/removing memory, and
already wondered how to best wire that up.
Maybe we can squeeze that into the existing notifier chain, but needs a
bit of thought.
Sorry for the late reply, I had to think about this a bit.
Do you mean by adding new actions (e.g. MEM_FAKE_ONLINE,
MEM_FAKE_OFFLINE), or by reusing the existing actions (MEM_ONLINE,
MEM_OFFLINE, etc).
At least for virtio-mem, I think we could have a MEM_ONLINE/MEM_OFFLINE
that prepare the whole range belonging to the Linux memory block
(/sys/devices/system/memory/memory...) to go online, and then have
something like MEM_SOFT_ONLINE/MEM_SOFT_OFFLINE or
ENABLE_PAGES/DISABLE_PAGES ... notifications when parts become usable
(!PageOffline, handed to the buddy) or unusable (PageOffline, removed
from the buddy).
There are some details to be figured out, but it could work.
And as virtio-mem currently operates in pageblock granularity (e.g., 2
MiB), but frequently handles multiple contiguous pageblocks within a
Linux memory block, it's not that bad.
But the issue I see with ballooning is that we operate here often on
page granularity. While we could optimize some cases, we might get quite
some overhead from all the notifications. Alternatively, we could send a
list of pages, but it won't win a beauty contest.
I think the main issue is that, for my purpose (virtio-mem on s390x), I
need to notify about the exact memory ranges (so I can reinitialize
stuff in s390x code when memory gets effectively re-enabled). For other
cases (total pages changing), we don't need the memory ranges, but only
the "summary" -- or a notification afterwards that the total pages were
just changed quite a bit.
New actions mean minimal impact to existing notifiers, but it may make
more sense to reuse MEM_ONLINE and MEM_OFFLINE to have generic actions
that mean "memory increased" and "memory decreased".
Likely, we should keep their semantics unchanged. Things like KASAN want
to allocate metadata memory for the whole range, not on some smallish
pieces. It really means "This Linux memory block goes online/offline,
please prepare for that.". And again, memory ballooning with small pages
is a bit problematic.
I suppose we can add new actions and then separately (and probably
incrementally) audit existing notifiers to check if they want to
handle the new actions as well.
Another consideration is that apparently some ballooning drivers also
register notifiers, so we need to make sure there is no possibility of
deadlock/recursion.
Right.
--
Cheers,
David / dhildenb