On Thu, Mar 25, 2021 at 10:17:33AM +0100, Michal Hocko wrote:
Why do you think it is wrong to initialize/account pages when they are
used? Keep in mind that offline pages are not used until they are
onlined. But vmemmap pages are used since the vmemmap is established
which happens in the hotadd stage.
Yes, that is true.
vmemmap pages are used right when we populate the vmemmap space.
plus the fact that I dislike to place those pages in
ZONE_NORMAL, although they are not movable.
But I think the vmemmap pages should lay within the same zone the pages
they describe, doing so simplifies things, and I do not see any outright
downside.
Well, both ways likely have its pros and cons. Nevertheless, if the
vmemmap storage is independent (which is the case for normal hotplug)
then the state is consistent over hotadd, {online, offline} N times,
hotremove cycles. Which is conceptually reasonable as vmemmap doesn't
go away on each offline.
If you are going to bind accounting to the online/offline stages then
the accounting changes each time you go through the cycle and depending
on the onlining type it would travel among zones. I find it quite
confusing as the storage for vmemmap hasn't changed any of its
properties.
That is a good point I guess.
vmemmap pages do not really go away until the memory is unplugged.
But I see some questions to raise:
- As I said, I really dislike it tiding vmemmap memory to ZONE_NORMAL
unconditionally and this might result in the problems David mentioned.
I remember David and I discussed such problems but the problems with
zones not being contiguos have also been discussed in the past and
IIRC, we reached the conclusion that a maximal effort should be made
to keep them that way, otherwise other things suffer e.g: compaction
code.
So if we really want to move the initialization/account to the
hot-add/hot-remove stage, I would really like to be able to set the
proper zone in there (that is, the same zone where the memory will lay).
- When moving the initialization/accounting to hot-add/hot-remove,
the section containing the vmemmap pages will remain offline.
It might get onlined once the pages get online in online_pages(),
or not if vmemmap pages span a whole section.
I remember (but maybe David rmemeber better) that that was a problem
wrt. pfn_to_online_page() and hybernation/kdump.
So, if that is really a problem, we would have to care of ot setting
the section to the right state.
- AFAICS, doing all the above brings us to former times were some
initialization/accounting was done in a previous stage, and I remember
it was pushed hard to move those in online/offline_pages().
Are we ok with that?
As I said, we might have to set the right zone in hot-add stage, as
otherwise problems might come up.
Being that case, would not that also be conflating different concepts
at a wrong phases?
Do not take me wrong, I quite like Michal's idea, and from a
conceptually point of view I guess it is the right thing to do.
But when evualating risks/difficulty, I am not really sure.
If we can pull that off while setting the right zone (and must be seen
what about the section state), and the outcome is not ugly, I am all for
it.
Also a middel-ground might be something like I previously mentioned(having
a helper in memory_block_action() to do the right thing, so
offline/online_pages() do not get pouled.