Re: [PATCH v1] memory-hotplug.rst: complete admin-guide overhaul

From: David Hildenbrand
Date: Mon Jun 07 2021 - 04:09:07 EST


On 07.06.21 09:51, Michal Hocko wrote:
[Sorry, this somehow slipped through the cracks]

On Tue 25-05-21 12:26:04, David Hildenbrand wrote:
The memory hot(un)plug documentation is outdated and incomplete. Most of
the content dates back to 2007, so it's time for a major overhaul.

Let's rewrite, reorganize and update most parts of the documentation. In
addition to memory hot(un)plug, also add some details regarding
ZONE_MOVABLE, with memory hotunplug being one of its main consumers.

The style of the document is also fixed so that, e.g., "restview"
renders it cleanly now.

In the future, we might add some more details about virt users like
virtio-mem, the XEN balloon, the Hyper-V balloon and ppc64 dlpar.

I haven't really checked the diff but rather looked at the final
outcome. I have to say I like it a lot. Some places get a bit too
technical for an admin guide, but they are in the minority (e.g. locking
or the altmap reference). If somebody feels strongly about it, this could
go into its own file, but I wouldn't lose sleep over that.

I just realized that "Locking Internals" is still a duplicate of core-api/memory-hotplug.rst.

I thought this (merge issue) was fixed long ago. I'll just remove it from this file and make sure core-api/memory-hotplug.rst is up-to-date.


I would make one thing slightly more explicit though
diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
index c95f5c2b30dd..5b462aba89cc 100644
--- a/Documentation/admin-guide/mm/memory-hotplug.rst
+++ b/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -568,6 +568,10 @@ Even with ZONE_MOVABLE, there are some corner cases where offlining a memory
Further, when running into out of memory situations while migrating pages, or
when still encountering permanently unmovable pages within ZONE_MOVABLE
(-> BUG), memory offlining will keep retrying until it eventually succeeds.
+The offlining context can be terminated by a fatal signal. A timeout based
+offlining can be easily implemented by
+
+ % timeout $TIMEOUT offline_block | failure_handling
Locking Internals
=================
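
The timeout-based offlining suggested in the hunk above could be sketched as
the shell function below. This is only an illustration under stated
assumptions: the function name offline_block_with_timeout, the 30 second
default and the printed messages are hypothetical and not part of the patch.
It assumes GNU coreutils timeout(1), which exits with status 124 when the
time limit is hit, and the sysfs "state" file of a memory block.

```sh
#!/bin/sh
# Sketch only: offline one memory block, giving up after a time limit.
# Hypothetical helper, not taken from the patch under discussion.
#
#   $1  path to the block's sysfs state file, e.g.
#       /sys/devices/system/memory/memoryXXX/state
#   $2  time limit in seconds (assumed default: 30)
offline_block_with_timeout() {
    obt_state_file=$1
    obt_limit=${2:-30}

    # timeout(1) signals the child shell once the limit expires, which
    # ends the (otherwise endlessly retrying) offlining attempt.
    timeout "$obt_limit" sh -c 'echo offline > "$1"' sh "$obt_state_file"
    obt_rc=$?

    # Minimal failure handling: report what happened, pass the status on.
    case $obt_rc in
        0)   echo "offlined" ;;
        124) echo "timed out" ;;   # GNU timeout's status for an expired limit
        *)   echo "failed" ;;      # e.g. the kernel rejected the request
    esac
    return $obt_rc
}

# Example call (needs root; the block number 42 is made up):
#   offline_block_with_timeout /sys/devices/system/memory/memory42/state 60
```

Returning the exit status lets a caller distinguish a timeout from an
immediate failure and decide whether to retry or give up.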


In the future I would find it helpful to have some examples of failure
cases we have seen so far, e.g. an offlining failure with a dump_page example.


Thanks, I'll pick up your change and see if I can get rid of some low-level stuff. I'll address Mike's review shortly and resend.

--
Thanks,

David / dhildenb