Re: [PATCH v5 00/21] Free some vmemmap pages of hugetlb page
From: Michal Hocko
Date: Fri Nov 20 2020 - 03:42:06 EST
On Fri 20-11-20 14:43:04, Muchun Song wrote:
[...]
Thanks for improving the cover letter and providing some numbers. I have
only glanced through the patchset because I didn't really have more time
to dive deeply into it.
Overall it looks promising. To summarize, I would prefer not to have
the feature enablement controlled by a compile time option, and the kernel
command line option should be opt-in. I also do not like that freeing
the pool can trigger the oom killer or even shut the system down if no
oom victim is eligible.
One thing that I didn't really get to think hard about is what is the
effect of vmemmap manipulation wrt pfn walkers. pfn_to_page can be
invalid when racing with the split. How do we enforce that this won't
blow up?
I have also asked in a previous version whether the vmemmap manipulation
should really be unconditional. Consider e.g. short-lived hugetlb pages
allocated from the buddy allocator directly rather than for a pool. Maybe
it should be restricted to the pool allocation, as those are considered
long term and therefore the overhead will be amortized and the freeing
path restrictions easier to understand.
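
To make the suggestion above concrete, here is a rough userspace sketch of
the policy, not code from the patchset. The enum, the function name and the
feature_enabled flag (standing in for an opt-in boot option) are all
hypothetical and only illustrate the idea of limiting the optimization to
pool pages.

```c
#include <stdbool.h>

/* Hypothetical origin of a hugetlb page; not a type from the patchset. */
enum hpage_origin {
	HPAGE_FROM_POOL,	/* persistent page sitting in the hugetlb pool */
	HPAGE_FROM_BUDDY,	/* short-lived page taken directly from buddy */
};

/*
 * Decide whether the vmemmap of a hugetlb page should be freed.
 * feature_enabled models an opt-in switch that defaults to off.
 * Only long-lived pool pages qualify, so the remap cost is amortized.
 */
static bool should_free_vmemmap(bool feature_enabled, enum hpage_origin origin)
{
	if (!feature_enabled)
		return false;

	return origin == HPAGE_FROM_POOL;
}
```

With such a check, pages taken directly from buddy would never go through
the vmemmap remapping, so only the pool freeing path would have to cope
with the allocation that is needed to restore the vmemmap.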
> Documentation/admin-guide/kernel-parameters.txt | 9 +
> Documentation/admin-guide/mm/hugetlbpage.rst | 3 +
> arch/x86/include/asm/hugetlb.h | 17 +
> arch/x86/include/asm/pgtable_64_types.h | 8 +
> arch/x86/mm/init_64.c | 7 +-
> fs/Kconfig | 14 +
> include/linux/bootmem_info.h | 78 +++
> include/linux/hugetlb.h | 19 +
> include/linux/hugetlb_cgroup.h | 15 +-
> include/linux/memory_hotplug.h | 27 -
> mm/Makefile | 2 +
> mm/bootmem_info.c | 124 ++++
> mm/hugetlb.c | 163 ++++-
> mm/hugetlb_vmemmap.c | 765 ++++++++++++++++++++++++
> mm/hugetlb_vmemmap.h | 103 ++++
I will need to look closer but I suspect that a non-trivial part of the
vmemmap manipulation really belongs to mm/sparse-vmemmap.c because the
split and remapping shouldn't really be hugetlb specific. Sure, hugetlb
knows how to split but all the splitting should be implemented in
vmemmap proper.
> mm/memory_hotplug.c | 116 ----
> mm/sparse.c | 5 +-
> 17 files changed, 1295 insertions(+), 180 deletions(-)
> create mode 100644 include/linux/bootmem_info.h
> create mode 100644 mm/bootmem_info.c
> create mode 100644 mm/hugetlb_vmemmap.c
> create mode 100644 mm/hugetlb_vmemmap.h
Thanks!
--
Michal Hocko
SUSE Labs