Re: [PATCH v4] mm/hugetlb: add mempolicy check in the reservation routine

From: Baoquan He
Date: Wed Jul 29 2020 - 06:34:12 EST


On 07/28/20 at 09:46am, Mike Kravetz wrote:
> On 7/28/20 6:24 AM, Baoquan He wrote:
> > Hi Muchun,
> >
> > On 07/28/20 at 11:49am, Muchun Song wrote:
> >> In the reservation routine, we only check whether the cpuset meets
> >> the memory allocation requirements. But we ignore the mempolicy of
> >> MPOL_BIND case. An mmap of hugetlb memory can succeed, but the
> >> subsequent memory allocation may then fail due to mempolicy restrictions,
> >> and the process receives SIGBUS. This can be reproduced by the following steps.
> >>
> >> 1) Compile the test case.
> >> cd tools/testing/selftests/vm/
> >> gcc map_hugetlb.c -o map_hugetlb
> >>
> >> 2) Pre-allocate huge pages. Suppose there are 2 numa nodes in the
> >> system. Each node will pre-allocate one huge page.
> >> echo 2 > /proc/sys/vm/nr_hugepages
> >>
> >> 3) Run the test case (mmap 4MB). We receive the SIGBUS signal.
> >> numactl --membind=0 ./map_hugetlb 4
> >
> > I think supporting the mempolicy of MPOL_BIND case is a good idea.
> > I am wondering about the other mempolicy cases, e.g. MPOL_INTERLEAVE,
> > MPOL_PREFERRED. I ask because we already have similar handling for
> > writes to nr_hugepages_mempolicy through sysfs and proc. Please see
> > __nr_hugepages_store_common() for details.
>
> There is a high level difference in the function of this code and the code
> called by the sysfs and proc interfaces. This patch is dealing with reserving
> huge pages in the pool for later use. The sysfs and proc interfaces are
> allocating huge pages to be added to the pool.
>
> Using mempolicy to decide how to allocate huge pages is pretty
> straightforward. Using mempolicy to reserve pages is almost impossible to get
> correct. The comment at the beginning of hugetlb_acct_memory() and modified
> by this patch summarizes the issues.
>
> IMO, at this time it makes little sense to perform checks for more than
> MPOL_BIND at reservation time. If we ever take on the monumental task of
> supporting mempolicy directed per-node reservations throughout the life of
> a process, support for other policies will need to be taken into account.

I haven't yet clearly understood the difficulty of using mempolicy for
reservations. I will read more of the code and digest your explanation.
Thanks a lot for these details.
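
To make the reproducer above concrete, here is a toy model of the
reservation-time check being discussed. It is only a sketch and not the
patch's implementation: struct toy_pool, toy_allowed_pages() and
toy_can_reserve() are hypothetical names, node masks are plain bitmasks,
and it assumes 2MB huge pages so that the 4MB mmap needs two pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_NODES 64	/* hypothetical limit for this sketch */

/* Toy stand-in for per-node free huge page counts; not a kernel structure. */
struct toy_pool {
	size_t nr_nodes;
	unsigned long free_huge_pages[TOY_MAX_NODES];
};

/* Sum the free huge pages on every node set in allowed_mask (bit n = node n). */
static unsigned long toy_allowed_pages(const struct toy_pool *pool,
				       unsigned long long allowed_mask)
{
	unsigned long total = 0;

	assert(pool->nr_nodes <= TOY_MAX_NODES);
	for (size_t node = 0; node < pool->nr_nodes; node++)
		if (allowed_mask & (1ULL << node))
			total += pool->free_huge_pages[node];
	return total;
}

/*
 * Return true if 'wanted' huge pages are available on the nodes permitted
 * by both the cpuset mask and the (assumed) MPOL_BIND node mask.
 */
static bool toy_can_reserve(const struct toy_pool *pool,
			    unsigned long long cpuset_mask,
			    unsigned long long bind_mask,
			    unsigned long wanted)
{
	return toy_allowed_pages(pool, cpuset_mask & bind_mask) >= wanted;
}
```

With one free huge page on each of two nodes, checking only the cpuset
mask (both nodes) accepts a two-page request, while intersecting it with
a bind mask for node 0 rejects that request at reservation time instead
of letting the later fault end in SIGBUS. The model deliberately ignores
that the policy can change after mmap, which seems to be part of what
makes real per-node reservations hard.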

Thanks
Baoquan