Re: [PATCH v4] mm/hugetlb: Fix unsigned overflow in __nr_hugepages_store_common()

From: Mike Kravetz
Date: Tue Feb 26 2019 - 19:03:47 EST


On 2/26/19 2:36 PM, Andrew Morton wrote:
>> ...
>>
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -2274,7 +2274,7 @@ static int adjust_pool_surplus(struct hstate *h,
>> nodemask_t *nodes_allowed,
>
> Please tweak that email client to prevent the wordwraps.

Sorry about that.

>> + /*
>> + * Check for a node specific request. Adjust global count, but
>> + * restrict alloc/free to the specified node.
>> + */

Better comment might be:

/*
* Check for a node specific request.
* Changing node specific huge page count may require a corresponding
* change to the global count. In any case, the passed node mask
* (nodes_allowed) will restrict alloc/free to the specified node.
*/

>> + if (nid != NUMA_NO_NODE) {
>> + unsigned long old_count = count;
>> + count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
>> + /*
>> + * If user specified count causes overflow, set to
>> + * largest possible value.
>> + */

Updated comment:
/*
* User may have specified a large count value which caused the
* above calculation to overflow. In this case, they wanted
* to allocate as many huge pages as possible. Set count to
* largest possible value to align with their intention.
*/

>> + if (count < old_count)
>> + count = ULONG_MAX;
>> + }
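As a standalone illustration (not the kernel code itself, just the same
wrap-around pattern pulled out so it can be tested in isolation): since the
count is unsigned, an overflowing add produces a result smaller than the
starting value, which is what the check exploits.

```c
#include <assert.h>
#include <limits.h>

/*
 * Standalone illustration of the clamp-on-overflow pattern above.
 * If adding the adjustment wraps the unsigned value, the result is
 * smaller than the starting value, so clamp to ULONG_MAX, i.e.
 * "allocate as many as possible".
 */
static unsigned long adjust_count(unsigned long count, unsigned long adj)
{
	unsigned long old_count = count;

	count += adj;
	if (count < old_count)		/* unsigned wrap-around detected */
		count = ULONG_MAX;
	return count;
}
```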
>
> The above two comments explain the code, but do not reveal the
> reasoning behind the policy decisions which that code implements.
>
>> ...
>>
>> + } else {
>> /*
>> - * per node hstate attribute: adjust count to global,
>> - * but restrict alloc/free to the specified node.
>> + * Node specific request, but we could not allocate
>> + * node mask. Pass in ALL nodes, and clear nid.
>> */
>
> Ditto here, somewhat.

I was just going to update the comments and send you a new patch, but
your comment got me thinking about this situation. I did not really
change the way this code operates. As a reminder, the original code looks like:

NODEMASK_ALLOC(nodemask_t, nodes_allowed, GFP_KERNEL | __GFP_NORETRY);

if (nid == NUMA_NO_NODE) {
/* do something */
} else if (nodes_allowed) {
/* do something else */
} else {
nodes_allowed = &node_states[N_MEMORY];
}


So, the only way we get to that final else is if we cannot allocate
a node mask (a kmalloc of a few words). Right? I wonder why we should
even try to continue in this case. Why not just return right there?

The specified count value is either a request to increase number of
huge pages or decrease. If we can't allocate a few words, we certainly
are not going to find memory to create huge pages. There 'might' be
surplus pages which can be converted to permanent pages. But remember
this is a 'node specific' request and we can't allocate a mask to pass
down to the conversion routines. So, chances are good we would operate
on the wrong node. The same goes for a request to 'free' huge pages.
Since we can't allocate a node mask, we are likely to free them from
the wrong node.

Unless my reasoning above is incorrect, I think that final else block
in __nr_hugepages_store_common() is wrong.
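To make the suggestion concrete, an untested sketch of what I have in
mind for that else block (whether -ENOMEM is the right error to report
is part of the question):

	} else {
		/*
		 * Node specific request, but we could not allocate
		 * a node mask. Do not fall back to operating on all
		 * nodes; fail the request instead.
		 */
		return -ENOMEM;
	}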

Any additional thoughts?
--
Mike Kravetz