Re: [PATCH 14/16] mm/migration: fix potential invalid node access for reclaim-based migration

From: Miaohe Lin
Date: Tue Mar 08 2022 - 06:47:31 EST


On 2022/3/7 15:04, Baolin Wang wrote:
>
>
> On 3/7/2022 1:14 PM, Huang, Ying wrote:
>> Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:
>>
>>> On 3/4/2022 5:34 PM, Miaohe Lin wrote:
>>>> If we failed to setup hotplug state callbacks for mm/demotion:online in
>>>> some corner cases, node_demotion will be left uninitialized. Invalid node
>>>> might be returned from the next_demotion_node() when doing reclaim-based
>>>> migration. Use kcalloc to allocate node_demotion to fix the issue.
>>>> Fixes: ac16ec835314 ("mm: migrate: support multiple target nodes
>>>> demotion")
>>>> Signed-off-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
>>>> ---
>>>>    mm/migrate.c | 6 +++---
>>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>> index 279940c0c064..7b1c0b988234 100644
>>>> --- a/mm/migrate.c
>>>> +++ b/mm/migrate.c
>>>> @@ -2516,9 +2516,9 @@ static int __init migrate_on_reclaim_init(void)
>>>>    {
>>>>        int ret;
>>>>    -    node_demotion = kmalloc_array(nr_node_ids,
>>>> -                      sizeof(struct demotion_nodes),
>>>> -                      GFP_KERNEL);
>>>> +    node_demotion = kcalloc(nr_node_ids,
>>>> +                sizeof(struct demotion_nodes),
>>>> +                GFP_KERNEL);
>>>
>>> Nit: not sure if this is worthy of this rare corner case, but I think
>>> the target demotion nodes' default value should be NUMA_NO_NODE
>>> instead of 0.
>>
>> The "nr" field of "struct demotion_nodes" should be initialized as 0.  I
>> think that is checked before "nodes[]" field.
>
> Right, but it will be confusing that if nr = 0, while the nodes[] still contains valid node id 0. While we are at this, why not initialize the node_demotion structure with a clear default value? Anyway, no strong opinion on this :)

IMO, this might not deserve initializing the node_demotion structure with a clear default value as
cpuhp_setup_state fails at init time should be a rare case.

Thanks both of you.

> .