Re: [Bugfix] sched: fix possible invalid memory access caused by CPU hot-addition

From: Jiang Liu
Date: Wed Apr 23 2014 - 23:00:18 EST


On 2014/4/24 1:46, Luck, Tony wrote:
>>>> 1) Handle CPU hot-addition event
>>>> 1.a) gather platform specific information
>>>> 1.b) associate hot-added CPU with a node
>>>> 1.c) create CPU device
>>>> 2) User online hot-added CPUs through sysfs:
>>>> 2.a) cpu_up()
>>>> 2.b) ->try_online_node()
>>>> 2.c) ->hotadd_new_pgdat()
>>>> 2.d) ->node_set_online()
>>>>
>>>> So between 1.b and 2.c, kmalloc_node(nid) may cause invalid
>>>> memory access without the node_online(nid) check.
>>>
>>> And why was all this not in the Changelog?
>>
>> Also, do explain what kind of hardware you needed to trigger this. This
>> code has been like this for a good while.
>
> With your proposed fix in place the allocations will succeed - but they
> will be done from other nodes ... and this cpu will have to do a remote
> NUMA access for the rest of time.
>
> It would be better to switch the order above - add the memory first,
> then add the cpus. Is that possible?

Hi Tony,
The BIOS always sends CPU hot-addition events before memory
hot-addition events, so it's hard to change the order.
We also can't completely eliminate this performance penalty, because
the affected code allocates memory for all possible CPUs rather than
only for online CPUs.
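
To make the window between 1.b and 2.c concrete, here is a minimal
user-space sketch. It is not the patch itself and not kernel code: every
model_* name, the CPU-to-node table, and the "fall back to no node
preference" behaviour are assumptions for illustration only. It shows a
CPU that already has a node id while that node is still offline, and
picks a fallback instead of using the offline node.

```c
/* User-space model only; all model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_NODES 4
#define MODEL_NR_CPUS   4
#define MODEL_NO_NODE   (-1)    /* stands in for "no node preference" */

/* Nodes that have been brought online (step 2.d in the sequence). */
static bool model_node_online[MODEL_MAX_NODES];

/* Node each CPU was associated with at hot-add time (step 1.b). */
static const int model_cpu_to_node[MODEL_NR_CPUS] = { 0, 0, 1, 1 };

/*
 * Choose the node to allocate from on behalf of a CPU. If the CPU's
 * node is not online yet, return MODEL_NO_NODE so the caller allocates
 * from some other node rather than touching data for an offline node.
 */
static int model_alloc_node_for_cpu(int cpu)
{
	int nid;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	nid = model_cpu_to_node[cpu];

	if (nid < 0 || nid >= MODEL_MAX_NODES || !model_node_online[nid])
		return MODEL_NO_NODE;

	return nid;
}
```

In this model, a CPU on an offline node gets MODEL_NO_NODE, so its
memory comes from another node. That is the remote-access cost Tony
describes: the allocation succeeds, but it is not local to the CPU.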

Best Regards!
Gerry