Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
From: Dietmar Eggemann
Date: Mon Sep 23 2019 - 11:43:19 EST
On 9/15/19 4:33 PM, Valentin Schneider wrote:
> On 15/09/2019 09:21, shikemeng wrote:
>>> It's more thoughtful to add a check in cpumask_test_cpu. It can solve this problem and prevent other potential bugs. I will test it and resend
>>> a new patch.
>>>
>>
>> I thought about this again. As cpumask_check will fire a warning if cpu >= nr_cpu_ids, it seems that cpumask_check only expects cpu < nr_cpu_ids and it's
>> the caller's responsibility to verify that cpu is in the valid range. Interfaces like cpumask_test_and_set_cpu, cpumask_test_and_clear_cpu and so on are not checking
>> cpu < nr_cpu_ids either and may cause damage if cpu is out of range.
>>
>
> cpumask operations clearly should never be fed CPU numbers >= nr_cpu_ids,
> but we can get some sneaky mishaps like the one you're fixing. The answer
> might just be to have more folks turn on DEBUG_PER_CPU_MAPS in their test
> runs (I don't for instance - will do from now on), since I get the feeling
> people like to be able to disable these checks for production kernels.
>
> In any case, don't feel like you have to fix this globally - your fix is
> fine on its own.
I'm not sure that CONFIG_DEBUG_PER_CPU_MAPS=y will help you here.
__set_cpus_allowed_ptr(...)
{
	...
	dest_cpu = cpumask_any_and(...)
	...
}
With:
#define cpumask_any_and(mask1, mask2) cpumask_first_and((mask1), (mask2))
#define cpumask_first_and(src1p, src2p) cpumask_next_and(-1, (src1p), (src2p))
cpumask_next_and() is called with n = -1 and in this case does not
invoke cpumask_check().
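
To make the point concrete, here is a rough user-space model of that path. It is only a sketch, not the kernel code: the model_* names, the 64-bit mask type and the two constants are made up for illustration, and the warning is replaced by a counter. It shows that the debug check only ever looks at the input n, so an out-of-range return value passes through unnoticed.

```c
#include <stdint.h>

/* Illustrative sizes only (assumptions, not real kernel values). */
enum { MODEL_NR_CPUMASK_BITS = 64, MODEL_NR_CPU_IDS = 4 };

/* Counts how often the modelled debug check would have warned. */
static int model_check_warnings;

/* Stand-in for cpumask_check() with the debug option on:
 * complain when cpu >= nr_cpu_ids. */
static void model_cpumask_check(unsigned int cpu)
{
	if (cpu >= MODEL_NR_CPU_IDS)
		model_check_warnings++;
}

/* Stand-in for cpumask_next_and(): next bit after n that is set in both
 * masks, or MODEL_NR_CPUMASK_BITS if there is none. Assumes n >= -1. */
static int model_cpumask_next_and(int n, uint64_t src1, uint64_t src2)
{
	uint64_t both = src1 & src2;
	int cpu;

	/* -1 means "start from the beginning", so the check is skipped. */
	if (n != -1)
		model_cpumask_check((unsigned int)n);

	for (cpu = n + 1; cpu < MODEL_NR_CPUMASK_BITS; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;

	/* No common bit: the result is out of range, and nothing checks it. */
	return MODEL_NR_CPUMASK_BITS;
}

/* Stand-in for cpumask_any_and(): always enters with n = -1. */
static int model_cpumask_any_and(uint64_t mask1, uint64_t mask2)
{
	return model_cpumask_next_and(-1, mask1, mask2);
}
```

In this model, model_cpumask_any_and(0x1, 0x2) has no common bit and returns MODEL_NR_CPUMASK_BITS while model_check_warnings stays at 0, so the caller still has to compare the result against nr_cpu_ids itself.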
---
BTW, I can recreate the issue quite easily with:
qemu-system-x86_64 ... -smp cores=64 ... -enable-kvm
with the default kernel config.