Reply: [External Mail]Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()

From: David Wang 王标
Date: Wed Nov 30 2022 - 06:52:17 EST


Dear Will, Longman,

Could we first fix the issue we ran into? We can analyze the other issues later.

Thanks


-----Original Message-----
From: Waiman Long <longman@xxxxxxxxxx>
Sent: November 30, 2022 0:04
To: Will Deacon <will@xxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>; Juri Lelli <juri.lelli@xxxxxxxxxx>; Vincent Guittot <vincent.guittot@xxxxxxxxxx>; Dietmar Eggemann <dietmar.eggemann@xxxxxxx>; Steven Rostedt <rostedt@xxxxxxxxxxx>; Ben Segall <bsegall@xxxxxxxxxx>; Mel Gorman <mgorman@xxxxxxx>; Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>; Phil Auld <pauld@xxxxxxxxxx>; Wenjie Li <wenjieli@xxxxxxxxxxxxxxxx>; David Wang 王标 <wangbiao3@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
Subject: [External Mail]Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()


On 11/29/22 10:57, Will Deacon wrote:
> On Tue, Nov 29, 2022 at 10:32:49AM -0500, Waiman Long wrote:
>> On 11/29/22 09:07, Will Deacon wrote:
>>> On Mon, Nov 28, 2022 at 10:11:52AM -0500, Waiman Long wrote:
>>>> On 11/28/22 07:00, Will Deacon wrote:
>>>>> On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:
>>>>>> On 11/24/22 21:39, Waiman Long wrote:
>>>>>>> In general, a non-null user_cpus_ptr will remain set until the task dies.
>>>>>>> A possible exception to this is the fact that
>>>>>>> do_set_cpus_allowed() will clear a non-null user_cpus_ptr. To
>>>>>>> handle this possible race condition, we need to check for NULL
>>>>>>> user_cpus_ptr under the pi_lock before duping the user mask.
>>>>>>>
>>>>>>> Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in
>>>>>>> do_set_cpus_allowed()")
>>>>>>> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
>>>>>> This is actually a pre-existing use-after-free bug since commit
>>>>>> 07ec77a1d4e8
>>>>>> ("sched: Allow task CPU affinity to be restricted on asymmetric systems").
>>>>>> So it needs to be fixed in the stable release as well. Will
>>>>>> resend the patch with an additional fixes tag and updated commit log.
>>>>> Please can you elaborate on the use-after-free here? Looking at
>>>>> 07ec77a1d4e8, the mask is only freed in free_task() when the usage
>>>>> refcount has dropped to zero and I can't see how that can race with fork().
>>>>>
>>>>> What am I missing?
>>>> I missed that at first. The current task cloning process copies the
>>>> content of the task structure over to the newly cloned/forked task.
>>>> IOW, if user_cpus_ptr had been set up previously, it will be copied
>>>> over to the cloned task. Now if user_cpus_ptr of the source task is
>>>> cleared right after that and before dup_user_cpus_ptr() is called,
>>>> the obsolete user_cpus_ptr value in the cloned task will remain and get used even if it has been freed.
>>>> That is what I call use-after-free and double-free.
>>> If the parent task can be modified concurrently with
>>> dup_task_struct() then surely we'd have bigger issues because that's
>>> not going to be atomic? At the very least we'd have a data race, but
>>> it also feels like we could end up with inconsistent task state in
>>> the child. In fact, couldn't the normal 'cpus_mask' be corrupted by a concurrent set_cpus_allowed_common()?
>>>
>>> Or am I still failing to understand the race?
>>>
>> A major difference between cpus_mask and user_cpus_ptr is that for
>> cpus_mask, the bitmap is embedded into task_struct whereas
>> user_cpus_ptr is a pointer to an external bitmap. So there is no
>> issue of use-after-free wrt cpus_mask. That is not true of user_cpus_ptr:
>> the memory it points to in the parent task can be freed while a reference
>> to that memory is still available in the child's task struct and may be used.
> Sure, I'm not saying there's a UAF on cpus_mask, but I'm concerned
> that we could corrupt the data and end up with an affinity mask that
> doesn't correspond to anything meaningful. Do you agree that's possible?
That is certainly possible. So we have to be careful about it.
>
>> Note that the problematic concurrency is not between the copying of
>> task struct and changing of the task struct. It is what will happen
>> after the task struct copying has already been done with an extra
>> reference present in the child's task struct.
> Well, sort of, but the child only has the extra reference _because_
> the parent pointer was concurrently cleared to NULL, otherwise
> dup_user_cpus_ptr() would have allocated a new copy and we'd be ok, no?
Yes, that is exactly where the problem is and this is what my patch is trying to fix.
>
> Overall, I'm just very wary that we seem to be saying that
> copy_process() can run concurrently with changes to the parent. Maybe
> it's all been written with that in mind (including all the arch
> callbacks), but I'd be astonished if this is the only problem in there.

It seems that, at least in some cases, a task's user_cpus_ptr can be cleared by another task. So the parent may be unaware of it, and it is not the parent's fault.
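
To make the interleaving discussed above concrete, here is a minimal userspace sketch. It is not the kernel code and not the patch itself: struct task, copy_task(), clear_user_mask(), dup_user_mask() and replay_fork() are hypothetical stand-ins for task_struct, dup_task_struct(), the clearing done by do_set_cpus_allowed(), and dup_user_cpus_ptr(), and a small atomic_flag spinlock plays the role of pi_lock. It assumes the fix amounts to dropping the inherited pointer in the child and re-checking the source pointer under the lock before copying.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

#define MASK_BYTES 16                /* assumed mask size, illustration only */

/* Hypothetical stand-in for task_struct. */
struct task {
	atomic_flag lock;            /* plays the role of pi_lock */
	unsigned char *user_mask;    /* plays the role of user_cpus_ptr */
};

static void mask_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;                    /* spin until the flag is released */
}

static void mask_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Models the structure copy at fork time: the child inherits the
 * parent's pointer value, not a private copy of the mask. */
static void copy_task(struct task *dst, const struct task *src)
{
	atomic_flag_clear(&dst->lock);
	dst->user_mask = src->user_mask;
}

/* Models another task clearing (and freeing) the parent's mask. */
static void clear_user_mask(struct task *t)
{
	unsigned char *old;

	mask_lock(&t->lock);
	old = t->user_mask;
	t->user_mask = NULL;
	mask_unlock(&t->lock);
	free(old);
}

/* Returns 0 on success, -1 on allocation failure. */
static int dup_user_mask(struct task *dst, struct task *src)
{
	unsigned char *new_mask;

	dst->user_mask = NULL;           /* drop the possibly stale alias */

	new_mask = malloc(MASK_BYTES);   /* allocate outside the lock */
	if (!new_mask)
		return -1;

	mask_lock(&src->lock);
	if (src->user_mask) {            /* re-check under the lock */
		memcpy(new_mask, src->user_mask, MASK_BYTES);
		dst->user_mask = new_mask;
		new_mask = NULL;         /* ownership moved to dst */
	}
	mask_unlock(&src->lock);

	free(new_mask);                  /* no-op if dst took the buffer */
	return 0;
}

/* Replays copy -> (optional clear) -> dup. Returns 1 if the child ends
 * up with a private mask exactly when the parent still has one. */
int replay_fork(int clear_in_between)
{
	struct task parent = { .lock = ATOMIC_FLAG_INIT, .user_mask = NULL };
	struct task child;
	int ok;

	parent.user_mask = calloc(1, MASK_BYTES);
	if (!parent.user_mask)
		return 0;
	parent.user_mask[0] = 0x0f;

	copy_task(&child, &parent);
	assert(child.user_mask == parent.user_mask);   /* shared alias */

	if (clear_in_between)
		clear_user_mask(&parent);    /* child's alias is now stale */

	ok = dup_user_mask(&child, &parent) == 0;
	if (clear_in_between)
		ok = ok && child.user_mask == NULL;
	else
		ok = ok && child.user_mask != NULL &&
		     child.user_mask != parent.user_mask &&
		     child.user_mask[0] == 0x0f;

	free(child.user_mask);
	free(parent.user_mask);
	return ok;
}
```

replay_fork(1) replays the problematic ordering (copy, clear by another task, then dup) and the child ends up with no mask instead of a pointer to freed memory. replay_fork(0) is the ordinary case, where the child gets its own copy of the mask.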

Cheers,
Longman
