On Tue, Nov 29, 2022 at 10:32:49AM -0500, Waiman Long wrote:That is certainly possible. So we have to be careful about it.
On 11/29/22 09:07, Will Deacon wrote:Sure, I'm not saying there's a UAF on cpus_mask, but I'm concerned that we
On Mon, Nov 28, 2022 at 10:11:52AM -0500, Waiman Long wrote:A major difference between cpus_mask and user_cpus_ptr is that for
On 11/28/22 07:00, Will Deacon wrote:If the parent task can be modified concurrently with dup_task_struct() then
On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:I missed that at first. The current task cloning process copies the content
On 11/24/22 21:39, Waiman Long wrote:Please can you elaborate on the use-after-free here? Looking at
In general, a non-null user_cpus_ptr will remain set until the task dies.This is actually a pre-existing use-after-free bug since commit 07ec77a1d4e8
A possible exception to this is the fact that do_set_cpus_allowed()
will clear a non-null user_cpus_ptr. To allow this possible racing
condition, we need to check for NULL user_cpus_ptr under the pi_lock
before duping the user mask.
Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
("sched: Allow task CPU affinity to be restricted on asymmetric systems").
So it needs to be fixed in the stable release as well. Will resend the patch
with an additional fixes tag and updated commit log.
07ec77a1d4e8, the mask is only freed in free_task() when the usage refcount
has dropped to zero and I can't see how that can race with fork().
What am I missing?
of the task structure over to the newly cloned/forked task. IOW, if
user_cpus_ptr had been set up previously, it will be copied over to the
cloned task. Now if user_cpus_ptr of the source task is cleared right after
that and before dup_user_cpus_ptr() is called. The obsolete user_cpus_ptr
value in the cloned task will remain and get used even if it has been freed.
That is what I call as use-after-free and double-free.
surely we'd have bigger issues because that's not going to be atomic? At the
very least we'd have a data race, but it also feels like we could end up
with inconsistent task state in the child. In fact, couldn't the normal
'cpus_mask' be corrupted by a concurrent set_cpus_allowed_common()?
Or am I still failing to understand the race?
cpus_mask, the bitmap is embedded into task_struct whereas user_cpus_ptr is
a pointer to an external bitmap. So there is no issue of use-after-free wrt
cpus_mask. That is not the case where the memory of the user_cpus_ptr of the
parent task is freed, but then a reference to that memory is still available
in the child's task struct and may be used.
could corrupt the data and end up with an affinity mask that doesn't correspond
to anything meaningful. Do you agree that's possible?
Yes, that is exactly where the problem is and this is what my patch is trying to fix.
Note that the problematic concurrence is not between the copying of taskWell, sort of, but the child only has the extra reference _because_ the parent
struct and changing of the task struct. It is what will happen after the
task struct copying has already been done with an extra reference present in
the child's task struct.
pointer was concurrently cleared to NULL, otherwise dup_user_cpus_ptr() would
have allocated a new copy and we'd be ok, no?
Overall, I'm just very wary that we seem to be saying that copy_process()
can run concurrently with changes to the parent. Maybe it's all been written
with that in mindi (including all the arch callbacks), but I'd be astonished
if this is the only problem in there.