Re: [PATCH v6] taskstats: fix data-race

From: Andrea Parri
Date: Wed Oct 23 2019 - 08:16:14 EST


On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
> When assiging and testing taskstats in taskstats_exit() there's a race
> when writing and reading sig->stats when a thread-group with more than
> one thread exits:
>
> cpu0:
> thread catches fatal signal and whole thread-group gets taken down
> do_exit()
> do_group_exit()
> taskstats_exit()
> taskstats_tgid_alloc()
> The tasks reads sig->stats without holding sighand lock.
>
> cpu1:
> task calls exit_group()
> do_exit()
> do_group_exit()
> taskstats_exit()
> taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
>
> The first approach used smp_load_acquire() and smp_store_release().
> However, after having discussed this it seems that the data dependency
> for kmem_cache_alloc() would be fixed by WRITE_ONCE().
> Furthermore, the smp_load_acquire() would only manage to order the stats
> check before the thread_group_empty() check. So it seems just using
> READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this
> up for discussion at least.

Mmh, the RELEASE was intended to order the memory initialization in
kmem_cache_zalloc() with the later ->stats pointer assignment; AFAICT,
there is no data dependency between such memory accesses.

Correspondingly, the ACQUIRE was intended to order the ->stats pointer
load with later, _independent dereferences of the same pointer; the
latter are, e.g., in taskstats_exit() (but not thread_group_empty()).

Looking again, I see that fill_tgid_exit()'s dereferences of ->stats
are protected by ->siglock: maybe you meant to rely on such a critical
section pairing with the critical section in taskstats_tgid_alloc()?

That memcpy(-, tsk->signal->stats, -) at the end of taskstats_exit()
also bugs me: could these dereferences of ->stats happen concurrently
with other stores to the same memory locations?

Thanks,
Andrea