Re: [syzbot] [net?] WARNING: refcount bug in inet_twsk_kill
From: Jason Xing
Date: Mon Aug 12 2024 - 11:51:41 EST
On Mon, Aug 12, 2024 at 11:03 PM Florian Westphal <fw@xxxxxxxxx> wrote:
>
> Jason Xing <kerneljasonxing@xxxxxxxxx> wrote:
> > > I don't see how this helps, we need to wait until 'stolen' twsk
> > > has gone through inet_twsk_kill() and decremented tw_refcount.
> > > Obviously It would be a bit simpler if we had a reliable reproducer :-)
> >
> > Allow me to say something irrelevant to this bug report.
> >
> > Do you think that Kuniyuki's patch can solve the race between two
> > 'killers' calling inet_twsk_deschedule_put()->inet_twsk_kill()
> > concurrently at two cores, say, inet_twsk_purge() and tcp_abort()?
>
> I don't think its possible, tcp_abort() calls inet_twsk_deschedule_put,
> which does:
>
> if (timer_shutdown_sync(&tw->tw_timer))
> inet_twsk_kill(tw);
>
> So I don't see how two concurrent callers, working on same tw address,
> would both be able to shut down the timer.
>
> One will shut it down and calls inet_twsk_kill(), other will wait until
> the callback has completed, but it doesn't call inet_twsk_kill().
Oh, thanks. Since timer_shutdown_sync() can be used as a lock, there
is indeed no way to call inet_twsk_kill() concurrently.
>
> > It at least does help avoid decrementing tw_refcount twice in the
> > above case if I understand correctly.
>
> I don't think the refcount is decremented twice.
>
> Problem is one thread is already at the 'final' decrement of
> WARN_ON_ONCE(!refcount_dec_and_test(&net->ipv4.tcp_death_row.tw_refcount));
>
> in tcp_sk_exit_batch(), while other thread has not yet called
> refcount_dec() on it (inet_twsk_kill still executing).
>
> So we get two splats, refcount_dec_and_test() returns 1 not expected 0
> and refcount_dec() coming right afterwards from other task observes the
> transition to 0, while it should have dropped down to 1.
I see.
Thanks,
Jason