Re: net: use-after-free in tw_timer_handler

From: Dmitry Vyukov
Date: Tue Jan 24 2017 - 10:06:48 EST


On Tue, Jan 24, 2017 at 3:28 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> On Mon, 2017-01-23 at 11:23 +0100, Dmitry Vyukov wrote:
>> On Mon, Jan 23, 2017 at 11:19 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>> > Hello,
>> >
>> > While running syzkaller fuzzer I started seeing use-after-frees in
>> > tw_timer_handler. It happens with very low frequency, so far I've seen
>> > 22 of them. But all reports look consistent, so I would assume that it
>> > is real, just requires a very tricky race to happen. I've started
>> > seeing it around Jan 17, however I did not update kernels for some
>> > time before that so potentially the issue was introduced somewhat
>> > earlier. Or maybe the fuzzer just figured out how to trigger it, and the bug
>> > is actually old. I am seeing it on all of torvalds/linux-next/mmotm,
>> > some commits if it matters: 7a308bb3016f57e5be11a677d15b821536419d36,
>> > 5cf7a0f3442b2312326c39f571d637669a478235,
>> > c497f8d17246720afe680ea1a8fa6e48e75af852.
>> > The majority of reports point to net_drop_ns as the offending free, but
>> > it may be a red herring. Since the access happens in a timer, it can
>> > happen long after the free and the memory could have been reused. I've
>> > also seen a few where the access in tw_timer_handler is reported as
>> > out-of-bounds on task_struct and on struct filename.
>>
>>
>>
>> I've briefly skimmed through the code. Assuming that it requires a
>> very tricky race to be triggered, the most suspicious part looks to be
>> inet_twsk_deschedule_put vs __inet_twsk_schedule:
>>
>> void inet_twsk_deschedule_put(struct inet_timewait_sock *tw)
>> {
>> 	if (del_timer_sync(&tw->tw_timer))
>> 		inet_twsk_kill(tw);
>> 	inet_twsk_put(tw);
>> }
>>
>> void __inet_twsk_schedule(struct inet_timewait_sock *tw, int timeo, bool rearm)
>> {
>> 	tw->tw_kill = timeo <= 4*HZ;
>> 	if (!rearm) {
>> 		BUG_ON(mod_timer(&tw->tw_timer, jiffies + timeo));
>> 		atomic_inc(&tw->tw_dr->tw_count);
>> 	} else {
>> 		mod_timer_pending(&tw->tw_timer, jiffies + timeo);
>> 	}
>> }
>>
>> Can't it somehow end up rearming an already deleted timer? Or maybe the
>> first mod_timer happens after del_timer_sync?
>
> This code was changed a long time ago :
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ed2e923945892a8372ab70d2f61d364b0b6d9054
>
> So I suspect a recent patch broke the logic.
>
> You might start a bisection :
>
> I would check if 4.7 and 4.8 trigger the issue you noticed.


It happens at too low a rate for bisecting (a few times per day). I
could add some additional checks to the code, but I don't know what
checks would be useful.
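
To make the two orderings asked about above concrete, here is a small
userspace sketch. It is not kernel code and proves nothing about the
real bug. The struct and all model_* names are hypothetical. The only
things taken from the kernel are the documented return conventions of
mod_timer (1 if the timer was already pending), mod_timer_pending (never
re-arms an inactive timer) and del_timer_sync (1 only if it deactivated
a pending timer). Refcounting and tw_count are left out on purpose.

```c
#include <stdbool.h>

/* Hypothetical userspace model, not kernel code. */
struct model_tw {
	bool timer_pending;	/* timer is queued and will eventually fire */
	bool killed;		/* stands in for inet_twsk_kill() having run */
};

/* Like mod_timer(): always arms; returns 1 if the timer was already pending. */
int model_mod_timer(struct model_tw *tw)
{
	int was_pending = tw->timer_pending;

	tw->timer_pending = true;
	return was_pending;
}

/* Like mod_timer_pending(): touches only a pending timer, never re-arms. */
int model_mod_timer_pending(struct model_tw *tw)
{
	return tw->timer_pending;
}

/* Like del_timer_sync(): returns 1 only if it deactivated a pending timer. */
int model_del_timer_sync(struct model_tw *tw)
{
	int was_pending = tw->timer_pending;

	tw->timer_pending = false;
	return was_pending;
}

/* Shape of __inet_twsk_schedule(); returns the value BUG_ON() would test. */
int model_schedule(struct model_tw *tw, bool rearm)
{
	if (!rearm)
		return model_mod_timer(tw);
	model_mod_timer_pending(tw);
	return 0;
}

/* Shape of inet_twsk_deschedule_put(), minus the refcounting. */
void model_deschedule(struct model_tw *tw)
{
	if (model_del_timer_sync(tw))
		tw->killed = true;
}

/* Expected order: first arm, then deschedule. */
struct model_tw model_in_order(void)
{
	struct model_tw tw = { false, false };

	model_schedule(&tw, false);
	model_deschedule(&tw);
	return tw;
}

/* A rearm attempt after the timer was already deleted. */
struct model_tw model_rearm_after_delete(void)
{
	struct model_tw tw = { false, false };

	model_schedule(&tw, false);
	model_deschedule(&tw);
	model_schedule(&tw, true);
	return tw;
}

/* The first arm happening only after the deschedule. */
struct model_tw model_arm_after_delete(void)
{
	struct model_tw tw = { false, false };

	model_deschedule(&tw);
	model_schedule(&tw, false);
	return tw;
}
```

In this model, model_rearm_after_delete() ends with the timer inactive,
so the rearm path alone cannot resurrect a deleted timer.
model_arm_after_delete() ends with the timer still pending and the kill
step skipped, and the BUG_ON condition stays false, so that ordering
would not be caught by the existing check. Whether the real code can
reach that ordering is exactly the open question.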