Re: net: suspicious RCU usage in nf_hook

From: Cong Wang
Date: Tue Jan 31 2017 - 01:22:27 EST

On Fri, Jan 27, 2017 at 5:31 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> On Fri, 2017-01-27 at 17:00 -0800, Cong Wang wrote:
>> On Fri, Jan 27, 2017 at 3:35 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
>> > Oh well, I forgot to submit the official patch I think, Jan 9th.
>> >
>> >!topic/syzkaller/BhyN5OFd7sQ
>> >
>> Hmm, but why only fragments need skb_orphan()? It seems like
>> any kfree_skb() inside a nf hook needs to have a preceding
>> skb_orphan().
>> Also, I am not convinced it is similar to commit 8282f27449bf15548
>> which is on RX path.
> Well, we clearly see IPv6 reassembly being part of the equation in both
> cases.

Yeah, of course. My worry is that this problem is more than just
IPv6 reassembly.

> I was replying to first part of the splat [1], which was already
> diagnosed and had a non official patch.
> use after free is also a bug, regardless of jump label being used or
> not.
> I still do not really understand this nf_hook issue, I thought we were
> disabling BH in netfilter.

It is a different warning from the use-after-free: this one is about
sleeping in atomic context, because a mutex is acquired with the RCU
read lock held.

> So the in_interrupt() check in net_disable_timestamp() should trigger,
> this was the intent of netstamp_needed_deferred existence.
> Not sure if we can test for rcu_read_lock() as well.

The context is process context (TX path before hitting the qdisc), and
BH is not disabled, so in_interrupt() doesn't catch it. Hmm, this
makes me think maybe we really need to disable BH in this
case for nf_hook()? But it is called in the RX path too, and BH is
already disabled there.
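
To illustrate why the in_interrupt() check misses this case, here is a
rough userspace model, not kernel code. The struct and the model_*
helpers are hypothetical names made up for this sketch, and the mask
values are assumed to mirror the kernel's preempt_count layout. The
point it tries to show: rcu_read_lock() alone leaves the softirq and
hardirq bits clear, so in_interrupt() stays false even though sleeping
is not allowed.

```c
#include <stdbool.h>

/*
 * Userspace model only. The values below are assumed to mirror the
 * kernel's preempt_count layout; they are not taken from this thread.
 */
#define SOFTIRQ_MASK            0x0000ff00u
#define HARDIRQ_MASK            0x000f0000u
#define SOFTIRQ_DISABLE_OFFSET  0x00000200u

/* Hypothetical per-task state: the count word plus RCU read-side nesting. */
struct ctx_model {
	unsigned int preempt_count;
	int rcu_nesting;
};

/* Models in_interrupt(): true if any softirq or hardirq bit is set. */
static bool model_in_interrupt(const struct ctx_model *c)
{
	return (c->preempt_count & (SOFTIRQ_MASK | HARDIRQ_MASK)) != 0;
}

/* Models local_bh_disable(): bumps the softirq part of the count. */
static void model_local_bh_disable(struct ctx_model *c)
{
	c->preempt_count += SOFTIRQ_DISABLE_OFFSET;
}

/* Models rcu_read_lock(): only the RCU nesting changes, not the masks. */
static void model_rcu_read_lock(struct ctx_model *c)
{
	c->rcu_nesting++;
}

/* Sleeping (e.g. taking a mutex) is only safe outside both kinds of section. */
static bool model_sleep_is_safe(const struct ctx_model *c)
{
	return !model_in_interrupt(c) && c->rcu_nesting == 0;
}
```

In this model, a TX-like context that only calls model_rcu_read_lock()
reports model_in_interrupt() == false while model_sleep_is_safe() is
also false, which is the gap described above. An RX-like context that
calls model_local_bh_disable() first is caught by the
model_in_interrupt() check.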