Re: [PATCH] netfilter: fix dangling pointer access of fake_rtable

From: Rundong Ge
Date: Thu Apr 18 2019 - 05:58:51 EST


friendly ping

On Tue, Apr 9, 2019 at 2:56 PM, Rundong Ge <rdong.ge@xxxxxxxxx> wrote:
>
> With bridge-nf-call-iptables enabled, skbs that go through the bridge
> and are enqueued between <NF_BR_PRE_ROUTING,NF_BR_PRI_BRNF> and
> <NF_BR_FORWARD,NF_BR_PRI_BRNF - 1> won't be flushed when the bridge
> goes down. The _skb_refdst of those skbs in the nfqueue then becomes a
> dangling pointer.
>
> Steps to reproduce:
> 1. Create a bridge on the box.
> 2. echo 1 >/proc/sys/net/bridge/bridge-nf-call-iptables
> 3. Add a netfilter hook function that queues the packets to nfqueue
> number 0. The hook point must be between
> <NF_BR_PRE_ROUTING,NF_BR_PRI_BRNF> and
> <NF_BR_FORWARD,NF_BR_PRI_BRNF - 1>.
> 4. Run a userspace process "nfqueue_rcv" that continuously reads
> packets from queue 0 and sets the verdict "NF_ACCEPT" on them.
> 5. Continuously ping client1 from client0.
> 6. Send "Ctrl + Z" to pause "nfqueue_rcv" to simulate queue
> congestion.
> 7. Use "ifconfig br0 down&&brctl delbr br0" to delete the bridge.
> 8. At this point the _skb_refdst of the skbs in the nfqueue is a
> dangling pointer. If we send "fg" to resume "nfqueue_rcv", the kernel
> may try to access the freed memory.
>
> Debug log:
> I added debug logs in "netdev_freemem" and "dst_release" to prove
> the access to freed memory. As the log shows, "dst_release" accessed
> the bridge's fake_rtable after the bridge was freed.
>
> Apr 8 22:25:14 raydon kernel: [62139.005062] netdev_freemem name:br0,
> fake_rtable:000000009d76cef0
>
> Apr 8 22:25:21 raydon kernel: [62145.967133] dst_release
> dst:000000009d76cef0 dst->dev->name: ÅKUÂTH
>
> Apr 8 22:25:21 raydon kernel: [62145.967154] dst_release
> dst:000000009d76cef0 dst->dev->name: ÅKUÂTH
>
> Apr 8 22:25:21 raydon kernel: [62145.967180] dst_release
> dst:000000009d76cef0 dst->dev->name: ÅKUÂTH
>
> Apr 8 22:25:21 raydon kernel: [62145.967197] dst_release
> dst:000000009d76cef0 dst->dev->name: ÅKUÂTH
>
> The hook point has to be after <NF_BR_PRE_ROUTING,NF_BR_PRI_BRNF>
> because skbs start referencing the bridge's fake_rtable in
> "br_nf_pre_routing_finish", which is hooked at
> <NF_BR_PRE_ROUTING,NF_BR_PRI_BRNF>.
>
> The hook point has to be before <NF_BR_FORWARD,NF_BR_PRI_BRNF - 1>
> because "br_nf_forward_ip" sets state.out to the bridge dev. After
> this hook point, the "nfqnl_dev_drop" triggered by the bridge's
> NETDEV_DOWN event can flush the queued skbs before the bridge's
> memory is freed, because state.out now matches the bridge's dev.
>
> Signed-off-by: Rundong Ge <rdong.ge@xxxxxxxxx>
> ---
> net/netfilter/nfnetlink_queue.c | 24 ++++++++++++++++++------
> 1 file changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
> index 0dcc359..57eb02d 100644
> --- a/net/netfilter/nfnetlink_queue.c
> +++ b/net/netfilter/nfnetlink_queue.c
> @@ -905,13 +905,25 @@ static void free_entry(struct nf_queue_entry *entry)
>  dev_cmp(struct nf_queue_entry *entry, unsigned long ifindex)
>  {
>  #if IS_ENABLED(CONFIG_BRIDGE_NETFILTER)
> -	int physinif, physoutif;
> +	struct net_device *physindev, *physoutdev;
> +	struct net_bridge_port *port;
>
> -	physinif = nf_bridge_get_physinif(entry->skb);
> -	physoutif = nf_bridge_get_physoutif(entry->skb);
> -
> -	if (physinif == ifindex || physoutif == ifindex)
> -		return 1;
> +	physindev = nf_bridge_get_physindev(entry->skb);
> +	physoutdev = nf_bridge_get_physoutdev(entry->skb);
> +	if (physindev) {
> +		if (physindev->ifindex == ifindex)
> +			return 1;
> +		port = br_port_get_rcu(physindev);
> +		if (port && port->br->dev->ifindex == ifindex)
> +			return 1;
> +	}
> +	if (physoutdev) {
> +		if (physoutdev->ifindex == ifindex)
> +			return 1;
> +		port = br_port_get_rcu(physoutdev);
> +		if (port && port->br->dev->ifindex == ifindex)
> +			return 1;
> +	}
>  #endif
>  	if (entry->state.in)
>  		if (entry->state.in->ifindex == ifindex)
> --
> 1.8.3.1
>