[PATCH 3.11 222/233] net: Fix memory leak if TPROXY used with TCP early demux

From: Luis Henriques
Date: Fri Feb 07 2014 - 06:57:44 EST


3.11.10.4 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Holger Eitzenberger <holger@xxxxxxxxxxxxxxxx>

commit a452ce345d63ddf92cd101e4196569f8718ad319 upstream.

I see a memory leak when using a transparent HTTP proxy using TPROXY
together with TCP early demux and Kernel v3.8.13.15 (Ubuntu stable):

unreferenced object 0xffff88008cba4a40 (size 1696):
comm "softirq", pid 0, jiffies 4294944115 (age 8907.520s)
hex dump (first 32 bytes):
0a e0 20 6a 40 04 1b 37 92 be 32 e2 e8 b4 00 00 .. j@..7..2.....
02 00 07 01 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff810b710a>] kmem_cache_alloc+0xad/0xb9
[<ffffffff81270185>] sk_prot_alloc+0x29/0xc5
[<ffffffff812702cf>] sk_clone_lock+0x14/0x283
[<ffffffff812aaf3a>] inet_csk_clone_lock+0xf/0x7b
[<ffffffff8129a893>] netlink_broadcast+0x14/0x16
[<ffffffff812c1573>] tcp_create_openreq_child+0x1b/0x4c3
[<ffffffff812c033e>] tcp_v4_syn_recv_sock+0x38/0x25d
[<ffffffff812c13e4>] tcp_check_req+0x25c/0x3d0
[<ffffffff812bf87a>] tcp_v4_do_rcv+0x287/0x40e
[<ffffffff812a08a7>] ip_route_input_noref+0x843/0xa55
[<ffffffff812bfeca>] tcp_v4_rcv+0x4c9/0x725
[<ffffffff812a26f4>] ip_local_deliver_finish+0xe9/0x154
[<ffffffff8127a927>] __netif_receive_skb+0x4b2/0x514
[<ffffffff8127aa77>] process_backlog+0xee/0x1c5
[<ffffffff8127c949>] net_rx_action+0xa7/0x200
[<ffffffff81209d86>] add_interrupt_randomness+0x39/0x157

But there are many more, resulting in the machine going OOM after some
days.

>From looking at the TPROXY code, and with help from Florian, I see
that the memory leak is introduced in tcp_v4_early_demux():

void tcp_v4_early_demux(struct sk_buff *skb)
{
/* ... */

iph = ip_hdr(skb);
th = tcp_hdr(skb);

if (th->doff < sizeof(struct tcphdr) / 4)
return;

sk = __inet_lookup_established(dev_net(skb->dev), &tcp_hashinfo,
iph->saddr, th->source,
iph->daddr, ntohs(th->dest),
skb->skb_iif);
if (sk) {
skb->sk = sk;

where the socket is assigned unconditionally to skb->sk, also bumping
the refcnt on it. This is problematic, because in our case the skb
has already a socket assigned in the TPROXY target. This then results
in the leak I see.

The very same issue seems to be with IPv6, but haven't tested.

Reviewed-by: Florian Westphal <fw@xxxxxxxxx>
Signed-off-by: Holger Eitzenberger <holger@xxxxxxxxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Luis Henriques <luis.henriques@xxxxxxxxxxxxx>
---
net/ipv4/ip_input.c | 2 +-
net/ipv6/ip6_input.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 15e3e68..0a22bb0 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -313,7 +313,7 @@ static int ip_rcv_finish(struct sk_buff *skb)
const struct iphdr *iph = ip_hdr(skb);
struct rtable *rt;

- if (sysctl_ip_early_demux && !skb_dst(skb)) {
+ if (sysctl_ip_early_demux && !skb_dst(skb) && skb->sk == NULL) {
const struct net_protocol *ipprot;
int protocol = iph->protocol;

diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 2bab2aa..774b09c 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -49,7 +49,7 @@

int ip6_rcv_finish(struct sk_buff *skb)
{
- if (sysctl_ip_early_demux && !skb_dst(skb)) {
+ if (sysctl_ip_early_demux && !skb_dst(skb) && skb->sk == NULL) {
const struct inet6_protocol *ipprot;

ipprot = rcu_dereference(inet6_protos[ipv6_hdr(skb)->nexthdr]);
--
1.8.3.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/