Re: Possible regression with skb_clone() in 2.6.36

From: Gustavo F. Padovan
Date: Wed Sep 15 2010 - 14:42:49 EST


Hi Mat,

* Gustavo F. Padovan <padovan@xxxxxxxxxxxxxx> [2010-09-10 16:45:09 -0300]:

> Hi Mat,
>
> * Mat Martineau <mathewm@xxxxxxxxxxxxxx> [2010-09-10 09:53:31 -0700]:
>
> >
> > Gustavo -
> >
> > I'm not sure why the streaming code used to work, but this does not
> > look like an skb_clone() problem. Your patch to remove the
> > skb_clone() call in l2cap_streaming_send() addresses the root cause of
> > this crash.
> >
> > On Wed, 8 Sep 2010, Gustavo F. Padovan wrote:
> >
> > > I've been experiencing some problems when running the L2CAP Streaming mode in
> > > 2.6.36. The system quickly runs in an Out Of Memory condition and crash. That
> > > wasn't happening before, so I think we may have a regression here (I didn't
> > > find where yet). The crash log is below.
> > >
> > > The following patch does not fix the regression, but shows that removing the
> > > skb_clone() call from l2cap_streaming_send() we workaround the problem. The
> > > patch is good anyway because it saves memory and time.
> > >
> > > By now I have no idea on how to fix this.
> > >
> > > <snip>
> >
> > This has to do with the sk->sk_wmem_alloc accounting that controls the
> > amount of write buffer space used on the socket.
> >
> > When the L2CAP streaming mode socket segments its data, it allocates
> > memory using sock_alloc_send_skb() (via bt_skb_send_alloc()). Before
> > that allocation call returns, skb_set_owner_w() is called on the new
> > skb. This adds to sk->sk_wmem_alloc and sets skb->destructor so that
> > sk->sk_wmem_alloc is correctly updated when the skb is freed.
> >
> > When that skb is cloned, the clone is not "owned" by the write buffer.
> > The clone's destructor is set to NULL in __skb_clone(). The version
> > of l2cap_streaming_send() that runs out of memory is passing the
> > non-owned skb clone down to the HCI layer. The original skb (the one
> > that's "owned by w") is immediately freed, which adjusts
> > sk->sk_wmem_alloc back down - the socket thinks it has unlimited write
> > buffer space. As a result, bt_skb_send_alloc() never blocks waiting
> > for buffer space (or returns EAGAIN for nonblocking writes) and the
> > HCI send queue keeps growing.
>
> If the problem is what you are saying, add a skb_set_owner_w(skb, sk) on
> the cloned skb should solve the problem, but it doesn't. That's exactly
> what tcp_transmit_skb() does. Also that just appeared in 2.6.36, is was
> working fine before, i.e, we have a regression here.

I've run some other tests and what you said also fixes the problem for
Streaming Mode. If I use skb_set_owner_w() on the cloned skb, everything
works fine. But we still have the problem for ERTM as I described.
send() blocks wainting for memory. The regression is there yet.
:(

>
> >
> > This isn't a problem for the ERTM sends, because the original skbs are
> > kept in the ERTM tx queue until they are acked. Once they're acked,
> > the write buffer space is freed and additional skbs can be allocated.
>
> It affects ERTM as well, but in that case the kernel doesn't crash
> because ERTM block on sending trying to allocate memory. Then we are not
> able to receive any ack (everything stays queued in sk_backlog_queue as
> the sk is owned by the user) and ERTM stalls.
>

--
Gustavo F. Padovan
ProFUSION embedded systems - http://profusion.mobi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/