Nasty bug in the IrDA stack

From: Jean Tourrilhes (jt@bougret.hpl.hp.com)
Date: Fri Mar 02 2001 - 20:36:20 EST


        Hi,

        I've found a really convoluted bug in the IrDA stack (spend
the week chasing it). As it is not trivial, I would like you to check
and comment on my description and my fix.
        My patch definitely fix the problem on the PC where I was
seeing it, and I can't crash it anymore (whereas it was very
reproducible before).

        The crash :
-----------------------
-> irda_release(); // Close IrDA socket
<- irda_release()
-> irlap_update_nr_received(); // Receive IrLAP ack
-> kfree_skb(); // Free acked packet
-> sock_wfree();
-> sock_def_write_space(); // Increase socket write buffer
-> __wake_up(); // Wake up writer
**** Crash ****
------------------------

        What I believe is happening :
        The socket is closing. We remove all the packet associated
with the socket instance (in IrTTP and the socket layer).
        Unfortunately, there are packets in IrLAP, waiting for
retransmission, and as IrLAP doesn't know about sockets, those packets
are not freed at this point.
        When IrLAP try to free those packets, they refer to a socket
that no longer exist, and we go into the cosmos.

        In fact, it was worse. The packet still had a reference to the
socket when we were calling dev_queue_xmit(skb), and as the device
layer can also queue skbs, the same problem can happen (even if it
would be much more difficult to trigger).

        The fix :
        skb_clone() and skb_free() the packet before passing them to
the LAP layer. This way, they don't refer any socket.

        Alan :
        Is there in the kernel a function doing the equivalent of the
following but optimised (avoiding malloc/free of the skb structure) :
-----------------------
        skb2 = skb_clone(skb1);
        kfree_skb(skb1);
        skb1 = skb2;
-----------------------
        The point there is to remove the reference to skb->sk, so I
want something that do the opposite of skb_set_owner_w(). Something I
would call skb_set_owner_none()...

        Thanks, and good week end...

        Jean

P.S. : I also removed all the skb_set_owner_w() in IrLAP, they are now
totally useless.
P.S.2 : Dag : Do not put this patch yet in the kernel, we need a bit
more testing and comments...



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Mar 07 2001 - 21:00:13 EST