Re: "new style" netdevice allocation patch for TUN driver (2.4.18 kernel)

From: Maksim (Max) Krasnyanskiy (maxk@qualcomm.com)
Date: Tue Aug 06 2002 - 12:07:49 EST


>On Fri, Aug 02, 2002 at 04:54:02PM -0700, Maksim (Max) Krasnyanskiy wrote:
> > You're fixing the wrong problem.
>Probably I am. Alexey Kuznietzov told be the same :-)
>
> >It seems that some subsystem is not releasing
> > tun device during shutdown/deregistration. (See comment in
> > net/core/dev.c:unregister_netdev).
> > You're not gonna see "waiting for" warning anymore if you change to new
> > style allocation.
>I will not see "waiting for" warning, but I will also be able to control
>all other network devices. Without this "fix" I am not able to shutdown
>network at all. Every "ip" command just hangs forever.
Yeah, this should be fixed. unregister_netdevice() sleeps under rtnl_lock().
Which means that any other activity that needs this lock will be blocked.

Dave, how about this

--- net/core/dev.c.orig Mon Aug 5 21:48:54 2002
+++ net/core/dev.c Mon Aug 5 21:54:01 2002
@@ -2577,6 +2577,11 @@

          */

+ /* We don't have to hold rtnl semaphore while we're waiting for
+ device to become free.
+ */
+ rtnl_unlock();
+
         now = warning_time = jiffies;
         while (atomic_read(&dev->refcnt) != 1) {
                 if ((jiffies - now) > 1*HZ) {
@@ -2593,6 +2598,10 @@
                         warning_time = jiffies;
                 }
         }
+
+ /* Our caller expects it to be locked */
+ rtnl_lock();
+
         dev_put(dev);
         return 0;
  }

----
We don't have to hold rtnl look while sleeping. Device is already unlinked
from the list so nobody can grab and bump refcount.

> > But you're gonna leak tun devices because destructor is not called unless > > refcount is zero. >But it seems it is eventually called. The refcount eventually goes to 0 >(1 in factm - selfreference). Without this patch it never went to 0, as >system shutdown was stopped "waitnig for...". It'd be nice to trace what part of the kernel is actually holding refcount.

>I don't have enough time and probably knowledge to write a proper fix >for the problem, however my patch fixes it well enough for me. >All this "new style" device allocation would be useless if the kernel >was bug-free. With this patch it is more bug-proof. :) No, the point of device destructors is not to hide kernel bugs.

>On protuction system it is sometimes even more important, because kernel >(or any other big piece of software) will never be bug-free. That why it's important to trace and fix the subsystem that is holding devices for a long time after deregistration. I may very well be doing it for a good reason but warning is helpful anyway. And we should fix sleep in unregister_netdevice() (ie patch above).

Max

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Aug 07 2002 - 22:00:31 EST