>On Fri, Aug 02, 2002 at 04:54:02PM -0700, Maksim (Max) Krasnyanskiy wrote:
> > You're fixing the wrong problem.
>Probably I am. Alexey Kuznietzov told be the same :-)
>
> >It seems that some subsystem is not releasing
> > tun device during shutdown/deregistration. (See comment in
> > net/core/dev.c:unregister_netdev).
> > You're not gonna see "waiting for" warning anymore if you change to new
> > style allocation.
>I will not see "waiting for" warning, but I will also be able to control
>all other network devices. Without this "fix" I am not able to shutdown
>network at all. Every "ip" command just hangs forever.
Yeah, this should be fixed. unregister_netdevice() sleeps under rtnl_lock().
Which means that any other activity that needs this lock will be blocked.
Dave, how about this
--- net/core/dev.c.orig Mon Aug 5 21:48:54 2002
+++ net/core/dev.c Mon Aug 5 21:54:01 2002
@@ -2577,6 +2577,11 @@
*/
+ /* We don't have to hold rtnl semaphore while we're waiting for
+ device to become free.
+ */
+ rtnl_unlock();
+
now = warning_time = jiffies;
while (atomic_read(&dev->refcnt) != 1) {
if ((jiffies - now) > 1*HZ) {
@@ -2593,6 +2598,10 @@
warning_time = jiffies;
}
}
+
+ /* Our caller expects it to be locked */
+ rtnl_lock();
+
dev_put(dev);
return 0;
}
---- We don't have to hold rtnl look while sleeping. Device is already unlinked from the list so nobody can grab and bump refcount.> > But you're gonna leak tun devices because destructor is not called unless > > refcount is zero. >But it seems it is eventually called. The refcount eventually goes to 0 >(1 in factm - selfreference). Without this patch it never went to 0, as >system shutdown was stopped "waitnig for...". It'd be nice to trace what part of the kernel is actually holding refcount.
>I don't have enough time and probably knowledge to write a proper fix >for the problem, however my patch fixes it well enough for me. >All this "new style" device allocation would be useless if the kernel >was bug-free. With this patch it is more bug-proof. :) No, the point of device destructors is not to hide kernel bugs.
>On protuction system it is sometimes even more important, because kernel >(or any other big piece of software) will never be bug-free. That why it's important to trace and fix the subsystem that is holding devices for a long time after deregistration. I may very well be doing it for a good reason but warning is helpful anyway. And we should fix sleep in unregister_netdevice() (ie patch above).
Max
- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Wed Aug 07 2002 - 22:00:31 EST