Re: 2.3.15: ominous message from unregister_netdevice...

kuznet@ms2.inr.ac.ru
Sat, 28 Aug 1999 19:41:30 +0400 (MSK DST)


Hello!

> It hasn't crashed (yet) and if you get this message then re-connecting
> via PPP isn't that harmful... nb, pppd is 2.3.9. gcc etc. in .sig.

The patch is appended. Do not wait for crash, apply it 8)

Alexey Kuznetsov

diff -ur ../vger3-990823/linux/net/core/dev.c linux/net/core/dev.c
--- ../vger3-990823/linux/net/core/dev.c Fri Aug 20 21:39:27 1999
+++ linux/net/core/dev.c Thu Aug 26 21:23:52 1999
@@ -1933,8 +1808,9 @@
return 0;
}

- if (atomic_dec_and_test(&dev->refcnt)) {
- netdev_finish_unregister(dev);
+ /* Last reference is our one */
+ if (atomic_read(&dev->refcnt) == 1) {
+ dev_put(dev);
return 0;
}

@@ -1942,16 +1818,46 @@
printk("unregister_netdevice: waiting %s refcnt=%d\n", dev->name, atomic_read(&dev->refcnt));
#endif

+ /* EXPLANATION. If dev->refcnt is not 1 now (1 is our own reference)
+ it means that someone in the kernel still has reference
+ to this device and we cannot release it.
+
+ "New style" devices have destructors, hence we can return from this
+ function and destructor will do all the work later.
+
+ "Old style" devices expect that device is free of any references
+ upon exit from this function. WE CANNOT MAKE such release
+ without delay. Note that it is not new feature. Referencing devices
+ after they are released occured in 2.0 and 2.2.
+ Now we just can know about each fact of illegal usage.
+
+ So, we linger for 10*HZ (it is an arbitrary number)
+
+ After 1 second, we start to rebroadcast unregister notifications
+ in hope that careless clients will release the device.
+
+ If timeout expired, we have no choice how to cross fingers
+ and return. Real alternative would be block here forever
+ and we will make it eventually, when all peaceful citizens
+ will be notified and repaired.
+ */
+
now = jiffies;
- while (atomic_read(&dev->refcnt)) {
- schedule_timeout(HZ/10);
+ while (atomic_read(&dev->refcnt) != 1) {
+ if ((jiffies - now) > 1*HZ) {
+ /* Rebroadcast unregister notification */
+ notifier_call_chain(&netdev_chain, NETDEV_UNREGISTER, dev);
+ }
+ current->state = TASK_INTERRUPTIBLE;
+ schedule_timeout(HZ/4);
+ current->state = TASK_RUNNING;
if ((jiffies - now) > 10*HZ)
break;
}

- if (atomic_read(&dev->refcnt))
- printk("unregister_netdevice: Old style device %s leaked(refcnt=%d). Wait for crash.\n", dev->name, atomic_read(&dev->refcnt));
-
+ if (atomic_read(&dev->refcnt) != 1)
+ printk("unregister_netdevice: Old style device %s leaked(refcnt=%d). Wait for crash.\n", dev->name, atomic_read(&dev->refcnt)-1);
+ dev_put(dev);
return 0;
}

diff -ur ../vger3-990823/linux/net/core/dst.c linux/net/core/dst.c
--- ../vger3-990823/linux/net/core/dst.c Fri Aug 20 21:39:30 1999
+++ linux/net/core/dst.c Thu Aug 26 21:23:51 1999
@@ -174,8 +174,27 @@
spin_lock_bh(&dst_lock);
for (dst = dst_garbage_list; dst; dst = dst->next) {
if (dst->dev == dev) {
- dst->input = dst_discard;
- dst->output = dst_blackhole;
+ /* Dirty hack. We did it in 2.2 (in __dst_free),
+ we have _very_ good reasons not to repeat
+ this mistake in 2.3, but we have no choice
+ now. _It_ _is_ _explicit_ _deliberate_
+ _race_ _condition_.
+ */
+ if (event!=NETDEV_DOWN && !dev->new_style &&
+ dst->output == dst_blackhole) {
+ dst->dev = &loopback_dev;
+ dev_put(dev);
+ dev_hold(&loopback_dev);
+ dst->output = dst_discard;
+ if (dst->neighbour && dst->neighbour->dev == dev) {
+ dst->neighbour->dev = &loopback_dev;
+ dev_put(dev);
+ dev_hold(&loopback_dev);
+ }
+ } else {
+ dst->input = dst_discard;
+ dst->output = dst_blackhole;
+ }
}
}
spin_unlock_bh(&dst_lock);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/