Re: [PATCH net-next] Fix time-lag of IFF_RUNNING flag consistency between vlan and real devices

From: Eric Dumazet
Date: Mon Aug 29 2011 - 02:23:30 EST


On Sunday, 28 August 2011 at 23:06 -0700, Stephen Hemminger wrote:
>
> ----- Original Message -----
> > On Sunday, 28 August 2011 at 22:20 +0900, HAYASAKA Mitsuo wrote:
> > > Hi Stephen and Herbert
> > >
> > > Thank you for your comments.
> > >
> > > (2011/08/26 15:08), Stephen Hemminger wrote:
> > > > I don't think this is the right way to solve the problem.
> > > >
> > > > The flags are supposed to propagate back from real device to vlan
> > > > via network notifications.
> > > >
> > > > Just doing this for ioctl is not enough; APIs other than user
> > > > space depend on this.
> > > > Also the user may have manually set different flags on vlan than
> > > > on
> > > > the real device.
> > >
> > > I agree.
> > > I will try another way to solve this problem, as you said.
> > >
> > >
> > > (2011/08/26 15:45), Herbert Xu wrote:
> > > > On Thu, Aug 25, 2011 at 11:08:59PM -0700, Stephen Hemminger
> > > > wrote:
> > > >> Just doing this for ioctl is not enough; APIs other than user
> > > >> space depend on this.
> > > >> Also the user may have manually set different flags on vlan than
> > > >> on
> > > >> the real device.
> > > > Right, anything that tests netif_carrier_ok directly on the VLAN
> > > > device will still be delayed.
> > > >
> > > > Now I remember discussing this issue in Japan. However, I can't
> > > > recall the exact scenario in which the delay occurred.
> > > >
> > > > Is the issue with the link status going down on the real device,
> > > > or the real device coming up?
> > > >
> > > > IIRC we already have mechanisms in place to ensure that down
> > > > events
> > > > are not delayed by linkwatch. Of course it is possible that this
> > > > isn't working for some reason, or some other part of the system
> > > > is
> > > > causing the delay.
> > > >
> > > > So please clarify the scenario for us Hayasaka-san. Also please
> > > > let us know how you measured the delay.
> > > >
> > > > Thanks,
> > >
> > > This issue happens when the link status is going down on the real
> > > device.
> > >
> > > ex) A cable is broken, or is unplugged from a NIC.
> > >
> > > I measured the delay using ioctl with SIOCGIFFLAGS from userspace
> > > in order to check if there is a time-lag of the flag between vlan
> > > and real devices.
> > >
> > > Also, you can check it using the script below.
> > >
> > > -------------------------
> > > #!/bin/sh
> > > t=0
> > > while :
> > > do
> > > echo $t; t=$((t+1))
> > > echo -n real; ifconfig RealDev | grep UP
> > > echo -n vlan; ifconfig VlanDev | grep UP
> > > sleep 0.2
> > > done
> > > -------------------------
> > >
> > > The result is shown as follows.
> > > It is observed that there is a time-lag of RUNNING status between
> > > real and vlan devices.
> > >
> > >
> >
> > Hi !
> >
> > This reminds me of some work done in linkwatch.
> >
> > Please take a look at commit e014debecd3ee3832e647 (linkwatch:
> > linkwatch_forget_dev() to speedup device dismantle)
> >
> > And more generally, code in net/core/link_watch.c
>
> Maybe the problem is specific to an Ethernet driver. Some devices poll
> for link changes, and also do a manual check when an ioctl is issued.
> This was mostly typical of older hardware that did not have a PHY
> interrupt.

Hmm, I just tried the script on my laptop, and reproduced the problem
with a tg3 driver, considered as a reference one ;)

The 'carrier is on' event is immediately visible on both devices, but
the 'carrier is off' event is delayed by one second.

09:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5755M Gigabit Ethernet PCI Express (rev 02)
	Subsystem: Dell Device 01f9
	Flags: bus master, fast devsel, latency 0, IRQ 45
	Memory at f1ef0000 (64-bit, non-prefetchable) [size=64K]
	Expansion ROM at <ignored> [disabled]
	Capabilities: <access denied>
	Kernel driver in use: tg3
	Kernel modules: tg3

