Ayaz Abdulla wrote: <BCC linux-kernel>
Eric Dumazet wrote:
Eric Dumazet wrote:
Ingo Molnar wrote:
The following changes since commit
52989765629e7d182b4f146050ebba0abf2cb0b7:
Linus Torvalds (1):
Merge git://git.kernel.org/.../davem/net-2.6
are available in the git repository at:
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master
Hm, something in this lot quickly wrecked networking here - see the
tx timeout dump below. It starts with:
[ 351.004596] WARNING: at net/sched/sch_generic.c:246 dev_watchdog+0x10b/0x19c()
[ 351.011815] Hardware name: System Product Name
[ 351.016220] NETDEV WATCHDOG: eth0 (forcedeth): transmit queue 0 timed out
Config attached. Unfortunately I've got no time to do bisection today.
forcedeth might have a problem in its netif_wake_queue() logic, but
I could not see why a recent patch would make this problem visible now.
CPU0/1: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping 02
is not a new cpu either :)
forcedeth uses an internal tx_stop without appropriate barrier.
Could you try the following patch?
(random guess as I don't have much time right now)
Oh well this patch was soooo stupid, sorry Ingo.
We might have a race in napi_schedule(), leaving interrupts disabled
forever.
I cannot test this patch, I don't have the hardware...
Thanks
diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c
index 1094d29..3b4e076 100644
--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -3514,11 +3514,13 @@ static irqreturn_t nv_nic_irq(int foo, void *data)
nv_msi_workaround(np);
#ifdef CONFIG_FORCEDETH_NAPI
- napi_schedule(&np->napi);
-
- /* Disable furthur irq's
- (msix not enabled with napi) */
- writel(0, base + NvRegIrqMask);
+ if (napi_schedule_prep(&np->napi)) {
+ /*
+ * Disable further irq's (msix not enabled with napi)
+ */
+ writel(0, base + NvRegIrqMask);
+ __napi_schedule(&np->napi);
+ }
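
[Editorial sketch, not part of the original thread.] The ordering problem behind this patch can be modelled in user space. The following is only a rough model under stated assumptions: struct model_nic and every model_* function are hypothetical stand-ins, not kernel APIs, and the "other CPU" is simulated by calling the poll completion step at a chosen point in a single thread. It assumes the poll side clears the scheduled bit and then restores the mask, and that a poll can only run once it has been scheduled.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model state; these are NOT the real forcedeth/NAPI structures. */
struct model_nic {
    atomic_bool scheduled;  /* stands in for the NAPI "scheduled" state bit */
    uint32_t irq_mask_reg;  /* stands in for the NvRegIrqMask register */
    uint32_t irqmask;       /* value the poll side restores, like np->irqmask */
};

void model_init(struct model_nic *nic, uint32_t irqmask)
{
    atomic_init(&nic->scheduled, false);
    nic->irq_mask_reg = irqmask;
    nic->irqmask = irqmask;
}

/* Modelled on napi_schedule_prep(): true only for the caller that sets the bit. */
bool model_schedule_prep(struct model_nic *nic)
{
    return !atomic_exchange(&nic->scheduled, true);
}

/* Poll side finishing (assumed order): clear the bit, then re-enable interrupts. */
void model_poll_complete(struct model_nic *nic)
{
    atomic_store(&nic->scheduled, false);
    nic->irq_mask_reg = nic->irqmask;
}

/* Stuck means: interrupts masked and no poll pending to unmask them. */
bool model_stuck(struct model_nic *nic)
{
    return nic->irq_mask_reg == 0 && !atomic_load(&nic->scheduled);
}

/* Old ordering: schedule first, mask afterwards; the poll slips in between. */
bool old_ordering_gets_stuck(void)
{
    struct model_nic nic;

    model_init(&nic, 0xff);
    (void)model_schedule_prep(&nic);  /* CPU A: napi_schedule() */
    model_poll_complete(&nic);        /* CPU B: poll runs to completion */
    nic.irq_mask_reg = 0;             /* CPU A: late writel(0, ...) */
    return model_stuck(&nic);
}

/* Patched ordering: mask only after winning the bit, before the poll can run. */
bool patched_ordering_gets_stuck(void)
{
    struct model_nic nic;

    model_init(&nic, 0xff);
    if (model_schedule_prep(&nic)) {  /* CPU A: napi_schedule_prep() */
        nic.irq_mask_reg = 0;         /* CPU A: writel(0, ...) */
        /* CPU A: __napi_schedule() would go here; the poll starts after it */
    }
    model_poll_complete(&nic);        /* CPU B: poll runs to completion */
    return model_stuck(&nic);
}
```

In this model the old ordering ends with the mask at 0 and nothing scheduled, while the patched ordering ends with the mask restored. It says nothing about other interleavings or about real hardware behaviour.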
Yes, good catch. There is a race condition here with napi poll.
I would prefer to do the following to keep the code simple and clean.
writel(0, base + NvRegIrqMask);
napi_schedule(&np->napi);
CC trimmed down to network devs only :)
It would be racy too ...
check drivers/net/amd8111e.c, drivers/net/natsemi.c ...
If this CPU unconditionally calls writel(0, base + NvRegIrqMask) while another CPU has just called writel(np->irqmask, base + NvRegIrqMask),
do we end up with interrupts disabled?
--