Re: Problems with Tulip under heavy load

Oren Laadan (orenl@cs.huji.ac.il)
Tue, 10 Nov 1998 19:21:11 +0200 (IST)


Hi,

> > We're getting frequent network lockups with kernels 2.1.120 -- 2.1.126.
> > All of a sudden the machine doesn't respond to the network AT ALL, as
> > if the cable was cut. In fact the machine is totally alive and everything
> > works inside - scheduling, and all.

> > After rebooting, I could see one message in the log file:
> > "eth0: Too much work at interrupt, csr5=0xfc230040"

> > After which more messages appear describing problems with the network
> > (NFS server not responding etc). This is fairly consistent, and happened
> > on over 20 different nodes, repeatedly!

> > Lastly - the hardware was detected at boot as:

> > Nov 4 12:51:21 mos61 kernel: eth0: Digital DS21140 Tulip at 0x8000, 00 00 c0 43 18 e7, IRQ 19.
> > Nov 4 12:51:21 mos61 kernel: eth0: Old format EEPROM on 'SMC9332DST' board. Using substitute media control info.
> > Nov 4 12:51:21 mos61 kernel: eth0: EEPROM default media type Autosense.
> > Nov 4 12:51:21 mos61 kernel: eth0: Index #0 - Media 10baseT (#0) described by a 21140 non-MII (0) block.
> > Nov 4 12:51:21 mos61 kernel: eth0: Index #1 - Media 100baseTx (#3) described by a 21140 non-MII (0) block.
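For reference, the csr5 value in the quoted "Too much work at interrupt"
message can be split into its fields. The sketch below is only an
illustration: the bit names are assumed from the status-bit definitions in
the Linux tulip driver as I remember them, and the field positions (receive
state in bits 17-19, transmit state in bits 20-22, bus error in bits 23-25)
should be checked against the 21140 data sheet before relying on them.

```python
# Assumed bit layout of the 21140 status register (CSR5); verify against the
# chip data sheet. Names mirror the tulip driver's status bits as recalled.
CSR5_FLAGS = {
    0x00000001: "TxIntr",
    0x00000002: "TxDied",
    0x00000004: "TxNoBuf",
    0x00000008: "TxJabber",
    0x00000020: "TxFIFOUnderflow",
    0x00000040: "RxIntr",
    0x00000080: "RxNoBuf",
    0x00000100: "RxDied",
    0x00000200: "RxJabber",
    0x00000800: "TimerInt",
    0x00002000: "SystemError",
    0x00008000: "AbnormalIntr",
    0x00010000: "NormalIntr",
}


def decode_csr5(value):
    """Split a CSR5 value into named flag bits and numeric state fields."""
    flags = [name for bit, name in sorted(CSR5_FLAGS.items()) if value & bit]
    return {
        "flags": flags,
        "rx_state": (value >> 17) & 0x7,   # receive process state
        "tx_state": (value >> 20) & 0x7,   # transmit process state
        "bus_error": (value >> 23) & 0x7,  # error bits
        "high_bits": value >> 26,          # bits 26-31, not decoded here
    }


if __name__ == "__main__":
    # Value taken from the log message quoted above.
    print(decode_csr5(0xFC230040))
```

Under these assumptions only the receive-interrupt and normal-interrupt
summary bits are set and no abnormal condition is flagged, which would fit
a handler that simply hit its per-interrupt work limit under heavy receive
load. That is a reading of one status word, not a diagnosis.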

> Try the de4x5 driver, it seems more reliable than the tulip driver, i.e.
> doesn't drop the network twice a day.

Unfortunately, the de4x5 driver doesn't work correctly here: the system
ends up panic()ing at boot.

The card is an SMC-9332 from 1994 (I don't know whether it has the old or
the new SROM format). The driver docs say it should be supported if the
SROM is recent... The thing is, we have 80 of these cards :-)

I get the following messages at boot time:

---------------------------------------------------------------------------

MII device address: 1
MII CR: 3fff
MII SR: ffff
MII ID0: 3fff
MII ID1: ffff
MII ANA: 3fff
MII ANC: ffff
MII 16: 3fff
MII 17: ffff
MII 18: 3fff

eth0: Using generic MII device control. If the board doesn't operate,
please mail the following dump to the author:

MII device address: 2
MII CR: 3fff
MII SR: ffff
MII ID0: 3fff
MII ID1: ffff
MII ANA: 3fff
MII ANC: ffff
MII 16: 3fff
MII 17: ffff
MII 18: 3fff

eth0: Using generic MII device control. If the board doesn't operate,
please mail the following dump to the author:

MII device address: 3
MII CR: 3fff
MII SR: ffff
MII ID0: 3fff
MII ID1: ffff
MII ANA: 3fff
MII ANC: ffff
MII 16: 3fff
MII 17: ffff
MII 18: 3fff

eth0: Using generic MII device control. If the board doesn't operate,
please mail the following dump to the author:

MII device address: 4
MII CR: 3fff
MII SR: ffff
MII ID0: 3fff
MII ID1: ffff
MII ANA: 3fff
MII ANC: ffff
MII 16: 3fff
MII 17: ffff
MII 18: 3fff

eth0: Using generic MII device control. If the board doesn't operate,
please mail the following dump to the author:

MII device address: 5
MII CR: 3fff
MII SR: ffff
MII ID0: 3fff
MII ID1: ffff
MII ANA: 3fff
MII ANC: ffff
MII 16: 3fff
MII 17: ffff
MII 18: 3fff

eth0: Using generic MII device control. If the board doesn't operate,
please mail the following dump to the author:

MII device address: 6
MII CR: 3fff
MII SR: ffff
MII ID0: 3fff
MII ID1: ffff
MII ANA: 3fff
MII ANC: ffff
MII 16: 3fff
MII 17: ffff
MII 18: 3fff

eth0: Using generic MII device control. If the board doesn't operate,
please mail the following dump to the author:

MII device address: 7
MII CR: 3fff
MII SR: ffff
MII ID0: 3fff
MII ID1: ffff
MII ANA: 3fff
MII ANC: ffff
MII 16: 3fff
MII 17: ffff
MII 18: 3fff

eth0: Using generic MII device control. If the board doesn't operate,
please mail the following dump to the author:

MII device address: 8
MII CR: 3fff
MII SR: ffff
MII ID0: 3fff
MII ID1: ffff
MII ANA: 3fff
MII ANC: ffff
MII 16: 3fff
MII 17: ffff
MII 18: 3fff
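
Every address in the dump above returns the same alternating 3fff/ffff
pattern. A small script can check that mechanically. This is a sketch with
hypothetical helper names (parse_mii_dump, looks_absent), and it rests on an
assumption: that a register set consisting only of ffff (or ffff with the
top bits masked off, i.e. 3fff) means no PHY answered at that address. The
tulip driver's earlier report of "non-MII" media blocks for this board
would be consistent with that, but it is not proof.

```python
import re

# One block copied from the boot messages above.
SAMPLE = """\
MII device address: 1
MII CR: 3fff
MII SR: ffff
MII ID0: 3fff
MII ID1: ffff
MII ANA: 3fff
MII ANC: ffff
MII 16: 3fff
MII 17: ffff
MII 18: 3fff
"""


def parse_mii_dump(text):
    """Return {phy_address: {register_name: value}} from the dump text."""
    dumps, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"MII device address:\s*(\d+)$", line)
        if m:
            current = dumps.setdefault(int(m.group(1)), {})
            continue
        m = re.match(r"MII\s+(\w+):\s*([0-9a-fA-F]+)$", line)
        if m and current is not None:
            current[m.group(1)] = int(m.group(2), 16)
    return dumps


def looks_absent(regs):
    """Assumption: all-ones style reads mean nothing responded."""
    return bool(regs) and all(v in (0xFFFF, 0x3FFF) for v in regs.values())


if __name__ == "__main__":
    for addr, regs in parse_mii_dump(SAMPLE).items():
        state = "no response?" if looks_absent(regs) else "responds"
        print(f"MII address {addr}: {state}")
```

Feeding it the whole boot log instead of SAMPLE would flag each of the
addresses 1 through 8 in the same way.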

Unable to handle kernel NULL pointer dereference at virtual address 0000011f
current->tss.cr3 = 00101000, %cr3 = 00101000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c01eab44>]
EFLAGS: 00010202
eax: 00009100 ebx: c045f000 ecx: ffffffff edx: 0000011f
esi: 0000011f edi: c045f000 ebp: c0099e7c esp: c0099e6c
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 1, process nr: 1, stackpage=c0099000)
Stack: c0247494 00000023 c0247494 00009100 c0099ea0 c01e8ebc c0247494 0000011f
c045f000 c0247494 c045f000 c022369d 00000000 c0099ed4 c01ebb01 c0247494
c045f000 c0247494 00000012 00000007 00003fff 00000000 00000003 00000009

Well... you got the point...
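
One thing that can be read straight off the oops without symbol decoding is
which registers hold the faulting address. The sketch below (parse_oops is
a hypothetical helper, and the regular expressions assume the 2.1.x i386
oops layout shown above) just does that comparison.

```python
import re

# Lines copied from the oops above.
OOPS = """\
Unable to handle kernel NULL pointer dereference at virtual address 0000011f
eax: 00009100 ebx: c045f000 ecx: ffffffff edx: 0000011f
esi: 0000011f edi: c045f000 ebp: c0099e7c esp: c0099e6c
"""


def parse_oops(text):
    """Return (fault_address, registers, names of registers equal to it)."""
    m = re.search(r"at virtual address ([0-9a-f]{8})", text)
    fault = int(m.group(1), 16) if m else None
    pairs = re.findall(
        r"\b(eax|ebx|ecx|edx|esi|edi|ebp|esp):\s*([0-9a-f]{8})", text
    )
    regs = {name: int(val, 16) for name, val in pairs}
    holders = [name for name, val in regs.items() if val == fault]
    return fault, regs, holders


if __name__ == "__main__":
    fault, regs, holders = parse_oops(OOPS)
    print(f"fault at {fault:#010x}, held in: {', '.join(holders)}")
```

Here the address 0000011f sits in both edx and esi, and the same value shows
up in the stack dump, so it may be a bogus pointer that was passed in as an
argument rather than a true NULL. That is only a hint; running the oops
through ksymoops against the matching System.map would be needed to name
the functions involved.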

Any ideas?

Oren.

__________________________________________________________________________
MOSIX Development Group                Oren Laadan
The Hebrew University                  orenl@cs.huji.ac.il
of Jerusalem, Israel

http://www.mosix.cs.huji.ac.il
