Re: Nvidia MCP55 Machine reboots on ixgb driver load

From: Auke Kok
Date: Wed Jan 24 2007 - 17:07:36 EST


[added netdev to CC]

Roger Heflin wrote:
I have a machine (actually 2 machines) that upon loading
the intel 10GBe driver (ixgb) the machine reboots, I am
using a RHAS4.4 based distribution with Vanilla 2.6.19.2
(the RHAS 4.4.03 kernel also reboots with the ixgb load),
I don't see any messages on the machine before it reboots,
loading the driver with debug does not appear to produce
any extra messages. The basic steps are the I load
the driver, the machine locks up, and then in a second
or 2 it starts to post.

some suggestions immediately come to mind:

* have you tried the latest driver from http://e1000.sf.net/ ?

* what happens when you unplug the card and modprobe the ixgb module?

* have you tried capturing a trace with netconsole or serial console? probing the ixgb driver produces at least 1 syslog line that should show up. If nothing shows up on serial or netconsole, the issue may be way outside the ixgb driver.

I have tried the default ixgb driver in 2.6.19.2, and I
have tried the open source intel driver on RH4.4, both cause
the same reboot. I also tried the linux firmware
development kit, and booting fc6 causes the same reboot
upon the network driver load.

just for completeness, which driver versions were this?

and when you say "booting fc6" I assume you mean that fc6 boots OK until you modprobe the ixgb driver?

I have tried pci=nomsi

try compiling your kernel without MSI support alltogether. There have been serious issues found with MSI on certain configurations, and in your case this might be the cause. Allthough passing pci=nomsi should be the same as compiling it out, it can't hurt to try.

Also note that the linux.nics@xxxxxxxxx address is apparently
unused and returns an email telling you to use a web page,
and so far after using the web page I have not got any response
indicating that anything got to anyone.

I've taken action to get that straightened out. You're always welcome to mail netdev, the e1000 devel list or even lkml which we will all pick up on.


I can't think of a specific reason for the issue right now, other than attempting to get a serial/netconsole attached and trying without any msi support in the kernel. Please give that a try and let us know.

Auke
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/