RE: No link on e1000e with 2.6.35.3 and ThinkPad T60

From: Tantilov, Emil S
Date: Tue Sep 07 2010 - 15:05:57 EST


Marc Haber wrote:
> Hi,
>
> On Thu, Aug 26, 2010 at 03:31:07PM -0600, Tantilov, Emil S wrote:
>> Marc Haber wrote:
>>> On Thu, Aug 26, 2010 at 01:27:32PM +0200, Marc Haber wrote:
>>>> On Wed, Aug 25, 2010 at 12:01:40PM -0600, Tantilov, Emil S wrote:
>>>>> We have a tool that you can use to dump the registers from the
>>>>> port in the failed state:
>>>>> http://sourceforge.net/projects/e1000/files/Ethregs%20-%20Register%20Dump%20Tool/ethregs-1.12.3/
>>>>
>>>> Attached, for an interface in the working state. I'll deliver
>>>> another regdump when I see the issue again
>>>
>>> Here we go, with a non-working interface.
>>
>> After reviewing the output from dmidecode I determined that your
>> model T60 is slightly different than mine. It appears that you have
>> the widescreen version. Is that correct?
>
> The dmidecode output is from the widescreen model, yes, but I also
> have two "normal" T60 with the non-wide screen 15" display (with
> 1400x1050 pixels). The freezes happen on all three. The one I have at

That is good to know. So it seems the issue is not limited to just the widescreen model.

> hand is running BIOS 2.26 dated 2010-04-01. I will also try updating
> the Widescreen unit which is - not surprisingly - the one I use the
> most.
>
>> Also you seem to be running a fairly old version of the BIOS (1.08).
>> The latest is 1.18:
>> http://www-307.ibm.com/pc/support/site.wss/MIGR-67020.html
>
> Thanks for that pointer, I am having difficulties in navigating the
> IBM/Lenovo web sites.
>
>> I would recommend that you upgrade your BIOS. If that does not help
>> we can continue with the investigation. I will also try to locate a
>> widescreen T60 that would hopefully help me with the repro.
>
> I can give you ssh access to mine if you want to. Do you have IPv6
> connectivity? If you want me to, send me your ssh key.

Is your traffic mostly over IPv6? That may be a clue, as there have been a lot of changes in the IPv6 code in recent kernels.

>
>> The model I have has been running all kinds of stress since you first
>> reported this issue, and so far is rock solid.
>
> Please note that usually the freezes happen when the network is rather
> slightly loaded, for example when I'm typing into an ssh window with
> nothing else happening on the box. When I do things that are rather
> traffic intensive such as a backup, the box is fine. The "no link"
> issue appears most frequently on a system that has been running for
> some time and suspend-to-ram was used. I am traveling a lot, and every
> change of train or bus involves a suspend-resume cycle.

I have run everything I can think of on mine. The system now has an uptime of 11 days, and I ran heavy stress, light stress, no stress and suspend/resume multiple times without issues.

At this point we need to start looking into the details. It would help especially if you could come up with some sort of test that consistently reproduces the hang. I should be able to repro it on my system if it is a generic issue.

I would also repeat my request to open a bug at e1000.sf.net, as a lot of info has been exchanged in email and it will be easier to follow if it is all in one place.

>
> Greetings
> Marc

Thanks,
Emil