> From: Bjørn Mork [mailto:bjorn@xxxxxxx]
> Sent: Monday, March 13, 2017 9:46 AM
> To: Borislav Petkov <bp@xxxxxxxxx>
> Cc: Andy Shevchenko <andy.shevchenko@xxxxxxxxx>; lkml@xxxxxxxxxxx;
> linux-kernel <linux-kernel@xxxxxxxxxxxxxxx>; vcaputo@xxxxxxxxxxx;
> linux-pci@xxxxxxxxxxxxxxx; intel-wired-lan@xxxxxxxxxxxxxxxx;
> khalidm <khalidm@xxxxxxxxx>; David Singleton <davsingl@xxxxxxxxx>;
> Brown, Aaron F <aaron.f.brown@xxxxxxxxx>;
> Kirsher, Jeffrey T <jeffrey.t.kirsher@xxxxxxxxx>
> Subject: Re: [BUG] 4.11.0-rc1 panic on shutdown X61s
>
> Borislav Petkov <bp@xxxxxxxxx> writes:
> > On Sun, Mar 12, 2017 at 03:55:08PM +0200, Andy Shevchenko wrote:
> >> The only change that IMHO matters happened between v4.10 and
> >> v4.11-rc1 is this:
> >>
> >> @@ -6276,8 +6274,8 @@ static int e1000e_pm_freeze(struct device *dev)
> >>          /* Quiesce the device without resetting the hardware */
> >>          e1000e_down(adapter, false);
> >>          e1000_free_irq(adapter);
> >> +        e1000e_reset_interrupt_capability(adapter);
> >>      }
> >> -    e1000e_reset_interrupt_capability(adapter);
> >>
> >> So, it apparently misses something for the other case, like
> >> pci_disable_msi() call or so.
> >
> > Well, lemme add the people from
> >
> > 7e54d9d063fa ("e1000e: driver trying to free already-free irq")
> >
> > to CC then. :-)
>
> Already did that a week ago:
>
> https://www.spinics.net/lists/netdev/msg423379.html
>
> Haven't heard anything back yet. Wondering if they are waiting for
> someone else to submit the pretty obvious revert? Don't understand why
> that should take more than a minute to figure out. It's not like they
> are testing these changes anyway...

Believe it or not we actually do test these changes. This one was tested by me and I did not have the same results you and the other people reporting this trace did. I made it back in the lab today and have spent a good part of the day attempting to reproduce this bug without success. Freeze / resume works for me on all the systems I have tried, which includes a sampling of all the current parts and many older ones. Given there are several other reports of this it is obviously an issue and I would like to be able to reproduce it in case another patch to resolve the issue this attempts to fix comes back in another form. So I want to know what's different between the systems that hit this and my bank of systems that don't.

What exact part (or parts) are we looking at (lspci|grep -i eth) that triggers this? Could it be a difference in .config files? The trace says it is falling back to legacy interrupts. Does the system continue to work, and does the network continue to function in that mode? In case it's related to user space, what is the base distro? Any other information you think can help me reproduce the issue would be appreciated.

Thanks,
Aaron
> Bjørn
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@xxxxxxxxxxxxxxxx
http://lists.osuosl.org/mailman/listinfo/intel-wired-lan