Re: 2.6.21-rc5-mm1

From: Matt Mackall
Date: Thu Mar 29 2007 - 14:35:30 EST


On Thu, Mar 29, 2007 at 08:01:40PM +0200, Mariusz Kozłowski wrote:
> Hello,
>
> > > > > I run 2.6.21-rc4-mm1 with no hangs for a week.
> > > > > Then 2.6.21-rc5-mm1 showed up, so I switched to it. Unfortunately
> > > > > today my laptop hung twice in a similar way as described here:
> > > > >
> > > > > http://www.ussg.iu.edu/hypermail/linux/kernel/0703.0/index.html#1165
> > > >
> > > > It's not good that we went backwards between those two releases.
> > > >
> > > > > The difference is that it happened when I closed the lid of my laptop.
> > > > > When I reopened it the box was frozen (ACPI?). Again disk I/O was dead
> > > > > so nothing was found in syslog.
> > > >
> > > > Adrian, does this look like any of the bugs which you're monitoring?
> > > >...
> > >
> > > Is it also present in 2.6.21-rc5?
> >
> > Don't know. I usually test the -mm series. Will test tomorrow and let you know
> > after some reasonable uptime.
> >
> > > Is it also present with CONFIG_NO_HZ=n?
> >
> > Don't know. Did not try recently. Will let you know.
> >
> > It takes time as these hangs are not easy to trigger. With 2.6.21-rc2-mm1
> > it was easy -> push the system and watch it die in minutes. With
> > 2.6.21-rc5-mm1 it takes hours (3 hangs in ~15 hours) and not sure how to
> > trigger it. It just happens from time to time.
>
> Ok. CONFIG_NO_HZ=n and uptime ~12 hours, netconsole loaded, and no hangs
> ... until I moved my laptop. The same scenario happened yesterday during the
> last hang. I started playing and repeating some steps which I naturally do when
> I move my laptop.
>
> An hour later I came to this -> steps to hang 2.6.21-rc5-mm1 on my laptop:
> 1. boot the system and login as root
> 2. load netconsole (insmod netconsole.ko netconsole=blah... blah...)
> 3. unplug the cable (link goes down)
> 4. system is frozen until forced to reboot
>
> This is verified and repeatable _every_ single time I tried. Unfortunately
> the last thing seen on the screen before system is frozen is 'eth0: link down'.
> So my guess is that when hunting for hangs I found something else that can hang
> my laptop (netconsole that is).

Which NIC do you have? Odds are that the driver is doing that printk
while holding a driver-internal spinlock that causes it to deadlock
when netpoll tries to send that message out.
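
To make the suspected pattern concrete, here is a toy userspace sketch of
it. This is not the actual driver or netpoll code: ToyNic and its methods
are hypothetical names, and a non-reentrant threading.Lock stands in for
the driver's spinlock. A non-blocking acquire is used so the demo reports
the problem instead of hanging the way a real spinlock would.

```python
import threading


class ToyNic:
    """Toy stand-in for a NIC driver; all names here are hypothetical."""

    def __init__(self):
        # threading.Lock is non-reentrant, much like a kernel spinlock.
        self.lock = threading.Lock()
        self.log = []

    def xmit(self, msg):
        # Transmit path: needs the driver lock. A real spinlock would spin
        # forever here; a non-blocking acquire lets the demo terminate.
        if not self.lock.acquire(blocking=False):
            self.log.append("DEADLOCK: xmit needs lock already held")
            return False
        try:
            self.log.append("sent: " + msg)
            return True
        finally:
            self.lock.release()

    def printk(self, msg):
        # With a network console loaded, each console message is pushed
        # back out through the NIC's own transmit path.
        return self.xmit(msg)

    def link_down_buggy(self):
        # Logs while still holding the driver lock.
        with self.lock:
            return self.printk("eth0: link down")

    def link_down_fixed(self):
        # Updates state under the lock, logs only after dropping it.
        with self.lock:
            pass  # link state would be updated here
        return self.printk("eth0: link down")
```

Calling link_down_buggy() on a ToyNic returns False and records the
would-be deadlock, while link_down_fixed() sends the message normally. In
a real driver the equivalent change would be to move the printk outside
the locked region, assuming the lock is indeed the cause.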

BTW, I guess David Miller is the de facto netpoll maintainer these days
as most of the patches to it in the past year or two were not even
cc:ed to me.

--
Mathematics is the supreme nostalgia of our time.