Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixedregressions (v2)

From: Alan Stern
Date: Mon Jan 15 2007 - 11:00:34 EST


On Sun, 14 Jan 2007, Florin Iucha wrote:

> Jiri and Trond,
>
> On Mon, Jan 15, 2007 at 01:14:09AM +0100, Jiri Kosina wrote:
> > On Sun, 14 Jan 2007, Florin Iucha wrote:
> >
> > > All the testing was done via a ssh into the workstation. The console
> > > was left as booted into, with the gdm running. The remote nfs4
> > > directory was mounted on "/mnt". After copying the 60+ GB and testing
> > > that the keyboard was still functioning, I did not reboot but stayed in
> > > the same kernel and pulled the latest git then started bisecting.
> >
> > Hi Florin,
> >
> > thanks a lot for the testing. Just to verify - what kernel is 'the same
> > kernel' mentioned above? (just to isolate whether the problem is really
> > somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts,
> > or the situation has changed).
>
> This happened with 2.6.19. It worked last time, but I wanted to test
> again, to make sure. This time, it bombed, but half an hour after the
> transfer finished.
>
> > > After recompiling, I moved over to the workstation to reboot it, but the
> > > keyboard was not functioning ;(
> >
> > So this time the hang occured when the system was idle, not during the
> > transfers, right?
>
> Yes it was idle. Immediately after the transfer finished, the keyboard was
> still functioning. It "hang" minutes later, after the first bisected kernel
> was compiled and installed.
>
> > > I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> > > any oops, anything for that matter. I have unplugged the keyboard and
> > > run "lsusb" again, but it hang. I ran "ls /mnt" and it hang as well.
> > > Stracing "lsusb" showed it hang (entered the kernel) at opening the device
> > > that used to be the keyboard. Stracing "ls /mnt" showed that it
> > > hang at "stat(/mnt)". Both processes were in "D" state. "ls /root"
> > > worked without problem, so it appears that crossing mountpoints causes
> > > some hang in the kernel.
> >
> > Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via
> > ssh, when your keyboard is dead) to see the calltraces of the processes
> > which are stuck inside kernel?
> >
> > You will probably get a lot of output after the sysrq, so please either
> > put it somewhere on the web if possible, or just extract the interesting
> > processes out of it (mainly the ones which are stuck).
>
> Will do.

It would be nice to learn exactly why the keyboard stopped working. Try
using the usbmon facility (instructions in Documentation/usb/usbmon.txt)
to see what happens when you type on the dead keyboard. Be sure to turn
on CONFIG_USB_DEBUG as well. And also check /proc/interrupts; each time
you hit a key the USB controller should get an interrupt.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/