Re: Analyzed/Solved/Bisected: Booting 2.6.30-rc2-git7 very slow

From: jim owens
Date: Sat Jun 20 2009 - 12:37:51 EST


Martin Knoblauch wrote:
----- Original Message ----

From: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
To: Martin Knoblauch <knobi@xxxxxxxxxxxx>
Cc: Kay Sievers <kay.sievers@xxxxxxxx>; Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; efault@xxxxxx; viro@xxxxxxxxxxxxxxxxxx; rjw@xxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; shemminger@xxxxxxxxxx; matthew@xxxxxx; mike.miller@xxxxxx
Sent: Tuesday, June 16, 2009 9:25:47 PM
Subject: Re: Analyzed/Solved/Bisected: Booting 2.6.30-rc2-git7 very slow

On Thu, 28 May 2009 02:14:46 -0700 (PDT)
Martin Knoblauch wrote:
I expect the duplicate comes from a left-over mount in initramfs
which isn't a duplicate in the sense of a bug in vfs or mount or
anything. I guess, it is just still mounted in the initial kernel
rootfs, below the root from the disk. It could be that a umount
from initramfs did go wrong because of a changed timing.

This is what I suspect as well. I know for sure that the first
sysfs-line in /proc/mounts

| none /sys sysfs rw 0 0

is already there (2.6.29-rc1 and up) when entering the startup scripts.
It is supposed to be unmounted before that, but something seems to prevent
it. I have no idea how to capture debug output from the initrd/init
script :-(

What's the latest here Martin? It sounded like this was a userspace
issue, with something reading the VPD over and over? Or was it just a
longer timeout that caused a specific driver to slow everything down?


Not sure about the VPD thing. Anyway, no real news. It still happens in 2.6.30, but only on a certain HW platform (HP/DL380G4). The folks at HP are trying to reproduce it in their environment.

Cheers
Martin

I reproduced this and verified Martin's analysis. Conclusions:

>>> | none /sys sysfs rw 0 0

is because the initrd "umount /sys" fails with EBUSY
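
For anyone who wants to check for the left-over mount on their own box,
here is a rough sketch in Python. The function name sysfs_mount_lines is
mine, not part of any tool; it only assumes the usual whitespace-separated
/proc/mounts layout (device, mountpoint, fstype, options, ...).

```python
def sysfs_mount_lines(mounts_text, mountpoint="/sys"):
    """Return the /proc/mounts lines that are sysfs mounted on `mountpoint`.

    Hypothetical helper for illustration only.
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device mountpoint fstype options dump pass
        if len(fields) >= 3 and fields[1] == mountpoint and fields[2] == "sysfs":
            hits.append(line)
    return hits
```

Feed it the contents of /proc/mounts. More than one line back means the
initrd's sysfs mount is most likely still sitting under the real root.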

|commit 1120f8b8169fb2cb51219d326892d963e762edb6
|Author: Stephen Hemminger <shemminger@xxxxxxxxxx>
|Date: Thu Dec 18 09:17:16 2008 -0800
|
| PCI: handle long delays in VPD access

does not have a bug. The longer timeout makes the problem visible.

/sys is busy because udev is trying to read the VPD and the
cciss PCI device always fails the VPD read with ETIMEDOUT. If all
the timeouts happen before or after the umount, there is no firmware
load problem.
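
To see how long a single VPD read holds a file on /sys open, something
like the following could be run by hand. This is only a sketch: the
function name time_vpd_read is made up, and the device address in the
usage note is a placeholder, not the cciss device from this report. It
reads the file once and returns the elapsed time, the byte count, and
the errno name if the read failed (for a timeout that would be
ETIMEDOUT).

```python
import errno
import os
import time


def time_vpd_read(path, chunk=4096):
    """Read `path` once; return (seconds, bytes_read, errno_name_or_None).

    Hypothetical helper for illustration only.
    """
    start = time.monotonic()
    nbytes = 0
    err = None
    try:
        fd = os.open(path, os.O_RDONLY)
        try:
            while True:
                buf = os.read(fd, chunk)
                if not buf:  # empty read means end of file
                    break
                nbytes += len(buf)
        finally:
            os.close(fd)
    except OSError as e:
        # Report the symbolic name, e.g. "ETIMEDOUT" or "ENOENT"
        err = errno.errorcode.get(e.errno, str(e.errno))
    return time.monotonic() - start, nbytes, err
```

For example, time_vpd_read("/sys/bus/pci/devices/0000:00:00.0/vpd"),
with the address replaced by the real one from lspci. While that read is
blocked, the open file is what keeps "umount /sys" from succeeding.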

IMO there is either a VPD read bug on this platform or VPD is
unsupported there and ETIMEDOUT is the wrong error.

... now I punt this to the HP platform/driver people.

jim
