Re: scsi: aic7xxx hang since v2.6.28-rc1 ...

From: Ingo Molnar
Date: Wed Feb 18 2009 - 14:20:38 EST



* Mike Anderson <andmike@xxxxxxxxxxxxxxxxxx> wrote:

> Ingo Molnar <mingo@xxxxxxx> wrote:
> >
> > * Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > I have no idea if this will make any difference for the
> > > problem you're seeing, but it has been submitted and it's
> > > worth trying out. If the problem still occurs, I'll write a
> > > diagnostic patch to add log messages giving the destiny of
> > > each request in scsi_io_completion().
> >
> > OK, i've undone the reverts and have applied your fix - it will
> > take a few hours to see whether the hang still occurs.
>
> I know you've already started your testing, but...

and that particular box already survived 20 test iterations in
the past few hours - while it would hang after 5-10 iterations
before. So i think Alan's fix is making a difference. I'll be
able to tell for sure tomorrow morning.
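
A rough way to quantify "making a difference", as a sketch only: assume
(this model is not from the thread) that each test iteration hangs
independently with a fixed probability, so that the old kernel's mean
time to hang of 5-10 iterations corresponds to a per-iteration hang
probability between 1/5 and 1/10. The helper name below is hypothetical.

```python
def survival_probability(mean_iterations_to_hang, iterations):
    """Chance of seeing no hang in `iterations` runs.

    Assumed model: every run hangs independently with probability
    1 / mean_iterations_to_hang (a geometric time-to-hang).
    """
    p_hang = 1.0 / mean_iterations_to_hang
    return (1.0 - p_hang) ** iterations


if __name__ == "__main__":
    # 5 and 10 are the two ends of the "5-10 iterations" range above.
    for mean in (5, 10):
        print(mean, round(survival_probability(mean, 20), 3))
```

Under that assumption an unfixed kernel would survive 20 iterations only
about 1% to 12% of the time, so 20 clean runs are suggestive but not yet
conclusive, which is consistent with waiting for more iterations.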

> I find it informative to set my scsi logging to the value
> below to display non-zero IO status on commands. The overhead
> impact is low for good completions.
>
> sysctl -w dev.scsi.logging_level=4100
>
> Note: This does not provide the exact policy that
> scsi_io_completion will take on the IO, but it provides the
> input to scsi_io_completion which should help.

will do that the next time i have a bug like this (or if this bug
triggers again).
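
For reference, the logging_level value quoted above is a packed bitfield.
The sketch below decodes it, assuming the layout in the kernel's
drivers/scsi/scsi_logging.h (ten facilities, 3 bits each, starting at
bit 0); the facility order is written from memory of that header and
should be checked against the tree being tested. The function name is
hypothetical.

```python
# Assumed facility order, lowest bits first (3 bits per facility).
SCSI_LOG_FACILITIES = [
    "error", "timeout", "scan", "mlqueue", "mlcomplete",
    "llqueue", "llcomplete", "hlqueue", "hlcomplete", "ioctl",
]
BITS_PER_FACILITY = 3


def decode_logging_level(value):
    """Return {facility: level} for every facility with a non-zero level."""
    levels = {}
    for index, name in enumerate(SCSI_LOG_FACILITIES):
        level = (value >> (index * BITS_PER_FACILITY)) & 0b111
        if level:
            levels[name] = level
    return levels


if __name__ == "__main__":
    print(decode_logging_level(4100))
```

With that layout, 4100 (0x1004) sets the mid-level completion field to 1
and the error field to 4; a completion level of 1 fits Mike's description
of logging only commands that finish with a non-zero status.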

OTOH, the hang took quite a bit of IO to occur. Sometimes the
box would be able to build a new kernel and reboot into it,
without the hang.

Ingo