Hangs seen with the 3ware controller and the 2.4.17 kernel ...

From: Manish Lachwani (manish@Zambeel.com)
Date: Tue Nov 12 2002 - 02:54:27 EST


Hello,

I am using a 2.4.17 SMP kernel with version .018 of the 3ware driver. The
hang occurs when two controllers (8-port and 4-port) are installed and I/O
is running on both. When a command times out on one controller (the 4-port
in my experiment) and the reset sequence fails, the kernel hangs hard. The
last message displayed is "reset sequence failed".

The kdb backtrace for eh_1 shows:

scsi_error_handler -> scsi_unjam_host -> scsi_try_host_reset -> schedule

I do know that scsi_try_host_reset(..) calls scsi_sleep for 10*HZ.
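
For reference, 10*HZ is a tick count, not a number of seconds. The sketch
below converts it, assuming HZ is 100 (the usual value on x86 for 2.4
kernels; other architectures differ). SKETCH_HZ and both function names are
made up for this illustration and do not exist in the kernel.

```c
/* Assumption: timer frequency of 100 ticks per second (x86, 2.4 kernels). */
#define SKETCH_HZ 100

/* Hypothetical helper: the delay mentioned above, expressed in ticks. */
unsigned long host_reset_settle_ticks(void)
{
    return 10UL * SKETCH_HZ;
}

/* Hypothetical helper: convert a tick count into whole seconds. */
unsigned long ticks_to_seconds(unsigned long ticks)
{
    return ticks / SKETCH_HZ;
}
```

Under that assumption the sleep is 1000 ticks, i.e. 10 seconds, so a
schedule() frame under scsi_try_host_reset is expected for that long and only
indicates a hang if the thread never wakes up.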

Another scenario that causes a hang shows this trace:

scsi_error_handler -> scsi_unjam_host -> scsi_try_to_abort_command ->
schedule
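
Both traces pass through scsi_unjam_host, once at the abort stage and once at
the host reset stage. As a rough illustration of that kind of escalation, here
is a simplified user-space model. It is not the kernel code: the step order
(abort, device reset, bus reset, host reset) is an assumption about the 2.4
error handler, and every type and function name below is hypothetical.

```c
#include <stddef.h>

/* Hypothetical recovery steps, ordered from lightest to heaviest. */
enum eh_step {
    EH_ABORT,
    EH_DEVICE_RESET,
    EH_BUS_RESET,
    EH_HOST_RESET,
    EH_STEP_COUNT
};

/* Each step returns 1 on success and 0 on failure. */
typedef int (*eh_try_fn)(void *host);

/* Stub steps for experimenting with the model. */
int eh_always_fail(void *host) { (void)host; return 0; }
int eh_always_ok(void *host)   { (void)host; return 1; }

/*
 * Try each step in order and return the first one that succeeds.
 * EH_STEP_COUNT means every step failed, including the host reset.
 */
enum eh_step eh_escalate(const eh_try_fn steps[EH_STEP_COUNT], void *host)
{
    for (int i = 0; i < EH_STEP_COUNT; i++) {
        if (steps[i] != NULL && steps[i](host))
            return (enum eh_step)i;
    }
    return EH_STEP_COUNT;
}
```

In this model the state is passed in per host, so a failure on one host cannot
affect another. Whether the real handler keeps its state equally separate
between two controllers is exactly the open question here.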

Also, this hang seems to occur only when there are two controllers. When I
tried numerous times with one controller, I could not reproduce the problem.
Is it possible that scsi_unjam_host gets confused when there are two
controllers and the reset fails on one host?

Any help is appreciated ...



