Re: Kernel Crash when using the open-iscsi initiator on 2.6.25.6

From: Ashutosh Naik
Date: Wed Jun 25 2008 - 13:35:38 EST


On Wed, Jun 25, 2008 at 10:24 PM, Mike Christie <michaelc@xxxxxxxxxxx> wrote:

>> connection5:0: ping timeout of 5 secs expired, last rx 4309652882,
>> last ping 4309657882, now 4309662882
>
>
> However, once it happens we should not report it again as is done here.
> There is something weird there. Do you have the iscsid output? Between these
> two reports of pings timing out, are there any messages from iscsid about
> reconnecting?

iscsid tried to reconnect but the target died, I think.
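
For reference, the timestamps in the ping timeout line quoted above can be
checked with a little arithmetic. This is only a sketch: the three values are
copied from the log, and I am assuming they are jiffies and that the timeout
fired right at expiry. Under those assumptions the implied tick rate is
HZ=1000; HZ is not stated anywhere in this thread.

```python
# Values copied verbatim from the "ping timeout" log line.
# Assumption: they are jiffies counters.
LAST_RX = 4309652882
LAST_PING = 4309657882
NOW = 4309662882
PING_TIMEOUT_SECS = 5  # "ping timeout of 5 secs expired"

# Gap between the last received PDU and the ping that was sent.
rx_to_ping = LAST_PING - LAST_RX

# Gap between sending the ping and the timeout being reported.
ping_to_now = NOW - LAST_PING

# Assumption: the timer fired exactly at expiry, so the second gap
# corresponds to PING_TIMEOUT_SECS. That gives the implied tick rate.
implied_hz = ping_to_now // PING_TIMEOUT_SECS


def jiffies_to_secs(delta, hz):
    """Convert a jiffies delta to seconds for a given tick rate."""
    return delta / hz


if __name__ == "__main__":
    print("rx -> ping :", rx_to_ping, "jiffies")
    print("ping -> now:", ping_to_now, "jiffies")
    print("implied HZ :", implied_hz)
    print("rx -> ping :", jiffies_to_secs(rx_to_ping, implied_hz), "secs")
```

Both gaps come out the same, so the line describes a ping sent some time
after the last rx and never answered within the 5 second timeout.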

>> connection5:0: detected conn error (1011)
>> connection5:0: detected conn error (1011)
>> session5: host reset succeeded
>
>
> And we should not get here. The iscsi driver's scsi command timeout handler
> should prevent the command from firing the scsi eh, because in this case we
> think it is a transport problem.
>
> What version of the iscsi tools are you using? Are they from a distro or
> open-iscsi.org?
>
> Are you running with the iscsi kernel modules from 2.6.25.6, or are you
> using the iscsi modules from the open-iscsi.org website that come with the
> tarball?
>
> Is the kernel an unmodified 2.6.25.6 or does it have some distro patches or
> patches that you have created?

It was an unmodified 2.6.25.6 kernel, and open-iscsi version 2.0-869.2.

>> INFO: task fdisk:5226 blocked for more than 120 seconds.
>
> I think this message, and what follows, is a result of the above
> problem. While the iscsi initiator is trying to reconnect, IO is queued by
> the scsi layer so fdisk is going to be waiting around until we recover or
> give up.

Yep, but is there any way to close gracefully and avoid the kernel dump?

Thanks
Ashutosh