Re: sata_mv port lockup on hotplug (kernel 2.6.38.2)

From: Mark Lord
Date: Fri Jun 10 2011 - 08:29:04 EST


On 11-05-25 05:41 AM, Tejun Heo wrote:
> Hello, sorry about the long delay.
>
> On Tue, May 17, 2011 at 04:30:20PM +0100, Bruce Stenning wrote:
>> __ata_port_freeze: ata4 port frozen
>> ata4: hard resetting link
>> sata_link_hardreset: ENTER
>> ata4: COMRESET failed (errno=-32)
>> sata_link_hardreset: EXIT, rc=-32
>> ata4: reset failed (errno=-32), retrying in 33 secs
>> __ata_port_freeze: ata4 port frozen
>> ata4: hard resetting link
>> sata_link_hardreset: ENTER
>> ata4: COMRESET failed (errno=-32)
>> sata_link_hardreset: EXIT, rc=-32
>> ata4: reset failed, giving up
>> ata_eh_recover: EXIT, rc=-32
>> ata4.00: disabled
>> ata4: EH complete
>> ata_scsi_error: EXIT
>>
>> The IRQ for that port is masked off afterwards.
>
> This is a different issue. libata EH plugs the port if reset fails
> repeatedly. This behavior was implemented to avoid causing continuous
> resets on a port in case it has flaky PHY state reporting; however, it
> seems to cause more trouble than it fixes, i.e. plugging in a
> broken device may end up plugging the port even after the offending
> device is removed, until manual rescan or reboot. I've been pondering
> changing the behavior like the following.
>
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index dfb6e9d..05797fe 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -2885,8 +2885,17 @@ int ata_eh_reset(struct ata_link *link, int classify,
>  	    sata_scr_read(link, SCR_STATUS, &sstatus))
>  		rc = -ERESTART;
>
> -	if (rc == -ERESTART || try >= max_tries)
> +	if (rc == -ERESTART || try >= max_tries) {
> +		/*
> +		 * Thaw host port even if reset failed, so that the port
> +		 * can be retried on the next phy event. This risks
> +		 * repeated EH runs but seems to be a better tradeoff than
> +		 * shutting down a port after a botched hotplug attempt.
> +		 */
> +		if (ata_is_host_link(link))
> +			ata_eh_thaw_port(ap);
>  		goto out;
> +	}
>
>  	now = jiffies;
>  	if (time_before(now, deadline)) {


Tejun, did this ever go upstream and into -stable?
I'm asking because I see the same issue with other SATA controllers,
in particular with sata_sil boards. Hot plug generally works _once_
per port, and then stops working.
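
For anyone following along, here is a rough userspace sketch of why the
thaw matters for the "works once per port" symptom. It is only a toy
model of the behaviour described above, not libata code: struct toy_port,
toy_eh_reset() and toy_hotplug() are made-up names, and the real EH path
has timeouts, link classification and more that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical and are not libata's. */
struct toy_port {
	bool frozen;     /* true: port IRQ masked, PHY events are not seen */
	bool device_ok;  /* true: the attached device answers a reset */
	int  eh_runs;    /* how many times error handling was entered */
};

/*
 * Try to reset the port up to max_tries times. Returns 0 on success,
 * -1 if every attempt failed. thaw_on_failure selects the proposed
 * policy (thaw even after a failed reset) over the old one (stay frozen).
 */
static int toy_eh_reset(struct toy_port *p, int max_tries, bool thaw_on_failure)
{
	assert(max_tries > 0);

	p->eh_runs++;
	p->frozen = true;            /* EH freezes the port before resetting */

	for (int try = 0; try < max_tries; try++) {
		if (p->device_ok) {
			p->frozen = false;   /* a successful reset thaws the port */
			return 0;
		}
	}

	/* All attempts failed: old policy leaves the port frozen. */
	if (thaw_on_failure)
		p->frozen = false;
	return -1;
}

/* A hotplug (PHY) event only reaches EH if the port is not frozen. */
static bool toy_hotplug(struct toy_port *p, bool device_ok,
			int max_tries, bool thaw_on_failure)
{
	if (p->frozen)
		return false;            /* event lost: the IRQ is masked */

	p->device_ok = device_ok;
	return toy_eh_reset(p, max_tries, thaw_on_failure) == 0;
}
```

With thaw_on_failure set to false, one failed hotplug leaves the toy port
frozen and a later, healthy hotplug is never seen. With it set to true,
the second hotplug reaches EH again, at the cost of possibly repeated EH
runs on a port with a flaky PHY.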