Re: Error handling in LibATA

From: John Treubig
Date: Thu Jan 12 2006 - 14:48:05 EST


Albert,

I started trying your patch and noticed that the code differs from my copy of 2.6.15 rc5, so I downloaded 2.6.15 (release) and still see that the base copy of libata-core.c is different. A good indicator is that ata_qc_complete() requires 2 parameters in the code from 2.6.15. Can you tell me where I can find the copy you're working from, and whether I need any other files to support it?

Best wishes,
John Treubig
VT Miltope


From: "John Treubig" <jtreubig@xxxxxxxxxxx>
To: albertcc@xxxxxxxxxx
CC: linux-ide@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, linux-scsi@xxxxxxxxxxxxxxx, alan@xxxxxxxxxxxxxxxxxxx, dougg@xxxxxxxxxx, dwm@xxxxxxxxxxxxx
Subject: Re: Error handling in LibATA
Date: Wed, 11 Jan 2006 09:30:02 -0600

Thanks all for your recommendations. I'm in the process of installing your patch. Per your requests I've attached copies of lspci, dmesg and two excerpts from the messages file. The messages from libata.msg are from a drive attached to the Promise adapter and ide.msg are from a drive attached to the motherboard secondary IDE. In both cases power was removed during testing.

My system is built with 2.6.15 rc5 and I will let you know my results.

Thanks again,
John


From: Albert Lee <albertcc@xxxxxxxxxx>
To: John Treubig <jtreubig@xxxxxxxxxxx>
CC: linux-ide@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, linux-scsi@xxxxxxxxxxxxxxx, Alan Cox <alan@xxxxxxxxxxxxxxxxxxx>, Douglas Gilbert <dougg@xxxxxxxxxx>, Doug Maxey <dwm@xxxxxxxxxxxxx>
Subject: Re: Error handling in LibATA
Date: Wed, 11 Jan 2006 10:07:35 +0800

John Treubig wrote:
> I've been working on a problem with Promise 20269 PATA adapter under
> LibATA that if the drive has a write error or time-out, the application
> that is accessing the drive using SG should see some sort of error. My
> first problem was my system hung. After patching the IDE-IO.C, with a
> recognized patch, I have been able to keep my system from hanging. Now
> the only problem is the application gets no notification that the drive
> has been rendered inaccessible. (Test case is to run a system with my
> app going, and then pull the power from the drive. System log shows the
> errors, but nothing gets back to the app). The app does get
> notifications if I perform the same type of test on a drive attached to
> the motherboard secondary IDE adapter, so we know the app is correctly
> implemented.

As Alan commented, it is not clear whether you are using IDE or libata.
Could you send the boot dmesg?

>
> I've traced the errors down to the fact that the errors are caught in
> libata-core.c (ata_qc_timeout). I'd like to put a call in libata-core.c
> that would cause an error to be reflected back to the application. Can
> you suggest the function or method that would do this?
>

If you are using libata, maybe the following patch can help.
It checks more bits of drv_stat, so status values like 0x00 are returned as errors.

Albert

========

--- linux/drivers/scsi/libata-core.c 2006-01-11 09:47:25.000000000 +0800
+++ errmask/drivers/scsi/libata-core.c 2006-01-11 09:51:09.000000000 +0800
@@ -3418,8 +3418,14 @@ static void ata_qc_timeout(struct ata_qu
 		printk(KERN_ERR "ata%u: command 0x%x timeout, stat 0x%x host_stat 0x%x\n",
 		       ap->id, qc->tf.command, drv_stat, host_stat);

+		/* If drv_stat looks ok (0x50 normally), we treat this
+		 * as lost interrupt and complete the qc as normal.
+		 * If drv_stat looks bad (0x00, 0xff, etc), err_mask is set.
+		 */
+		if (!ata_ok(drv_stat))
+			qc->err_mask |= __ac_err_mask(drv_stat);
+
 		/* complete taskfile transaction */
-		qc->err_mask |= ac_err_mask(drv_stat);
 		ata_qc_complete(qc);
 		break;
 	}
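
For reference, here is a small user-space sketch of why the patch above changes the result for a status of 0x00. It is only a model, not the kernel code: the status bit values are the standard ATA ones, but the AC_ERR_* values are placeholders, ac_err_mask_strict() is a hypothetical stand-in for __ac_err_mask(), and the helper bodies are an assumption about how ata_ok(), ac_err_mask() and __ac_err_mask() behave. Please check include/linux/libata.h in your own tree before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* ATA status register bits (standard ATA values). */
#define ATA_BUSY 0x80
#define ATA_DRDY 0x40
#define ATA_DF   0x20
#define ATA_DRQ  0x08
#define ATA_ERR  0x01

/* Placeholder error-mask flags; the real kernel values may differ. */
#define AC_ERR_ATA_BUS (1u << 0)
#define AC_ERR_DEV     (1u << 1)
#define AC_ERR_OTHER   (1u << 2)

/* Assumed model of ata_ok(): of the checked bits, only DRDY may be set. */
static unsigned int ata_ok(uint8_t status)
{
	return (status & (ATA_BUSY | ATA_DRDY | ATA_DF | ATA_DRQ | ATA_ERR))
		== ATA_DRDY;
}

/* Assumed model of ac_err_mask(): looks only at BUSY, ERR and DRQ. */
static unsigned int ac_err_mask(uint8_t status)
{
	if (status & ATA_BUSY)
		return AC_ERR_ATA_BUS;
	if (status & (ATA_ERR | ATA_DRQ))
		return AC_ERR_DEV;
	return 0;
}

/* Hypothetical stand-in for __ac_err_mask(): never returns 0. */
static unsigned int ac_err_mask_strict(uint8_t status)
{
	unsigned int mask = ac_err_mask(status);

	return mask ? mask : AC_ERR_OTHER;
}

/* Timeout path before the patch: err_mask taken from ac_err_mask() only. */
static unsigned int timeout_err_before(uint8_t drv_stat)
{
	return ac_err_mask(drv_stat);
}

/* Timeout path after the patch: any status that is not "ok" is an error. */
static unsigned int timeout_err_after(uint8_t drv_stat)
{
	return ata_ok(drv_stat) ? 0 : ac_err_mask_strict(drv_stat);
}

/* Small self-check of the two cases named in the patch comment. */
static void timeout_model_self_check(void)
{
	assert(timeout_err_after(0x50) == 0); /* looks ok: lost interrupt */
	assert(timeout_err_after(0x00) != 0); /* looks bad: error is set */
}
```

In this model, a drv_stat of 0x00 (as might be read from a drive that has lost power) yields an empty err_mask before the patch and a non-zero one after it, while 0x50 completes without error in both cases.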



<< dmesg.lst >>


<< pci.lst >>


<< ide.msg >>


<< libata.msg >>

