Re: sata_svw data corruption, strange problems

From: Pavel Machek
Date: Mon Jun 23 2008 - 04:20:20 EST


Hi!

> > I see strange problems on a machine with sata_svw. The machine seems to
> > corrupt data every few days (ext3 error, dir index corrupted), and has
> > some other very strange problems (the keyboard misbehaves, pulling out
> > the SATA disk cures it, see
> > https://bugzilla.novell.com/show_bug.cgi?id=400772 ).
> >
> > Then I got to the comment
> >
> > writeb(dmactl | ATA_DMA_START, mmio + ATA_DMA_CMD);
> > /* There is a race condition in certain SATA controllers
> > that can be seen when the r/w command is given to the controller
> > before the host DMA is started. On a Read command, the controller
...
> > ...and that would certainly explain what we are seeing. Are
> > serverworks controllers broken by design?
>
> The comment looks like a warning to me as the DMA engine is started
> before the command is issued to the drive as explained in the next
> comment.

Ok, what about this?

---

Clarify data corruption comment.

Signed-off-by: Pavel Machek <pavel@xxxxxxx>

---
commit a362f8903eb0cdbc2ea06e0e249c97f1a64c7e1e
tree 1bfcbf9ad1b55811b71cdeb1868a41cf6b058c5d
parent 91e95912b1b48a279d0231b5c21b82388ade249e
author Pavel <pavel@xxxxxxxxxx> Mon, 23 Jun 2008 10:12:47 +0200
committer Pavel <pavel@xxxxxxxxxx> Mon, 23 Jun 2008 10:12:47 +0200

drivers/ata/sata_svw.c | 38 +++++++++++++++++++++++---------------
1 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/ata/sata_svw.c b/drivers/ata/sata_svw.c
index 16aa683..fb13b82 100644
--- a/drivers/ata/sata_svw.c
+++ b/drivers/ata/sata_svw.c
@@ -253,21 +253,29 @@ static void k2_bmdma_start_mmio(struct a
/* start host DMA transaction */
dmactl = readb(mmio + ATA_DMA_CMD);
writeb(dmactl | ATA_DMA_START, mmio + ATA_DMA_CMD);
- /* There is a race condition in certain SATA controllers that can
- be seen when the r/w command is given to the controller before the
- host DMA is started. On a Read command, the controller would initiate
- the command to the drive even before it sees the DMA start. When there
- are very fast drives connected to the controller, or when the data request
- hits in the drive cache, there is the possibility that the drive returns a part
- or all of the requested data to the controller before the DMA start is issued.
- In this case, the controller would become confused as to what to do with the data.
- In the worst case when all the data is returned back to the controller, the
- controller could hang. In other cases it could return partial data returning
- in data corruption. This problem has been seen in PPC systems and can also appear
- on an system with very fast disks, where the SATA controller is sitting behind a
- number of bridges, and hence there is significant latency between the r/w command
- and the start command. */
- /* issue r/w command if the access is to ATA*/
+ /* This works around possible data corruption.
+
+ On certain SATA controllers, corruption can be seen when the
+ r/w command is given to the controller before the host DMA
+ is started.
+
+ On a Read command, the controller would initiate the
+ command to the drive even before it sees the DMA
+ start. When there are very fast drives connected to the
+ controller, or when the data request hits in the drive
+ cache, there is the possibility that the drive returns a
+ part or all of the requested data to the controller before
+ the DMA start is issued. In this case, the controller
+ would become confused as to what to do with the data. In
+ the worst case when all the data is returned back to the
+ controller, the controller could hang. In other cases it
+ could return partial data, resulting in data
+ corruption. This problem has been seen in PPC systems and
+ can also appear on a system with very fast disks, where
+ the SATA controller is sitting behind a number of bridges,
+ and hence there is significant latency between the r/w
+ command and the start command. */
+ /* issue r/w command if the access is to ATA */
if (qc->tf.protocol == ATA_PROT_DMA)
ap->ops->sff_exec_command(ap, &qc->tf);
}
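
For anyone following the thread, the ordering argument in the comment
can be sketched as a toy model. This is only an illustration under
stated assumptions, not driver code: struct sim_ctl, sim_start_dma(),
sim_issue_command() and sim_run() are hypothetical names, and the model
assumes that a "fast" drive hands data back the moment the command is
issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; this is NOT the sata_svw register layout. */
enum sim_result { SIM_OK, SIM_DATA_LOST };

struct sim_ctl {
	bool dma_started;	/* stands in for the ATA_DMA_START bit */
	bool cmd_issued;	/* r/w command has been sent to the drive */
	bool data_lost;		/* data arrived while the DMA engine was idle */
};

/* Model of setting the DMA start bit. */
static void sim_start_dma(struct sim_ctl *ctl)
{
	assert(ctl != NULL);
	ctl->dma_started = true;
}

/*
 * Model of issuing the r/w command. Assumption: a fast drive (or a
 * cache hit) returns data immediately, before anything else can run.
 */
static void sim_issue_command(struct sim_ctl *ctl, bool fast_drive)
{
	assert(ctl != NULL);
	ctl->cmd_issued = true;
	if (fast_drive && !ctl->dma_started)
		ctl->data_lost = true;
}

/* Run one transfer with the two steps in the chosen order. */
static enum sim_result sim_run(bool dma_first, bool fast_drive)
{
	struct sim_ctl ctl = { false, false, false };

	if (dma_first) {
		sim_start_dma(&ctl);
		sim_issue_command(&ctl, fast_drive);
	} else {
		sim_issue_command(&ctl, fast_drive);
		sim_start_dma(&ctl);
	}
	return ctl.data_lost ? SIM_DATA_LOST : SIM_OK;
}
```

In this model only the combination "command first, fast drive" loses
data, which matches the reading that starting the DMA engine before
issuing the command is the workaround rather than the bug.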


--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html