Re: [SCSI][REGRESSION][BISECTED] Disk errors loop forever in 2.6.29

From: Ingo Molnar
Date: Thu Feb 19 2009 - 09:59:11 EST



* Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:

> Hi,
>
> There appears to be a regression from 2.6.28 in how disk errors are
> handled in 2.6.29rc5 - rather than trying and eventually giving up, it
> appears to try (and report) forever.
>
> Here is the output where it aborts in 2.6.28:
>
> ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
> ata2.01: BMDMA stat 0x65
> ata2.01: cmd c8/00:00:f7:e2:9c/00:00:00:00:00/f1 tag 0 dma 131072 in
> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
> ata2: soft resetting link
> ata2.00: configured for UDMA/66
> ata2.01: configured for UDMA/66
> ata2: EH complete
> ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
> ata2.01: BMDMA stat 0x65
> ata2.01: cmd c8/00:00:f7:e2:9c/00:00:00:00:00/f1 tag 0 dma 131072 in
> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
> ata2: soft resetting link
> ata2.00: configured for UDMA/66
> ata2.01: configured for UDMA/66
> ata2: EH complete
> sd 1:0:0:0: [sda] 7880544 512-byte hardware sectors: (4.03 GB/3.75 GiB)
> ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
> ata2.01: BMDMA stat 0x65
> ata2.01: cmd c8/00:00:f7:e2:9c/00:00:00:00:00/f1 tag 0 dma 131072 in
> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
> ata2: soft resetting link
> ata2.00: configured for UDMA/66
> ata2.01: configured for UDMA/66
> ata2: EH complete
> sd 1:0:0:0: [sda] Write Protect is off
> sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
> ata2.01: limiting speed to UDMA/44:PIO4
> ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
> ata2.01: BMDMA stat 0x65
> ata2.01: cmd c8/00:00:f7:e2:9c/00:00:00:00:00/f1 tag 0 dma 131072 in
> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
> ata2: soft resetting link
> ata2.00: configured for UDMA/66
> ata2.01: configured for UDMA/44
> ata2: EH complete
> ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
> ata2.01: BMDMA stat 0x65
> ata2.01: cmd c8/00:00:f7:e2:9c/00:00:00:00:00/f1 tag 0 dma 131072 in
> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
> ata2: soft resetting link
> ata2.00: configured for UDMA/66
> ata2.01: configured for UDMA/44
> ata2: EH complete
> sd 1:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
> sda: detected capacity change from 0 to 4034838528
> ata2.01: limiting speed to UDMA/33:PIO4
> ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
> ata2.01: BMDMA stat 0x65
> ata2.01: cmd c8/00:00:f7:e2:9c/00:00:00:00:00/f1 tag 0 dma 131072 in
> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
> ata2: soft resetting link
> ata2.00: configured for UDMA/66
> ata2.01: configured for UDMA/33
> sd 1:0:1:0: [sdb] Result: hostbyte=0x00 driverbyte=0x08
> sd 1:0:1:0: [sdb] Sense Key : 0xb [current] [descriptor]
> Descriptor sense data with sense descriptors (in hex):
> 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
> 00 00 00 00
> sd 1:0:1:0: [sdb] ASC=0x0 ASCQ=0x0
> end_request: I/O error, dev sdb, sector 27058935
> Buffer I/O error on device sdb2, logical block 444480
> Buffer I/O error on device sdb2, logical block 444481
> Buffer I/O error on device sdb2, logical block 444482
> Buffer I/O error on device sdb2, logical block 444483
> Buffer I/O error on device sdb2, logical block 444484
> Buffer I/O error on device sdb2, logical block 444485
> Buffer I/O error on device sdb2, logical block 444486
> Buffer I/O error on device sdb2, logical block 444487
> Buffer I/O error on device sdb2, logical block 444488
> Buffer I/O error on device sdb2, logical block 444489
> ata2: EH complete
>
> It never gets to end_request on 2.6.29. I've bisected the problem down
> to the following:
>
> [b60af5b0adf0da24c673598c8d3fb4d4189a15ce] [SCSI] simplify scsi_io_completion()
>
> Author: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> Date: Mon Nov 3 15:56:47 2008 -0500
>
> [SCSI] simplify scsi_io_completion()

I had SCSI problems in that area of the code too, and the patch
below fixed them. Maybe it fixes your problem as well.

Ingo

----------------------->