Re: [git patches] libata updates for 2.6.34

From: Zeno Davatz
Date: Mon Mar 15 2010 - 09:30:45 EST


On Mon, Mar 15, 2010 at 2:21 PM, Zeno Davatz <zdavatz@xxxxxxxxx> wrote:
> On Mon, Mar 15, 2010 at 2:06 PM, Jeff Garzik <jeff@xxxxxxxxxx> wrote:
>> On 03/15/2010 03:33 AM, Zeno Davatz wrote:
>>>
>>> On Mon, Mar 15, 2010 at 3:55 AM, Jeff Garzik <jeff@xxxxxxxxxx> wrote:
>>>>
>>>> On 03/09/2010 11:26 PM, Tejun Heo wrote:
>>>>>
>>>>> Hello, Linus, Jeff.
>>>>>
>>>>> On 03/10/2010 07:12 AM, Jeff Garzik wrote:
>>>>>>
>>>>>> Coincidentally, it looks like someone else just reported the same
>>>>>> problem, with 2.6.34-rc1.
>>>>>>
>>>>>> It definitely sounds like a race.  READ DMA is a DMA command as the
>>>>>> name implies, so that eliminates the possibility of polling-related
>>>>>> paths in ata_sff_interrupt (libata-sff.c).
>>>>>>
>>>>>> I'll flip some of my machines to the icky slow boring piix mode, rather
>>>>>> than sexy AHCI mode :) to see if I can reproduce.  I have had a feeling
>>>>>> that we needed a more sophisticated IRQ handling setup, this may be
>>>>>> what was needed.  Lost interrupt recovery should occur faster than 30
>>>>>> seconds in any case, and should not require a hard reset if the
>>>>>> hardware functions just fine outside of the lost-interrupt / race that
>>>>>> just occurred.
>>>>>
>>>>> Yeap, there is a race condition with clearing which I don't think we
>>>>> can solve completely but with some modification I think we can at
>>>>> least cover known failure cases.
>>>>>
>>>>> For longer term, I don't think we can solve this by diddling with the
>>>>> SFF registers.  The interface is just way too ancient and horrid to
>>>>> build anything reliable on top of.  I'm planning on implementing
>>>>> smarter IRQ storm handling and stepped timeouts for ATA commands.
>>>>
>>>> A tester on this bug
>>>>        http://bugzilla.kernel.org/show_bug.cgi?id=15537
>>>> seemed to find success with the patch.
>>>
>>> Thanks for the update!
>>>
>>> I will wait some more and then test rc2.
>>
>> Can you test the patch, please?
>
> Sure. I did:
>
> /usr/src/linux> sudo patch -p1 < teo
> patching file drivers/ata/libata-sff.c
> Hunk #1 FAILED at 1667.
> Hunk #2 FAILED at 1700.
> Hunk #3 FAILED at 1718.
> Hunk #4 FAILED at 1770.
> Hunk #5 FAILED at 1792.
> Hunk #6 FAILED at 1801.
> Hunk #7 FAILED at 1818.
> 7 out of 7 hunks FAILED -- saving rejects to file drivers/ata/libata-sff.c.rej
>
> Is that the expected outcome? Can I ignore the "failed" output?
>
> Now I guess I still need to do "make", lilo and reboot.
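
On the question quoted above: a hunk reported as FAILED is not applied;
patch writes it to the .rej file and leaves that part of the source as it
was. A minimal sketch, assuming the summary line format that GNU patch
printed above, of checking that output before running make (failed_hunks
and patch_applied_cleanly are hypothetical helper names, not part of any
tool):

```python
import re

# Summary line GNU patch prints for a file with rejected hunks, e.g.
# "7 out of 7 hunks FAILED -- saving rejects to file foo.rej".
# Assumption: this is the format seen in the output quoted above.
SUMMARY_RE = re.compile(r"(\d+) out of (\d+) hunks? FAILED")


def failed_hunks(patch_output):
    """Return (failed, attempted) summed over files that reported failures."""
    failed = attempted = 0
    for line in patch_output.splitlines():
        match = SUMMARY_RE.search(line)
        if match:
            failed += int(match.group(1))
            attempted += int(match.group(2))
    return failed, attempted


def patch_applied_cleanly(patch_output):
    """True only if no summary line reports a failed hunk."""
    failed, _ = failed_hunks(patch_output)
    return failed == 0


# Output copied from the patch run quoted above.
sample = """patching file drivers/ata/libata-sff.c
Hunk #1 FAILED at 1667.
Hunk #2 FAILED at 1700.
Hunk #3 FAILED at 1718.
Hunk #4 FAILED at 1770.
Hunk #5 FAILED at 1792.
Hunk #6 FAILED at 1801.
Hunk #7 FAILED at 1818.
7 out of 7 hunks FAILED -- saving rejects to file drivers/ata/libata-sff.c.rej
"""

if __name__ == "__main__":
    failed, attempted = failed_hunks(sample)
    print(f"{failed} of {attempted} hunks failed")
    print("clean" if patch_applied_cleanly(sample) else "not applied cleanly")
```

On the output quoted above this counts 7 failed hunks out of 7, so
libata-sff.c was left unpatched by that run.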

Done all of the above, now getting the following from dmesg right after
the reboot:

Mar 15 04:22:12 zenogentoo Probing IDE interface ide1...
Mar 15 04:22:12 zenogentoo ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Mar 15 04:22:12 zenogentoo ide1 at 0x170-0x177,0x376 on irq 15
Mar 15 04:22:12 zenogentoo ide_generic: please use "probe_mask=0x3f" module parameter for probing all legacy ISA IDE ports
Mar 15 04:22:12 zenogentoo ide-gd driver 1.18
Mar 15 04:22:12 zenogentoo ide-cd driver 5.00
Mar 15 04:22:12 zenogentoo ide-cd: hda: ATAPI 48X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache
Mar 15 04:22:12 zenogentoo Uniform CD-ROM driver Revision: 3.20
Mar 15 04:22:12 zenogentoo ata_piix 0000:00:1f.2: version 2.13
Mar 15 04:22:12 zenogentoo ata_piix 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
Mar 15 04:22:12 zenogentoo ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
Mar 15 04:22:12 zenogentoo ata_piix 0000:00:1f.2: setting latency timer to 64
Mar 15 04:22:12 zenogentoo scsi0 : ata_piix
Mar 15 04:22:12 zenogentoo scsi1 : ata_piix
Mar 15 04:22:12 zenogentoo ata1: SATA max UDMA/133 cmd 0xd080 ctl 0xd000 bmdma 0xc800 irq 19
Mar 15 04:22:12 zenogentoo ata2: SATA max UDMA/133 cmd 0xcc00 ctl 0xc880 bmdma 0xc808 irq 19
Mar 15 04:22:12 zenogentoo Intel(R) PRO/1000 Network Driver - version 7.3.21-k5-NAPI
Mar 15 04:22:12 zenogentoo Copyright (c) 1999-2006 Intel Corporation.
Mar 15 04:22:12 zenogentoo e1000 0000:02:06.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
Mar 15 04:22:12 zenogentoo ata1.00: ATA-7: INTEL SSDSA2MH160G1GC, 045C8820, max UDMA/133
Mar 15 04:22:12 zenogentoo ata1.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 0/31)
Mar 15 04:22:12 zenogentoo ata1.00: configured for UDMA/133
Mar 15 04:22:12 zenogentoo scsi 0:0:0:0: Direct-Access ATA INTEL SSDSA2MH16 045C PQ: 0 ANSI: 5
Mar 15 04:22:12 zenogentoo sd 0:0:0:0: Attached scsi generic sg0 type 0
Mar 15 04:22:12 zenogentoo sd 0:0:0:0: [sda] 312581808 512-byte logical blocks: (160 GB/149 GiB)
Mar 15 04:22:12 zenogentoo sd 0:0:0:0: [sda] Write Protect is off
Mar 15 04:22:12 zenogentoo sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Mar 15 04:22:12 zenogentoo sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 15 04:22:12 zenogentoo e1000: 0000:02:06.0: e1000_probe: (PCI:33MHz:32-bit) 00:13:d3:b4:3c:6c
Mar 15 04:22:12 zenogentoo sda: sda1 sda2 sda3
Mar 15 04:22:12 zenogentoo sd 0:0:0:0: [sda] Attached SCSI disk

Let me know if I can be of any further help.

Linux zenogentoo 2.6.34-rc1 #80 SMP Mon Mar 15 14:22:23 CET 2010 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz GenuineIntel GNU/Linux

If I get another hang I will let you know.

Best
Zeno