Intel PCI-E bridge ACPI resources and possibly related SATA unstabilityproblems on ASUS A8Js

From: Martin Drab
Date: Wed Jan 24 2007 - 22:37:55 EST


Hi,

recently I got my hands on an ASUS A8Js notebook (Core 2 Duo T7200,
Intel 945 PM PCI-E Chipset, for details see attached log). After booting
the latest 2.6.20-rc5-git3 kernel (but the same behaviour occurs also with
the 2.6.19.2, didn't try any other), some strange behaviour can be
observed.

At first the following messages appear in the log:

...
[ 40.303154] PCI: BIOS Bug: MCFG area at e0000000 is not E820-reserved
[ 40.303157] PCI: Not using MMCONFIG.

(not sure whether this is really a problem)

...
[ 40.400899] PCI: Using ACPI for IRQ routing
[ 40.400902] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
[ 40.400911] PCI: Cannot allocate resource region 7 of bridge 0000:00:1c.3
[ 40.400914] PCI: Cannot allocate resource region 8 of bridge 0000:00:1c.3
[ 40.400917] PCI: Cannot allocate resource region 9 of bridge 0000:00:1c.3

and if I have the Kouwell 5652E SATA-II ExpressCard (with Silicon Image
SiL 3132 controller) connected to the notebook's ExpressCard slot, also
the following messages appear after the above ones:

[ 40.400999] PCI: Cannot allocate resource region 0 of device 0000:04:00.0
[ 40.401003] PCI: Cannot allocate resource region 2 of device 0000:04:00.0
[ 40.401006] PCI: Cannot allocate resource region 4 of device 0000:04:00.0
...

where

00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 (rev 02)
04:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)

Both SATA controllers (the on-board using ata_piix and the ExpressCard
using sata_sil24) are the detected

...
[ 40.639495] ata_piix 0000:00:1f.2: version 2.00ac7
[ 40.639500] ata_piix 0000:00:1f.2: MAP [ P0 P2 IDE IDE ]
[ 40.640063] ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 19
[ 40.640763] PCI: Setting latency timer of device 0000:00:1f.2 to 64
[ 40.640814] ata1: SATA max UDMA/133 cmd 0x1F0 ctl 0x3F6 bmdma 0xFFA0 irq 14
[ 40.641467] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xFFA8 irq 15
[ 40.642114] scsi0 : ata_piix
[ 40.853499] ata1.00: ATA-7, max UDMA/100, 234441648 sectors: LBA48 NCQ (depth 0/32)
[ 40.854188] ata1.00: ata1: dev 0 multi count 16
[ 40.860490] ata1.00: configured for UDMA/100
[ 40.860881] scsi1 : ata_piix
[ 41.025049] ATA: abnormal status 0x7F on port 0x177
[ 41.045585] ATA: abnormal status 0x7F on port 0x177
[ 41.231326] ata2.01: ATAPI, max UDMA/33
[ 41.417213] ata2.01: configured for UDMA/33
[ 41.426449] scsi 0:0:0:0: Direct-Access ATA FUJITSU MHV2120B 0000 PQ: 0 ANSI: 5
[ 41.436554] SCSI device sda: 234441648 512-byte hdwr sectors (120034 MB)
[ 41.447026] sda: Write Protect is off
[ 41.457669] sda: Mode Sense: 00 3a 00 00
[ 41.457693] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 41.469438] SCSI device sda: 234441648 512-byte hdwr sectors (120034 MB)
[ 41.481481] sda: Write Protect is off
[ 41.493694] sda: Mode Sense: 00 3a 00 00
[ 41.493708] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 41.507017] sda: sda1 sda2 sda3 sda4
[ 41.604667] sd 0:0:0:0: Attached scsi disk sda
[ 41.622803] scsi 1:0:1:0: CD-ROM TSSTcorp CD/DVDW TS-L632D AS05 PQ: 0 ANSI: 5
...
[ 48.817403] sata_sil24 0000:04:00.0: version 0.3
[ 48.817432] ACPI: PCI Interrupt 0000:04:00.0[A] -> GSI 19 (level, low) -> IRQ 19
[ 48.817475] PCI: Setting latency timer of device 0000:04:00.0 to 64
[ 48.817531] ata3: SATA max UDMA/100 cmd 0xFFFFC20000040000 ctl 0x0 bmdma 0x0 irq 19
[ 48.817585] ata4: SATA max UDMA/100 cmd 0xFFFFC20000042000 ctl 0x0 bmdma 0x0 irq 19
[ 48.817593] scsi2 : sata_sil24
[ 49.068344] Bluetooth: HCI USB driver ver 2.9
[ 49.142279] usbcore: registered new interface driver hci_usb
[ 49.222098] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 49.222610] ata3.00: ATA-7, max UDMA/133, 156312576 sectors: LBA
[ 49.222612] ata3.00: ata3: dev 0 multi count 0
[ 49.223246] ata3.00: configured for UDMA/100
[ 49.223260] scsi3 : sata_sil24
[ 49.525900] ata4: SATA link down (SStatus 0 SControl 300)
[ 49.526898] scsi 2:0:0:0: Direct-Access ATA Maxtor 6Y080M0 YAR5 PQ: 0 ANSI: 5
[ 49.527093] SCSI device sdb: 156312576 512-byte hdwr sectors (80032 MB)
[ 49.527103] sdb: Write Protect is off
[ 49.527105] sdb: Mode Sense: 00 3a 00 00
[ 49.527269] SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 49.527309] SCSI device sdb: 156312576 512-byte hdwr sectors (80032 MB)
[ 49.527317] sdb: Write Protect is off
[ 49.527319] sdb: Mode Sense: 00 3a 00 00
[ 49.527337] SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 49.527341] sdb:<5>sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 49.536071] scsi 1:0:1:0: Attached scsi generic sg1 type 5
[ 49.552648] sdb1
[ 49.552700] sd 2:0:0:0: Attached scsi disk sdb
[ 49.552733] sd 2:0:0:0: Attached scsi generic sg2 type 0
...

However, strange messages simillar to the following appear periodically
(possibly a result of the system accessing the disk, but not always, when
accessed) for both controllers

...
[ 193.907771] ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[ 193.927526] ata2.01: cmd a0/01:00:00:00:00/00:00:00:00:00/b0 tag 0 cdb 0x25 data 8 in
[ 193.927529] res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
[ 201.007291] ata2: port is slow to respond, please be patient (Status 0xd0)
[ 223.974196] ata2: port failed to respond (30 secs, Status 0xd0)
[ 224.000456] ata2: soft resetting port
[ 224.163274] ATA: abnormal status 0x7F on port 0x177
[ 224.568225] ata2.01: configured for UDMA/33
[ 224.568237] ata2: EH complete
[ 231.828794] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
[ 231.849673] ata3.00: (irq_stat 0x00020002, device error via D2H FIS)
[ 231.871059] ata3.00: cmd c8/00:00:08:00:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 131072 in
[ 231.871061] res 51/84:00:08:00:00/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
[ 232.217493] ata3: soft resetting port
[ 232.318609] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 232.319782] ata3.00: configured for UDMA/100
[ 232.319792] ata3: EH complete
[ 232.340315] SCSI device sdb: 156312576 512-byte hdwr sectors (80032 MB)
[ 232.363118] sdb: Write Protect is off
[ 232.385795] sdb: Mode Sense: 00 3a 00 00
[ 232.387452] SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 232.470053] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
[ 232.492477] ata3.00: (irq_stat 0x00020002, device error via D2H FIS)
[ 232.515254] ata3.00: cmd c8/00:00:f8:16:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 131072 in
[ 232.515256] res 51/84:00:f8:16:00/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
[ 232.864801] ata3: soft resetting port
[ 232.965875] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 232.967043] ata3.00: configured for UDMA/100
[ 232.967055] ata3: EH complete
[ 232.988447] SCSI device sdb: 156312576 512-byte hdwr sectors (80032 MB)
[ 233.012277] sdb: Write Protect is off
[ 233.035602] sdb: Mode Sense: 00 3a 00 00
[ 233.037536] SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
...

Until after some seconds of accessing the disk on the external controller,
the whole disk is shut down:

...
[ 269.026999] ata3.00: limiting speed to UDMA/33
[ 269.058614] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 269.090354] ata3.00: (irq_stat 0x00020002, failed to transmit command FIS)
[ 269.122048] ata3.00: cmd c8/00:08:4f:00:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 in
[ 269.122051] res c8/00:08:4f:00:00/00:00:00:00:00/e0 Emask 0x12 (ATA bus error)
[ 269.184814] ata3: hard resetting port
[ 271.414531] ata3: softreset failed (port not ready)
[ 271.445584] ata3: follow-up softreset failed, retrying in 5 secs
[ 276.474431] ata3: hard resetting port
[ 278.805006] ata3: softreset failed (port not ready)
[ 278.835656] ata3: follow-up softreset failed, retrying in 5 secs
[ 283.864906] ata3: hard resetting port
[ 286.094543] ata3: softreset failed (port not ready)
[ 286.124723] ata3: reset failed, giving up
[ 286.154334] ata3.00: disabled
[ 286.183738] ata3: EH pending after completion, repeating EH (cnt=4)
[ 286.294424] ata3: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0x6 frozen
[ 286.324117] ata3: (irq_stat 0x00060002, failed to transmit command FIS)
[ 286.874068] ata3: waiting for device to spin up (8 secs)
[ 294.289524] ata3: hard resetting port
[ 296.519160] ata3: softreset failed (port not ready)
[ 296.519741] ata3: follow-up softreset failed, retrying in 5 secs
[ 301.518100] ata3: hard resetting port
[ 303.748733] ata3: softreset failed (port not ready)
[ 303.748738] ata3: follow-up softreset failed, retrying in 5 secs
[ 308.746700] ata3: hard resetting port
[ 311.077274] ata3: softreset failed (port not ready)
[ 311.077277] ata3: reset failed, giving up
[ 311.077284] sd 2:0:0:0: SCSI error: return code = 0x08000002
[ 311.077286] sdb: Current [descriptor]: sense key: Aborted Command
[ 311.077289] Additional sense: Scsi parity error
[ 311.077294] Descriptor sense data with sense descriptors (in hex):
[ 311.077296] 72 0b 47 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 311.077302] 00 00 00 4f
[ 311.077304] end_request: I/O error, dev sdb, sector 79
[ 311.077314] ata3: EH complete
[ 311.077320] ata3.00: detaching (SCSI 2:0:0:0)
[ 311.077323] EXT3-fs: can't read group descriptor 1
[ 311.174529] Synchronizing SCSI cache for disk sdb:
[ 311.196553] FAILED
[ 311.196555] status = 0, message = 00, host = 4, driver = 00
[ 311.196556] <3>ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
...

I didn't manage to shut down the disk on the primary ata_piix controller
that easily, but I guess when I'd try long and hard enough, it may suffer
the same fate (or maybe not?).

All of this is clearly reproducible. I do not know whether this behaviour
is somehow related to the ACPI detection problems stated at the beginning
of the mail, or whether these are two different bugs. That's why I haven't
reported it to bugzilla yet, because I do not know which section this or
these (?) should belong to. But I will do so, if anyone of you shall
kindly give me some clue for this. Also if you need any more info or
testing, just ask.

The full system logs (and configuration) are attached.

Martin

Attachment: system-logs.tar.bz2
Description: full kernel logs