Re: MSI broken in libata?

From: Torsten Kaiser
Date: Sat Jan 16 2010 - 16:58:48 EST


On Mon, Jan 11, 2010 at 2:39 AM, Robert Hancock <hancockrwd@xxxxxxxxx> wrote:
> On 01/10/2010 07:15 PM, Tejun Heo wrote:
>>
>> On 01/10/2010 01:33 PM, Torsten Kaiser wrote:
>>>
>>> I did try the patch from Robert Hancock in
>>> http://lkml.org/lkml/2010/1/6/417, but without success.
>>>
>>> If you need any more information, or have something for me to try,
>>> please just ask. I did look at the code and the documentation about
>>> enabling MSI, but did not see anything (obvious) wrong, so I don't
>>> know what to try next.
>>
>> Can you please try the attached patch?
>>
>> Thanks.
>>
>
> It'd be interesting to see if it makes a difference, but I don't think the
> patch is quite right.

As written in the other mail: No, Tejun's patch also didn't work.

> According to the datasheet, doing the MSI ack while
> the interrupt source is still pending will cause a new MSI to be sent, so if
> you do it before handling the interrupt you'll generate a spurious interrupt
> after every real one.
>
> Though, apparently my patch that did the MSI ack after the handling didn't
> help, so either that's wrong or the problem is unrelated. (I tend to suspect
> the latter, given that sata_nv is also failing in the same way.)

Reading http://www.siliconimage.com/docs/SiI-DS-0138-D.pdf, one possible
explanation is that this MSI ACK was never needed. Page 63 of
this PDF says about 'Global Control': "If all interrupt conditions are
removed subsequent to an MSI, it is not necessary to assert this
Acknowledge; another MSI will be generated when an interrupt condition
occurs."

But I did not find anything that might explain my problem.

Looking at my lspci output I noted the following:
For the PCIe bridges:
    Capabilities: [80] Express (v1) Root Port (Slot+), MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                ExtTag- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                MaxPayload 128 bytes, MaxReadReq 512 bytes
For the tg3 onboard network chips:
    Capabilities: [d0] Express (v1) Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                MaxPayload 128 bytes, MaxReadReq 4096 bytes
For the SiI chip:
    Capabilities: [70] Express (v1) Legacy Endpoint, MSI 00
        DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                MaxPayload 128 bytes, MaxReadReq 4096 bytes

So the maximum payload the SiI chip supports is bigger than that of the
nVidia bridge. As I don't have knowledge of the PCI specs, I guess DevCap
is what a device is physically capable of and DevCtl is the value that
the BIOS / kernel has programmed into it for actual use.
If my guess is correct, then the SiI should be correctly limited to
128 bytes payload and it should work.
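
For reference, a minimal Python sketch of how these payload fields are
usually decoded. It assumes the standard PCIe layout (supported size in
DevCap bits [2:0], programmed size in DevCtl bits [7:5], size = 128 << field).
The function names and the register values in the usage note are made up
for illustration; they were not read from this system.

```python
def decode_size(field: int) -> int:
    """Map a 3-bit PCIe size encoding to bytes (0 -> 128 ... 5 -> 4096)."""
    if not 0 <= field <= 5:
        # Encodings 110b and 111b are reserved in the PCIe spec.
        raise ValueError(f"reserved size encoding: {field:#05b}")
    return 128 << field


def devcap_max_payload(devcap: int) -> int:
    """Largest payload the device supports (DevCap bits [2:0])."""
    return decode_size(devcap & 0x7)


def devctl_max_payload(devctl: int) -> int:
    """Payload size actually programmed for use (DevCtl bits [7:5])."""
    return decode_size((devctl >> 5) & 0x7)
```

With hypothetical values, devcap_max_payload(0x3) gives 1024 and
devctl_max_payload(0x0) gives 128, which is the combination lspci
reports for the SiI chip above.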

BUT: Page 47 of the SiI-PDF says for 'Device Status and Control' the following:
Bit [14:12]: Max Read Request Size (R/W) – Allowable values are 000B
to 011B (128 to 1024 bytes).
Default is 010B (512 bytes).

So a MaxReadReq value of 4096 as indicated by lspci for my system
would be out of bounds.
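
To make the comparison concrete, here is a small Python sketch of the
check. It assumes the Max Read Request Size field sits in bits [14:12]
of the Device Control word, as the datasheet quote says, and uses the
usual 128 << field encoding. The function names and the sample register
values are hypothetical, and the 011B limit is taken only from the
datasheet text quoted above.

```python
SII_MAX_READ_REQ_ENCODING = 0b011  # datasheet: allowable values 000B to 011B


def read_req_field(devctl: int) -> int:
    """Extract the Max Read Request Size encoding (DevCtl bits [14:12])."""
    return (devctl >> 12) & 0x7


def max_read_request_bytes(devctl: int) -> int:
    """Decode the Max Read Request Size field into bytes."""
    field = read_req_field(devctl)
    if field > 5:
        # Encodings 110b and 111b are reserved in the PCIe spec.
        raise ValueError(f"reserved encoding: {field:#05b}")
    return 128 << field


def within_sii_limit(devctl: int) -> bool:
    """True if the programmed value is inside the datasheet's allowed range."""
    return read_req_field(devctl) <= SII_MAX_READ_REQ_ENCODING
```

A DevCtl value with 101B in that field (for example 0x5000) decodes to
4096 bytes and fails within_sii_limit(), while the datasheet default of
010B (0x2000) decodes to 512 bytes and passes.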

Is this important? (Somehow it seems not: in the non-MSI case it is also
4096 bytes, but the system works fine...)


Can I do anything else to help debug this?

Torsten