Re: 3.19-rc4: Xen pci-passthrough regression, bisected to commit cffe0a2b5a34c95a4dadc9ec7132690a5b0f6687 "x86, irq: Keep balance of IOAPIC pin reference count"

From: Jiang Liu
Date: Thu Jan 15 2015 - 06:20:16 EST


Hi Sander,
It took me some time to understand HVM, PVH, Dom0 and PV, and to
read the Xen interrupt-related code :( I now have a basic understanding
of the related stuff. The patch for the previous issue is actually wrong,
and I'm working on another fix for it. I will handle this issue once
I'm done with the previous one.
Sorry for the delay.
Regards!
Gerry

On 2015/1/15 0:17, Sander Eikelenboom wrote:
>
> Wednesday, January 14, 2015, 3:58:33 PM, you wrote:
>
>> On 14/01/15 14:15, Sander Eikelenboom wrote:
>>> Hi Gerry / David / Konrad,
>>>
>>> Some more testing uncovered another issue under Xen, this time with PCI-passthrough.
>
>> What device? In particular what interrupts is it using?
>
> Hi David,
>
> Here is a more complete set of debug logs, for both with and without the revert.
> - dmesg
> - xl-dmesg with output of debug keys 'i, M, z'
> - lspci part of the two devices from the guest
> - /proc/interrupts
>
> The wifi NIC (dom0: 02:00.0 guest: 00:05.0) uses legacy interrupts and is the one giving trouble:
> It's using:
> Interrupt: pin A routed to IRQ 36
> Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
> 36: 14413 xen-pirq-ioapic-level ath9k
>
> The other NIC (dom0: 00:19.0 guest: 00:06.0) uses MSI interrupts and that works fine:
> Interrupt: pin A routed to IRQ 57
> Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> 57: 182 xen-pirq-msi eth0
> --
> Sander
>
>>> I have bisected it to the following commit:
>>> cffe0a2b5a34c95a4dadc9ec7132690a5b0f6687 "x86, irq: Keep balance of IOAPIC pin reference count"
>>>
>>> It causes these symptoms:
>>>
>>> - On Intel
>>> - Running on Xen with pci devices seized on host boot with xen-pciback.hide= parameter
>>> - Running a HVM guest with PCI passthrough of two devices (NIC + wireless NIC)
>>> - While the driver loads fine, the device isn't working properly; /proc/interrupts in the guest
>>> shows that it doesn't receive any interrupts.
>>> - Reverting this particular commit (in the dom0 kernel only) makes the device receive interrupts and work properly again.
>>>
>>> - On AMD (more subtle symptom)
>>> - Running on Xen with pci devices seized on host boot with xen-pciback.hide= parameter
>>> - Running a HVM guest with PCI passthrough of one device (video grabber)
>>> - While the driver loads fine and the device looks like it's working, the videostream isn't stable and it skips or repeats frames.
>>> - Reverting this particular commit (in the dom0 kernel only) makes the device work properly again with a stable videostream.
>>>
>>> --
>>> Sander
>>>
>
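The check described in the report (looking at /proc/interrupts in the guest to see whether a passed-through device receives interrupts) could be scripted roughly as below. This is only a sketch and not part of the original report: the helper names parse_interrupts and irq_count are hypothetical, the sample text assumes a single-CPU guest, and the counts are the ones quoted above. On a real guest you would read /proc/interrupts twice, a few seconds apart, and compare the totals.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: (total_count, description)}."""
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an "IRQ: ..." row
        fields = rest.split()
        counts = []
        for tok in fields[:ncpus]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        # Whatever follows the per-CPU counters is the chip/handler/device text.
        desc = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), desc)
    return table


def irq_count(text, device):
    """Sum the counters of every row whose last field is `device`."""
    table = parse_interrupts(text)
    return sum(total for total, desc in table.values()
               if desc.split()[-1:] == [device])


# Assumed single-CPU layout, with the two rows quoted in the report.
SAMPLE = """\
           CPU0
 36:      14413   xen-pirq-ioapic-level  ath9k
 57:        182   xen-pirq-msi  eth0
"""

if __name__ == "__main__":
    for dev in ("ath9k", "eth0"):
        print(dev, irq_count(SAMPLE, dev))
```

A counter that stays the same between two reads while the device should be busy matches the "doesn't receive any interrupts" symptom.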