Re: [PATCH RFC 4/4] xen/pvhvm: Make MSI IRQs work after kexec

From: Vitaly Kuznetsov
Date: Wed Jul 16 2014 - 05:02:16 EST


Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> writes:

> On Tue, Jul 15, 2014 at 03:40:40PM +0200, Vitaly Kuznetsov wrote:
>> When kexec was performed MSI IRQs for passthrough-ed devices were already
>> mapped and we see non-zero pirq extracted from MSI msg. xen_irq_from_pirq()
>> fails as we have no IRQ mapping information for that. Requesting for new
>> mapping with __write_msi_msg() does not result in MSI IRQ being remapped so
>> we don't recieve these IRQs.
>
> receive
>

Thanks for your comments!

> How come '__write_msi_msg' does not result in new MSI IRQs?
>

Actually that was the hidden question in my RFC :-)

Let me describe what I see. When normal boot is performed we have the
following in xen_hvm_setup_msi_irqs():

__read_msi_msg()
pirq -> 0

then we allocate new pirq with
pirq = xen_allocate_pirq_msi()
pirq -> 54

and we have the following mapping:
xen: msi --> pirq=54 --> irq=72

in 'xl debug-keys i':
(XEN) IRQ: 29 affinity:04 vec:b9 type=PCI-MSI status=00000030 in-flight=0 domain-list=7: 54(----),

After kexec we see the following:
__read_msi_msg()
pirq -> 54

but as xen_irq_from_pirq() fails we follow the same path and allocate a new pirq:
pirq = xen_allocate_pirq_msi()
pirq -> 55

and we have the following mapping:
xen: msi --> pirq=55 --> irq=75

However (afaict) the mapping in Xen wasn't updated and still points to pirq 54:

in 'xl debug-keys i':
(XEN) IRQ: 29 affinity:02 vec:b9 type=PCI-MSI status=00000030 in-flight=0 domain-list=7: 54(--M-),
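
To make the sequence above easier to follow, here is a toy userspace model
of it (a sketch only, not kernel code). All names (struct sim, sim_init,
sim_kexec, sim_irq_from_pirq, sim_setup_msi) are made up for illustration.
The assumptions are: the pirq counter lives on the Xen side and survives
kexec, the pirq<->irq table lives in the guest and is lost on kexec, and the
device keeps whatever pirq was last written to it. The irq numbers are
arbitrary and not meant to match 72/75 from the logs.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_MAX_PIRQ 256        /* hypothetical table size for this sketch */

struct sim {
	int next_pirq;                  /* "Xen side": survives kexec */
	int next_irq;                   /* guest side irq allocator */
	int pirq_to_irq[SIM_MAX_PIRQ];  /* guest side: lost on kexec */
};

/* Forget every guest-side pirq -> irq mapping, as a kexec'ed kernel does. */
static void sim_kexec(struct sim *s)
{
	for (size_t i = 0; i < SIM_MAX_PIRQ; i++)
		s->pirq_to_irq[i] = -1;
}

static void sim_init(struct sim *s)
{
	s->next_pirq = 54;              /* first pirq seen in the logs above */
	s->next_irq = 72;
	sim_kexec(s);                   /* start with an empty guest table */
}

/* Stand-in for xen_irq_from_pirq(): -1 if the guest knows no mapping. */
static int sim_irq_from_pirq(const struct sim *s, int pirq)
{
	if (pirq <= 0 || pirq >= SIM_MAX_PIRQ)
		return -1;
	return s->pirq_to_irq[pirq];
}

/*
 * Model of the current check: a new pirq is allocated whenever the guest
 * has no mapping for the pirq read back from the device.
 * *device_pirq plays the role of the value stored in the MSI message.
 */
static int sim_setup_msi(struct sim *s, int *device_pirq)
{
	if (sim_irq_from_pirq(s, *device_pirq) < 0) {
		assert(s->next_pirq < SIM_MAX_PIRQ);
		*device_pirq = s->next_pirq++;  /* "write" new pirq to device */
	}
	s->pirq_to_irq[*device_pirq] = s->next_irq++;
	return *device_pirq;
}
```

Calling sim_setup_msi() once after sim_init() with a device pirq of 0 gives
54; calling sim_kexec() and then sim_setup_msi() again on the same device
gives 55, which is the behaviour I describe above.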

> Is it fair to state that your code ends up reading the MSI IRQ (PIRQ)
> from the device and updating the internal PIRQ<->IRQ code to match
> with the reality?
>

Yea, 'always trust the device'.

>>
>> RFC: I wasn't able to understand why commit af42b8d1 which introduced
>> xen_irq_from_pirq() check in xen_hvm_setup_msi_irqs() is checking that instead
>> of checking pirq > 0 as if the mapping was already done (and we have pirq>0 here)
>> we don't need to request for a new pirq. We're losing existing PIRQ and I'm also
>> not sure when __write_msi_msg() with new PIRQ will result in new mapping.
>
> We don't request a new pirq. We end up returning before we call xen_allocate_pirq_msi.
> At least that is how the commit you mentioned worked.
>

I meant to say that when we get pirq > 0 from __read_msi_msg() but
xen_irq_from_pirq(pirq) fails (kexec-only case?) we always call
xen_allocate_pirq_msi(), which gives us a new pirq.

> In regards to why using 'xen_irq_from_pirq' instead of just checking the PIRQ - is
> that we might be called twice by a buggy driver. As such we want to check
> our PIRQ<->IRQ to figure this out.

But if we're called twice we'll see the same pirq, right? Or are there
some cases where we see 'crap' instead of a pirq here?

I think it would be nice to use the same pirq after kexec instead of
allocating a new one, even if we can make remapping work.
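
The difference between the two conditions can be written down as a pair of
predicates. This is only a sketch: needs_new_pirq_current() and
needs_new_pirq_proposed() are hypothetical helpers, data_is_xen_pirq stands
for "msg.data == XEN_PIRQ_MSI_DATA", and irq_from_pirq stands for the return
value of xen_irq_from_pirq(pirq).

```c
#include <assert.h>
#include <stdbool.h>

/* Current check: reallocate when the guest has no irq for this pirq. */
static bool needs_new_pirq_current(bool data_is_xen_pirq, int irq_from_pirq)
{
	return !data_is_xen_pirq || irq_from_pirq < 0;
}

/* Check from the patch: reallocate only when the device holds no pirq. */
static bool needs_new_pirq_proposed(bool data_is_xen_pirq, int pirq)
{
	return !data_is_xen_pirq || pirq <= 0;
}
```

For the kexec case above (pirq 54 read from the device, no guest-side
mapping) the current check asks for a new pirq while the proposed one keeps
54. For a normal boot (pirq 0) both ask for a new pirq.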

Thanks for your comments again!

>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>> ---
>> arch/x86/pci/xen.c | 3 +--
>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
>> index 905956f..685e8f1 100644
>> --- a/arch/x86/pci/xen.c
>> +++ b/arch/x86/pci/xen.c
>> @@ -231,8 +231,7 @@ static int xen_hvm_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>> __read_msi_msg(msidesc, &msg);
>> pirq = MSI_ADDR_EXT_DEST_ID(msg.address_hi) |
>> ((msg.address_lo >> MSI_ADDR_DEST_ID_SHIFT) & 0xff);
>> - if (msg.data != XEN_PIRQ_MSI_DATA ||
>> - xen_irq_from_pirq(pirq) < 0) {
>> + if (msg.data != XEN_PIRQ_MSI_DATA || pirq <= 0) {
>> pirq = xen_allocate_pirq_msi(dev, msidesc);
>> if (pirq < 0) {
>> irq = -ENODEV;
>> --
>> 1.9.3
>>

--
Vitaly