Re: [PATCH v2 8/8] x86/ioapic: Generate RTE directly from parent irqchip's MSI message

From: David Woodhouse
Date: Sat Oct 24 2020 - 04:26:27 EST


On Fri, 2020-10-23 at 23:28 +0200, Thomas Gleixner wrote:
> On Fri, Oct 23 2020 at 11:10, David Woodhouse wrote:
> > On 22 October 2020 22:43:52 BST, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > It makes the callers slightly more readable, not having to cast to uint32_t* from the struct.
> >
> > I did ponder defining a new struct with bitfields named along the
> > lines of 'msi_addr_bits_19_to_4', but that seemed like overkill.
>
> I did something like this in the meantime, because all of this just
> sucks.
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/apic
>
> Hot of the press and completely untested.

Hm, your struct IO_APIC_route_entry isn't actually a union; you've
defined a 128-bit structure with the IR fields *following* the non-IR
fields. But there *is* a union in io_apic.c, of that 128-bit structure
and two uint32_ts. Suspect that wasn't quite what you intended. I'll
prod at it this morning and turn it into a single union of the three,
and give it some testing.

Also, my "Move MSI support into hpet.c" patch¹ got updated to
s/CONFIG_PCI_MSI/CONFIG_GENERIC_MSI_IRQ/ at about line 53 for the MSI-
related variable declarations, which was going to be in the next
version I posted.

I was also hoping Paolo was going to take the patch which just defines
the KVM_FEATURE_MSI_EXT_DEST_ID bit² ASAP, so that we end up with a
second patch³ that *just* wires it up to x86_init.msi_ext_dest_id() for
KVM.

¹ https://git.infradead.org/users/dwmw2/linux.git/commitdiff/734719c1f4
² https://git.infradead.org/users/dwmw2/linux.git/commitdiff/3f371d6749
³ https://git.infradead.org/users/dwmw2/linux.git/commitdiff/8399e14eb5

> Yes, we can't avoid the bit swizzling at all. But it can be made more
> readable.

Hm, I was about to concede that your version is a bit more readable.

But then I got to your new __ipi_msi_compose_msg() and realised that it
isn't working because it's setting the 0xFEE base address in the _low_
bits, somehow...

msg->arch_addr_lo.base_address = X86_MSI_BASE_ADDRESS_LOW;
printk("1 Compose MSI message %x/%x\n", msg->address_lo, msg->data);
msg->arch_addr_lo.dest_mode_logical = apic->dest_mode_logical;
printk("2 Compose MSI message %x/%x\n", msg->address_lo, msg->data);
msg->arch_addr_lo.destid_0_7 = cfg->dest_apicid & 0xFF;
printk("3 Compose MSI message %x/%x\n", msg->address_lo, msg->data);

[ 1.793874] 1 Compose MSI message fee/0
[ 1.794310] 2 Compose MSI message fee/0
[ 1.794768] 3 Compose MSI message f02/0

And now I wish it was just a simple shift instead of an unholy maze of
overlapping unions of bitfields. But I'll make more coffee and stare at
it harder...

> Yes, that code is horrid, but adding a comment to that effect when
> changing it is not asked too much, right?

Sure. I just actually hadn't noticed that setting the dest/vector bits
right there was entirely redundant in the first place.

> I'm still wrapping my head around getting rid of this thing completely
> because now it's just a subset of your KVM case with the only
> restriction that I/O-APIC cannot be affined to any CPU with a APIC id
> greater than 255.

It was only ever that restriction anyway, wasn't it? Hyper-V PCI has
its own MSI handling, and there's no HPET so it was only ever the
I/OAPIC which was problematic there.

There are Hyper-V VM sizes with 416 vCPUs which depend on this today,
and which don't have the 15-bit MSI extension. Removing hyperv-iommu
would prevent us from using all the vCPUs on those. You *could* make
hyperv-iommu decline to initialise if x86_init.msi_ext_dest_id()
returns true though⁴.

https://git.infradead.org/users/dwmw2/linux.git/commitdiff/633ccf0d42

Attachment: smime.p7s
Description: S/MIME cryptographic signature