IRQ affinity not working on Xen pci-platform device

From: David Woodhouse
Date: Fri Mar 03 2023 - 10:16:31 EST


I added the 'xen_no_vector_callback' kernel parameter a while back
(commit b36b0fe96af) so that we could keep testing PCI INTx event
delivery with Linux guests.
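
For context, all that parameter does is force the INTx fallback by
clearing xen_have_vector_callback; from memory it's roughly this, in
arch/x86/xen/enlighten_hvm.c:

	static __init int xen_parse_no_vector_callback(char *arg)
	{
		xen_have_vector_callback = false;
		return 0;
	}
	early_param("xen_no_vector_callback", xen_parse_no_vector_callback);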

Most of my testing at the time was done with just two CPUs, and it's
only now that I happened to test it with four. It fails because the
IRQ isn't actually affine to CPU0.

I tried making it work anyway, in line with the comment in
platform-pci.c which says that it shouldn't matter if the handler
*runs* on CPU0 as long as it processes events *for* CPU0 (the first
hack below makes __xen_evtchn_do_upcall() do exactly that). That
didn't seem to work.

If I put the irq_set_affinity() call *before* the request_irq(), that
does actually work. But then it's setting the affinity of an IRQ it
doesn't even own yet.
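
Concretely, the ordering that works is something like this, dropped
in just before the existing request_irq() call in platform_pci_probe()
(sketch only; it isn't part of the hacks below):

	/*
	 * Pin the INTx interrupt to CPU0 before requesting it.
	 * This works in testing, but we're setting the affinity
	 * of an IRQ we don't own yet.
	 */
	ret = irq_set_affinity(pdev->irq, cpumask_of(0));
	if (ret)
		dev_warn(&pdev->dev, "irq_set_affinity failed err=%d\n", ret);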

Test hacks below; this is testable with today's QEMU master (yay!) and:

qemu-system-x86_64 -display none -serial mon:stdio -smp 4 \
-accel kvm,xen-version=0x4000a,kernel-irqchip=split \
-kernel ~/git/linux/arch/x86/boot/bzImage \
-append "console=ttyS0,115200 xen_no_vector_callback"

...

[ 0.577173] ACPI: \_SB_.LNKC: Enabled at IRQ 11
[ 0.578149] The affinity mask was 0-3
[ 0.579081] The affinity mask is 0-3 and the handler is on 2
[ 0.580288] The affinity mask is 0 and the handler is on 2



diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index c7715f8bd452..e3d159f1eb86 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1712,11 +1712,12 @@ void handle_irq_for_port(evtchn_port_t port, struct evtchn_loop_ctrl *ctrl)

static int __xen_evtchn_do_upcall(void)
{
- struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu);
+ struct vcpu_info *vcpu_info = per_cpu(xen_vcpu, 0);
int ret = vcpu_info->evtchn_upcall_pending ? IRQ_HANDLED : IRQ_NONE;
- int cpu = smp_processor_id();
+ int cpu = 0; /* smp_processor_id() */
struct evtchn_loop_ctrl ctrl = { 0 };

+ WARN_ON_ONCE(smp_processor_id() != 0);
read_lock(&evtchn_rwlock);

do {
diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
index fcc819131572..647991211633 100644
--- a/drivers/xen/platform-pci.c
+++ b/drivers/xen/platform-pci.c
@@ -64,6 +64,16 @@ static uint64_t get_callback_via(struct pci_dev *pdev)

static irqreturn_t do_hvm_evtchn_intr(int irq, void *dev_id)
{
+ struct pci_dev *pdev = dev_id;
+
+ if (unlikely(smp_processor_id())) {
+ const struct cpumask *mask = irq_get_affinity_mask(pdev->irq);
+ if (mask)
+ printk("The affinity mask is %*pbl and the handler is on %d\n",
+ cpumask_pr_args(mask), smp_processor_id());
+ return IRQ_NONE;
+ }
+
return xen_hvm_evtchn_do_upcall();
}

@@ -132,6 +142,12 @@ static int platform_pci_probe(struct pci_dev *pdev,
dev_warn(&pdev->dev, "request_irq failed err=%d\n", ret);
goto out;
}
+
+ const struct cpumask *mask = irq_get_affinity_mask(pdev->irq);
+ if (mask)
+ printk("The affinity mask was %*pbl\n",
+ cpumask_pr_args(mask));
+
/*
* It doesn't strictly *have* to run on CPU0 but it sure
* as hell better process the event channel ports delivered
