Re: [PATCH v2 2/2] s390/pci: Migrate s390 IRQ logic to IRQ domain API

From: Niklas Schnelle
Date: Mon Nov 17 2025 - 12:29:11 EST


On Mon, 2025-11-17 at 09:59 +0100, Tobias Schumacher wrote:
> s390 is one of the last architectures using the legacy API for setup and
> teardown of PCI MSI IRQs. Migrate the s390 IRQ allocation and teardown
> to the MSI parent domain API. For details, see:
>
> https://lore.kernel.org/lkml/20221111120501.026511281@xxxxxxxxxxxxx
>
> In detail, create an MSI parent domain for zpci which is used by
> all PCI devices. When a PCI device sets up MSI or MSI-X IRQs, the
> library creates a per-device IRQ domain for this device, which is
> used by the device for allocating and freeing IRQs.
>
> The per-device domain delegates this allocation and freeing to the
> parent domain. In the end, the corresponding callbacks of the parent
> domain are responsible for allocating and freeing the IRQs.
>
> The allocation is split into two parts:
> - zpci_msi_prepare() is called once for each device and allocates the
> required resources. On s390, each PCI function has its own airq
> vector and a summary bit, which must be configured once per function.
> This is done in prepare().
> - zpci_msi_alloc() can be called multiple times for allocating one or
> more MSI/MSI-X IRQs. This creates a mapping between the virtual IRQ
> number in the kernel and the hardware IRQ number.
>
> Freeing is split into two counterparts:
> - zpci_msi_free() reverts the effects of zpci_msi_alloc() and
> - zpci_msi_teardown() reverts the effects of zpci_msi_prepare(). This is
> called once when all IRQs are freed before a device is removed.
>
> Since the parent domain in the end allocates the IRQs, the hwirq
> encoding must be unambiguous for all IRQs of all devices. This is
> achieved by encoding the hwirq using the PCI function id and the MSI
> index.
>
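
To illustrate the encoding idea described above, here is a minimal
userspace sketch. It assumes the function ID sits above a 16-bit MSI
index field. The shift width and the helper names (encode_hwirq() etc.)
are assumptions made for illustration only and are not taken from
zpci_encode_hwirq() in the patch.

```c
#include <stdint.h>

/* Assumed layout: function ID in the upper bits, MSI index in the low 16 bits. */
#define HWIRQ_INDEX_BITS 16

/* Hypothetical helper: combine function ID and MSI index into one number. */
static inline uint64_t encode_hwirq(uint32_t fid, uint16_t msi_index)
{
	return ((uint64_t)fid << HWIRQ_INDEX_BITS) | msi_index;
}

/* Hypothetical helper: recover the function ID from an encoded hwirq. */
static inline uint32_t hwirq_to_fid(uint64_t hwirq)
{
	return (uint32_t)(hwirq >> HWIRQ_INDEX_BITS);
}

/* Hypothetical helper: recover the MSI index from an encoded hwirq. */
static inline uint16_t hwirq_to_index(uint64_t hwirq)
{
	return (uint16_t)(hwirq & ((1u << HWIRQ_INDEX_BITS) - 1));
}
```

With a layout like this, two functions can never produce the same hwirq,
and consecutive MSI indexes of one function yield consecutive hwirq
values as long as the index stays within the assumed field width.
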
> Signed-off-by: Tobias Schumacher <ts@xxxxxxxxxxxxx>
> ---
> arch/s390/Kconfig | 1 +
> arch/s390/include/asm/pci.h | 1 +
> arch/s390/pci/pci_bus.c | 1 +
> arch/s390/pci/pci_irq.c | 335 +++++++++++++++++++++++++++-----------------
> 4 files changed, 208 insertions(+), 130 deletions(-)
>
--- snip ---
> +
> +static int zpci_msi_prepare(struct irq_domain *domain,
> + struct device *dev, int nvec,
> + msi_alloc_info_t *info)
> +{
> + struct zpci_dev *zdev = to_zpci_dev(dev);
> + struct pci_dev *pdev = to_pci_dev(dev);
> + unsigned long bit;
> + int msi_vecs, rc;
>
> msi_vecs = min_t(unsigned int, nvec, zdev->max_msi);
> - if (msi_vecs < nvec) {
> - pr_info("%s requested %d irqs, allocate system limit of %d",
> + if (msi_vecs < nvec)
> + pr_info("%s requested %d IRQs, allocate system limit of %d",
> pci_name(pdev), nvec, zdev->max_msi);

This is already wrong in the existing code, but the above pr_info()
is missing a "\n" at the end.
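
For illustration, a userspace stand-in for the clamp plus the message
with the trailing newline added. clamp_msi_vecs() is a hypothetical
helper that uses snprintf() in place of pr_info(), and the parameter
types are assumptions; only the format string mirrors the hunk above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in: clamp the requested vector count to the limit
 * and, if clamping happened, format the message with a trailing "\n".
 * buf must have room for at least one byte.
 */
static int clamp_msi_vecs(char *buf, size_t len, const char *name,
			  int nvec, unsigned int max_msi)
{
	unsigned int req = (unsigned int)nvec;
	unsigned int msi_vecs = req < max_msi ? req : max_msi;

	buf[0] = '\0';
	if (msi_vecs < req)
		snprintf(buf, len,
			 "%s requested %d IRQs, allocate system limit of %u\n",
			 name, nvec, max_msi);

	return (int)msi_vecs;
}
```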

> - }
>
--- snip ---
> +static int zpci_msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
> + unsigned int nr_irqs, void *args)
> +{
> + struct msi_desc *desc = ((msi_alloc_info_t *)args)->desc;
> + struct zpci_dev *zdev = to_zpci_dev(desc->dev);
> + irq_hw_number_t hwirq;
> + unsigned long bit;
> + unsigned int cpu;
> + int i;
> +
> + bit = zdev->msi_first_bit + desc->msi_index;
> + hwirq = zpci_encode_hwirq(zdev->fid, desc->msi_index);
> +
> + if (desc->msi_index + nr_irqs > zdev->max_msi)
> + return -EINVAL;
> +
> + for (i = 0; i < nr_irqs; i++) {
> + irq_domain_set_info(domain, virq + i, hwirq + i,
> + &zpci_irq_chip, zdev,
> + handle_percpu_irq, NULL, NULL);
> +
> + if (irq_delivery == DIRECTED) {
> + for_each_possible_cpu(cpu) {
> + airq_iv_set_ptr(zpci_ibv[cpu],
> + bit + i, hwirq + i);
> + }
> +

The closing brace above seems to be indented incorrectly. I have no idea
why checkpatch.pl --strict doesn't catch this (I tried). It also doesn't
complain when I remove one tab, so let's do that. While at it, also drop
the empty line here.
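
Roughly the shape I have in mind, as a self-contained userspace model
rather than the real code. struct iv_model, iv_set_ptr(), map_irqs() and
the *_MODEL constants are hypothetical stand-ins for the airq vectors,
airq_iv_set_ptr() and for_each_possible_cpu(); only the brace placement
and the missing empty line are the point here.

```c
#include <stddef.h>

#define NR_CPUS_MODEL 2	/* assumed CPU count for the model */
#define NR_BITS_MODEL 8	/* assumed vector size for the model */

enum irq_delivery_model { FLOATING, DIRECTED };

/* Stand-in for an interrupt vector that stores one value per bit. */
struct iv_model {
	unsigned long ptr[NR_BITS_MODEL];
};

static void iv_set_ptr(struct iv_model *iv, unsigned long bit,
		       unsigned long val)
{
	iv->ptr[bit] = val;
}

/* Record hwirq + i at bit + i, per CPU when directed, else per function. */
static void map_irqs(enum irq_delivery_model mode, struct iv_model *per_cpu,
		     struct iv_model *per_func, unsigned long bit,
		     unsigned long hwirq, unsigned int nr_irqs)
{
	unsigned int cpu, i;

	for (i = 0; i < nr_irqs; i++) {
		if (mode == DIRECTED) {
			for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
				iv_set_ptr(&per_cpu[cpu],
					   bit + i, hwirq + i);
			}
		} else {
			iv_set_ptr(per_func, bit + i, hwirq + i);
		}
	}
}
```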

> + } else {
> + airq_iv_set_ptr(zdev->aibv, bit + i, hwirq + i);
> }
> - msi->msg.address_lo = 0;
> - msi->msg.address_hi = 0;
> - msi->msg.data = 0;
> - msi->irq = 0;
> }
>
> - if (zdev->aisb != -1UL) {
> - zpci_ibv[zdev->aisb] = NULL;
> - airq_iv_free_bit(zpci_sbv, zdev->aisb);
> - zdev->aisb = -1UL;
> - }
> - if (zdev->aibv) {
> - airq_iv_release(zdev->aibv);
> - zdev->aibv = NULL;
> - }
> + return 0;
> +}
--- snip ---

Apart from the two style issues, this now works well with directed IRQs
and overall is a nice cleanup. Thanks a lot!

Reviewed-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>