Re: [PATCH v6 2/3] vfio/ism: Implement vfio_pci driver for ISM devices
From: Niklas Schnelle
Date: Fri Mar 20 2026 - 08:57:48 EST
On Thu, 2026-03-19 at 16:42 +0100, Julian Ruess wrote:
> Add a vfio_pci variant driver for the s390-specific Internal Shared
> Memory (ISM) devices used for inter-VM communication.
>
> This enables the development of vfio-pci-based user space drivers for
> ISM devices.
>
> On s390, kernel primitives such as ioread() and iowrite() are switched
> over from function handle based PCI load/stores instructions to PCI
> memory-I/O (MIO) loads/stores when these are available and not
> explicitly disabled. Since these instructions cannot be used with ISM
> devices, ensure that classic function handle-based PCI instructions are
> used instead.
>
> The driver is still required even when MIO instructions are disabled, as
> the ISM device relies on the PCI store block (PCISTB) instruction to
> perform write operations.
>
> Stores are not fragmented, therefore one ioctl corresponds to exactly
> one PCISTB instruction. User space must ensure to not write more than
> 4096 bytes at once to an ISM BAR which is the maximum payload of the
> PCISTB instruction.
>
> Signed-off-by: Julian Ruess <julianr@xxxxxxxxxxxxx>
> ---
> drivers/vfio/pci/Kconfig | 2 +
> drivers/vfio/pci/Makefile | 2 +
> drivers/vfio/pci/ism/Kconfig | 10 ++
> drivers/vfio/pci/ism/Makefile | 3 +
> drivers/vfio/pci/ism/main.c | 400 ++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 417 insertions(+)
>
As a general note, ISM devices do not support any low power/sleep modes
which is why there is no PM handling done.
> +
> +/*
> + * ism_vfio_pci_do_io_w()
> + *
> + * Ensure that the PCI store block (PCISTB) instruction is used as required by the
> + * ISM device. The ISM device also uses a 256 TiB BAR 0 for write operations,
> + * which requires a 48bit region address space (ISM_VFIO_PCI_OFFSET_SHIFT).
> + */
> +static ssize_t ism_vfio_pci_do_io_w(struct vfio_pci_core_device *vdev,
> + char __user *buf, loff_t off, size_t count,
> + int bar)
> +{
> + struct zpci_dev *zdev = to_zpci(vdev->pdev);
> + struct ism_vfio_pci_core_device *ivpcd;
> + ssize_t ret;
> + void *data;
> + u64 req;
> +
> + if (count > zdev->maxstbl)
> + return -EINVAL;
> + if (((off % PAGE_SIZE) + count) > PAGE_SIZE)
> + return -EINVAL;
> +
> + ivpcd = container_of(vdev, struct ism_vfio_pci_core_device,
> + core_device);
> + data = kmem_cache_alloc(ivpcd->store_block_cache, GFP_KERNEL);
> + if (!data)
> + return -ENOMEM;
> +
> + if (copy_from_user(data, buf, count)) {
> + ret = -EFAULT;
> + goto out_free;
> + }
> +
> + req = ZPCI_CREATE_REQ(READ_ONCE(zdev->fh), bar, count);
> + ret = __zpci_store_block(data, req, off);
Note: There is a Sashiko finding that PCI Store Block needs 8 byte
alignment and we don't ensure that off is 8 byte aligned here. While
generally true the ISM device has relaxed alignment rules and this
requirement does not apply. That said we should set PAGE_SIZE alignment
on the kmem_cache such that data is guaranteed PAGE_SIZE aligned and
the page crossing check works correctly.
> + if (ret)
> + goto out_free;
> +
> + ret = count;
> +
> +out_free:
> + kmem_cache_free(ivpcd->store_block_cache, data);
> + return ret;
> +}
> +
--- snip ---
> +
> +static int ism_vfio_pci_init_dev(struct vfio_device *core_vdev)
> +{
> + struct zpci_dev *zdev = to_zpci(to_pci_dev(core_vdev->dev));
> + struct ism_vfio_pci_core_device *ivpcd;
> + char cache_name[20];
> + int ret;
> +
> + ivpcd = container_of(core_vdev, struct ism_vfio_pci_core_device,
> + core_device.vdev);
> +
> + snprintf(cache_name, sizeof(cache_name), "ism_sb_fid_%08x", zdev->fid);
> + ivpcd->store_block_cache =
> + kmem_cache_create(cache_name, zdev->maxstbl, 0, 0, NULL);
Sashiko notes here that the cache_name is stack allocated and says that
kmem_cache_create() doesn't copy it. Looking at
__kmem_cache_create_args() however this seems to be not true as that
does krstrdup_const() on the passed-in name.
> + if (!ivpcd->store_block_cache)
> + return -ENOMEM;
> +
> + ret = vfio_pci_core_init_dev(core_vdev);
> + if (ret)
> + kmem_cache_destroy(ivpcd->store_block_cache);
> +
> + return ret;
> +}
> +
--- snip ---