Re: remove the nvlink2 pci_vfio subdriver v2
From: Greg Kurz
Date: Tue May 04 2021 - 11:30:05 EST
On Tue, 4 May 2021 15:30:15 +0200
Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, May 04, 2021 at 03:20:34PM +0200, Greg Kurz wrote:
> > On Tue, 4 May 2021 14:59:07 +0200
> > Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > On Tue, May 04, 2021 at 02:22:36PM +0200, Greg Kurz wrote:
> > > > On Fri, 26 Mar 2021 07:13:09 +0100
> > > > Christoph Hellwig <hch@xxxxxx> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > the nvlink2 vfio subdriver is a weird beast. It supports a hardware
> > > > > feature without any open source component - what would normally be
> > > > > the open source userspace that we require for kernel drivers,
> > > > > although in this particular case user space could of course be a
> > > > > kernel driver in a VM. It also happens to be a complete mess that
> > > > > does not properly bind to PCI IDs, is hacked into the vfio_pci driver
> > > > > and also pulls in over 1000 lines of code that are always built into
> > > > > powerpc kernels that have Power NV support enabled. Because of all
> > > > > these issues, and because removing it does not break userspace, I
> > > > > think the best idea is to simply kill it.
> > > > >
> > > > > Changes since v1:
> > > > > - document the removed subtypes as reserved
> > > > > - add the ACK from Greg
> > > > >
> > > > > Diffstat:
> > > > > arch/powerpc/platforms/powernv/npu-dma.c | 705 ---------------------------
> > > > > b/arch/powerpc/include/asm/opal.h | 3
> > > > > b/arch/powerpc/include/asm/pci-bridge.h | 1
> > > > > b/arch/powerpc/include/asm/pci.h | 7
> > > > > b/arch/powerpc/platforms/powernv/Makefile | 2
> > > > > b/arch/powerpc/platforms/powernv/opal-call.c | 2
> > > > > b/arch/powerpc/platforms/powernv/pci-ioda.c | 185 -------
> > > > > b/arch/powerpc/platforms/powernv/pci.c | 11
> > > > > b/arch/powerpc/platforms/powernv/pci.h | 17
> > > > > b/arch/powerpc/platforms/pseries/pci.c | 23
> > > > > b/drivers/vfio/pci/Kconfig | 6
> > > > > b/drivers/vfio/pci/Makefile | 1
> > > > > b/drivers/vfio/pci/vfio_pci.c | 18
> > > > > b/drivers/vfio/pci/vfio_pci_private.h | 14
> > > > > b/include/uapi/linux/vfio.h | 38 -
> > > >
> > > >
> > > > Hi Christoph,
> > > >
> > > > FYI, these uapi changes break build of QEMU.
> > >
> > > What uapi changes?
> > >
> >
> > All macros and structure definitions that are being removed
> > from include/uapi/linux/vfio.h by patch 1.
> >
> > > What exactly breaks?
> > >
> >
> > These macros and types are used by the current QEMU code base.
> > Next time the QEMU source tree updates its copy of the kernel
> > headers, the compilation of affected code will fail.
>
> So does QEMU use this api that is being removed, or does it just have
> some odd build artifacts of the uapi things?
>

These are region subtype definitions and their associated capabilities.
QEMU basically gets information on VFIO regions from the kernel
driver, and for regions with an nvlink2 subtype it tries to extract
some additional nvlink2-related info.

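To illustrate the mechanism: the extra info travels as a chain of
capability headers appended to the region info buffer, each header
giving the offset of the next one. The following is only a minimal
sketch of such a lookup, not QEMU's actual code. struct cap_header is a
local stand-in mirroring the layout of struct vfio_info_cap_header from
<linux/vfio.h>, and find_cap_offset() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Local stand-in (assumption): same layout as struct vfio_info_cap_header
 * in <linux/vfio.h>, redefined so the sketch builds without kernel headers.
 */
struct cap_header {
    uint16_t id;       /* capability identifier */
    uint16_t version;  /* capability version */
    uint32_t next;     /* offset of next header from buffer start, 0 = end */
};

/*
 * Hypothetical helper: walk the chain starting at offset 'first' and
 * return the offset of the first header whose id matches, or 0 if the
 * capability is absent (e.g. the kernel no longer provides it).
 */
static uint32_t find_cap_offset(const void *info, size_t info_size,
                                uint32_t first, uint16_t id)
{
    struct cap_header hdr;
    uint32_t off = first;
    /* A well-formed chain cannot hold more headers than fit in the buffer;
     * the hop limit protects against a cyclic chain. */
    size_t max_hops = info_size / sizeof(hdr);

    assert(info != NULL);

    while (off != 0 && max_hops-- > 0) {
        /* Stop if the header would run past the end of the buffer. */
        if ((size_t)off + sizeof(hdr) > info_size)
            break;
        /* memcpy avoids unaligned access into the raw buffer. */
        memcpy(&hdr, (const uint8_t *)info + off, sizeof(hdr));
        if (hdr.id == id)
            return off;
        off = hdr.next;
    }
    return 0;
}
```

A return value of 0 means "not present", so a caller can skip the
nvlink2-specific setup on kernels that do not expose the capability.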
> What exactly are the error messages here?
>

[55/143] Compiling C object libqemu-ppc64-softmmu.fa.p/hw_vfio_pci-quirks.c.o
FAILED: libqemu-ppc64-softmmu.fa.p/hw_vfio_pci-quirks.c.o
cc -Ilibqemu-ppc64-softmmu.fa.p -I. -I../.. -Itarget/ppc -I../../target/ppc -I../../capstone/include/capstone -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/pixman-1 -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -fdiagnostics-color=auto -pipe -Wall -Winvalid-pch -Werror -std=gnu99 -O2 -g -isystem /home/greg/Work/qemu/qemu-virtiofs/linux-headers -isystem linux-headers -iquote . -iquote /home/greg/Work/qemu/qemu-virtiofs -iquote /home/greg/Work/qemu/qemu-virtiofs/include -iquote /home/greg/Work/qemu/qemu-virtiofs/disas/libvixl -iquote /home/greg/Work/qemu/qemu-virtiofs/tcg/ppc -iquote /home/greg/Work/qemu/qemu-virtiofs/accel/tcg -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIC -isystem../../linux-headers -isystemlinux-headers -DNEED_CPU_H '-DCONFIG_TARGET="ppc64-softmmu-config-target.h"' '-DCONFIG_DEVICES="ppc64-softmmu-config-devices.h"' -MD -MQ libqemu-ppc64-softmmu.fa.p/hw_vfio_pci-quirks.c.o -MF libqemu-ppc64-softmmu.fa.p/hw_vfio_pci-quirks.c.o.d -o libqemu-ppc64-softmmu.fa.p/hw_vfio_pci-quirks.c.o -c ../../hw/vfio/pci-quirks.c
../../hw/vfio/pci-quirks.c: In function ‘vfio_pci_nvidia_v100_ram_init’:
../../hw/vfio/pci-quirks.c:1597:36: error: ‘VFIO_REGION_SUBTYPE_NVIDIA_NVLINK2_RAM’ undeclared (first use in this function); did you mean ‘VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD’?
VFIO_REGION_SUBTYPE_NVIDIA_NVLINK2_RAM,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD
../../hw/vfio/pci-quirks.c:1597:36: note: each undeclared identifier is reported only once for each function it appears in
../../hw/vfio/pci-quirks.c:1603:44: error: ‘VFIO_REGION_INFO_CAP_NVLINK2_SSATGT’ undeclared (first use in this function); did you mean ‘VFIO_REGION_INFO_CAP_SPARSE_MMAP’?
hdr = vfio_get_region_info_cap(nv2reg, VFIO_REGION_INFO_CAP_NVLINK2_SSATGT);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
VFIO_REGION_INFO_CAP_SPARSE_MMAP
../../hw/vfio/pci-quirks.c:1624:49: error: dereferencing pointer to incomplete type ‘struct vfio_region_info_cap_nvlink2_ssatgt’
(void *) (uintptr_t) cap->tgt);
^~
../../hw/vfio/pci-quirks.c: In function ‘vfio_pci_nvlink2_init’:
../../hw/vfio/pci-quirks.c:1646:36: error: ‘VFIO_REGION_SUBTYPE_IBM_NVLINK2_ATSD’ undeclared (first use in this function); did you mean ‘VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD’?
VFIO_REGION_SUBTYPE_IBM_NVLINK2_ATSD,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD
../../hw/vfio/pci-quirks.c:1653:36: error: ‘VFIO_REGION_INFO_CAP_NVLINK2_SSATGT’ undeclared (first use in this function); did you mean ‘VFIO_REGION_INFO_CAP_SPARSE_MMAP’?
VFIO_REGION_INFO_CAP_NVLINK2_SSATGT);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
VFIO_REGION_INFO_CAP_SPARSE_MMAP
../../hw/vfio/pci-quirks.c:1661:36: error: ‘VFIO_REGION_INFO_CAP_NVLINK2_LNKSPD’ undeclared (first use in this function); did you mean ‘VFIO_REGION_INFO_CAP_SPARSE_MMAP’?
VFIO_REGION_INFO_CAP_NVLINK2_LNKSPD);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
VFIO_REGION_INFO_CAP_SPARSE_MMAP
../../hw/vfio/pci-quirks.c:1685:52: error: dereferencing pointer to incomplete type ‘struct vfio_region_info_cap_nvlink2_ssatgt’
(void *) (uintptr_t) captgt->tgt);
^~
../../hw/vfio/pci-quirks.c:1691:54: error: dereferencing pointer to incomplete type ‘struct vfio_region_info_cap_nvlink2_lnkspd’
(void *) (uintptr_t) capspeed->link_speed);
^~

> And if we put the uapi .h file stuff back, is that sufficient for QEMU
> to work? It should be checking at runtime what the kernel has / has
> not anyway, right?
>

Right. This will just be dead code in QEMU for newer kernels.
Anyway, as I said in another mail, it is probably time for QEMU to
start deprecating this code as well.

> thanks,
>
> greg k-h