Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
From: Wei Liu
Date: Tue Mar 01 2022 - 17:23:29 EST
On Tue, Mar 01, 2022 at 09:45:41PM +0100, Greg KH wrote:
> On Tue, Mar 01, 2022 at 11:45:49AM -0800, Iouri Tarassov wrote:
> > - Create skeleton and add basic functionality for the
> > hyper-v compute device driver (dxgkrnl).
> >
> > - Register for PCI and VM bus driver notifications and
> > handle initialization of VM bus channels.
> >
> > - Connect the dxgkrnl module to the drivers/hv/ Makefile and Kconfig
> >
> > - Create a MAINTAINERS entry
> >
> > A VM bus channel is a communication interface between the hyper-v guest
> > and the host. The are two type of VM bus channels, used in the driver:
> > - the global channel
> > - per virtual compute device channel
> >
> > A PCI device is created for each virtual compute device, projected
> > by the host. The device vendor is PCI_VENDOR_ID_MICROSOFT and device
> > id is PCI_DEVICE_ID_VIRTUAL_RENDER. dxg_pci_probe_device handles
> > arrival of such devices. The PCI config space of the virtual compute
> > device has luid of the corresponding virtual compute device VM
> > bus channel. This is how the compute device adapter objects are
> > linked to VM bus channels.
> >
> > VM bus interface version is exchanged by reading/writing the PCI config
> > space of the virtual compute device.
> >
> > The IO space is used to handle CPU accessible compute device
> > allocations. Hyper-v allocates IO space for the global VM bus channel.
> >
> > Signed-off-by: Iouri Tarassov <iourit@xxxxxxxxxxxxxxxxxxx>
>
> Please work with internal developers to get reviews from them first,
> before requiring the kernel community to point out basic issues. It
> will save you a lot of time and stop us from feeling like we are having
> our time wasted.
>
> Some simple examples below that your coworkers should have caught:
>
> > --- /dev/null
> > +++ b/drivers/hv/dxgkrnl/dxgkrnl.h
> > @@ -0,0 +1,119 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +/*
> > + * Copyright (c) 2019, Microsoft Corporation.
>
> It is now 2022 :)
>
> > +void init_ioctls(void);
>
> That is a horrible global function name you just added to the kernel's
> namespace for a single driver :(
>
> > +long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
> > +
> > +static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
> > +{
> > + *luid = *(struct winluid *)&guid->b[0];
>
> Why is the cast needed? Shouldn't you use real types in your
> structures?
>
> > +/*
> > + * The interface version is used to ensure that the host and the guest use the
> > + * same VM bus protocol. It needs to be incremented every time the VM bus
> > + * interface changes. DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION is
> > + * incremented each time the earlier versions of the interface are no longer
> > + * compatible with the current version.
> > + */
> > +#define DXGK_VMBUS_INTERFACE_VERSION_OLD 27
> > +#define DXGK_VMBUS_INTERFACE_VERSION 40
> > +#define DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION 16
>
> Where do these numbers come from, the hypervisor specification?
>
> > +/*
> > + * Pointer to the global device data. By design
> > + * there is a single vGPU device on the VM bus and a single /dev/dxg device
> > + * is created.
> > + */
> > +struct dxgglobal *dxgglobal;
>
> No, make this per-device, NEVER have a single device for your driver.
> The Linux driver model makes it harder to do it this way than to do it
> correctly. Do it correctly please and have no global structures like
> this.
>
This may not be as big an issue as you thought. The device discovery is
still done via the normal VMBus probing routine. For all intents and
purposes the dxgglobal structure can be broken down into per device
fields and a global structure which contains the protocol versioning
information -- my understanding is there will always be a global
structure to hold information related to the backend, regardless of how
many devices there are.
I definitely think splitting is doable, but I also understand why Iouri
does not want to do it _now_ given there is no such a model for multiple
devices yet, so anything we put into the per-device structure could be
incomplete and it requires further changing when such a model arrives
later.
Iouri, please correct me if I have the wrong mental model here.
All in all, I hope this is not going to be a deal breaker for the
acceptance of this driver.
Thanks,
Wei.