Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and loading
From: Iouri Tarassov
Date: Wed Mar 02 2022 - 20:09:28 EST
On 3/1/2022 11:53 PM, Greg KH wrote:
> On Tue, Mar 01, 2022 at 10:23:21PM +0000, Wei Liu wrote:
> > > > +struct dxgglobal *dxgglobal;
> > >
> > > No, make this per-device, NEVER have a single device for your driver.
> > > The Linux driver model makes it harder to do it this way than to do it
> > > correctly. Do it correctly please and have no global structures like
> > > this.
> > >
> >
> > This may not be as big an issue as you thought. The device discovery is
> > still done via the normal VMBus probing routine. For all intents and
> > purposes the dxgglobal structure can be broken down into per device
> > fields and a global structure which contains the protocol versioning
> > information -- my understanding is there will always be a global
> > structure to hold information related to the backend, regardless of how
> > many devices there are.
>
> Then that is wrong and needs to be fixed. Drivers should almost never
> have any global data, that is not how Linux drivers work. What happens
> when you get a second device in your system for this? Major rework
> would have to happen and the code will break. Handle that all now as it
> takes less work to make this per-device than it does to have a global
> variable.
>
> > I definitely think splitting is doable, but I also understand why Iouri
> > does not want to do it _now_ given there is no such a model for multiple
> > devices yet, so anything we put into the per-device structure could be
> > incomplete and it requires further changing when such a model arrives
> > later.
> >
> > Iouri, please correct me if I have the wrong mental model here.
> >
> > All in all, I hope this is not going to be a deal breaker for the
> > acceptance of this driver.
>
> For my reviews, yes it will be.
>
> Again, it should be easier to keep things in a per-device state than
> not as the proper lifetime rules and the like are automatically handled
> for you. If you have global data, you have to manage that all on your
> own and it is _MUCH_ harder to review that you got it correct.
Hi Greg,
I do not really see how the driver can be written without the global data. Let's review the design.
Dxgkrnl acts as the aggregator of all virtual compute devices projected by the host. It needs to perform operations that do not belong to a particular compute device, for example cross-device synchronization and resource sharing.
A PCI device is created for each virtual compute device. Therefore, there should be a global list of these device objects and a mutex to synchronize access to the list.
A VMBus channel is offered by the host for each compute device. The list of the VMBus channels should be global.
A global VMBus channel is offered by the host. The channel does not belong to any particular compute device, so it must be global.
IO space is shared by all compute devices, so its parameters should be global.
Dxgkrnl needs to maintain a list of processes that have opened compute device objects. Dxgkrnl maintains private state for each process, and when a process opens the /dev/dxg device, Dxgkrnl needs to find out whether the process state has already been created by walking the global process list.
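To make the process list requirement concrete, here is a minimal user-space sketch of the lookup-or-create step. It uses pthreads instead of kernel primitives, and every name in it (proc_state, proc_state_get, proc_list_lock) is hypothetical and not taken from the patch. It only illustrates why the list and its lock have to be reachable from every open of /dev/dxg.

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical per-process state; not the real dxgkrnl structure. */
struct proc_state {
    pid_t pid;
    int refcount;               /* number of opens by this process */
    struct proc_state *next;
};

/* Driver-wide list of processes and the lock protecting it. */
static struct proc_state *proc_list;
static pthread_mutex_t proc_list_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Return the state for pid, creating it on first use.
 * Returns NULL only if allocation fails.
 */
struct proc_state *proc_state_get(pid_t pid)
{
    struct proc_state *p;

    pthread_mutex_lock(&proc_list_lock);

    /* Walk the global list looking for existing state. */
    for (p = proc_list; p; p = p->next)
        if (p->pid == pid)
            break;

    /* Not found: allocate and link new state at the head. */
    if (!p) {
        p = calloc(1, sizeof(*p));
        if (p) {
            p->pid = pid;
            p->next = proc_list;
            proc_list = p;
        }
    }

    if (p)
        p->refcount++;

    pthread_mutex_unlock(&proc_list_lock);
    return p;
}
```

Two opens by the same process return the same object with its reference count raised, while an open by a different process gets a new entry.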
Now, where to keep this global state? It could be kept in the /dev/dxg private device structure. But this structure is not available when, for example, dxg_pci_probe_device() or dxg_probe_vmbus() is called.
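For concreteness, the driver-wide state listed above could be grouped into one statically allocated structure, which exists before any probe callback such as dxg_pci_probe_device() runs. The following is a user-space sketch only: the structure layout, field names and drv_add_adapter() are assumptions made for illustration, and pthreads stands in for kernel locking.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one virtual compute device. */
struct adapter {
    int id;
    void *channel;              /* per-device VMBus channel, opaque here */
    struct adapter *next;
};

/* Hypothetical driver-wide state; field names are illustrative only. */
struct drv_global {
    pthread_mutex_t adapter_lock;   /* protects adapters and adapter_count */
    struct adapter *adapters;       /* list of all compute devices */
    int adapter_count;
    void *global_channel;           /* channel not tied to any one device */
    uint64_t io_base;               /* IO space shared by all devices */
    uint64_t io_len;
};

/* Static storage: valid before the first probe callback is invoked. */
static struct drv_global drv = {
    .adapter_lock = PTHREAD_MUTEX_INITIALIZER,
};

/* What a probe callback would do: register a newly found device. */
int drv_add_adapter(int id, void *channel)
{
    struct adapter *a = calloc(1, sizeof(*a));

    if (!a)
        return -1;
    a->id = id;
    a->channel = channel;

    pthread_mutex_lock(&drv.adapter_lock);
    a->next = drv.adapters;
    drv.adapters = a;
    drv.adapter_count++;
    pthread_mutex_unlock(&drv.adapter_lock);
    return 0;
}
```

Each probed device adds itself to the shared list under the lock, so cross-device operations can later iterate over every device.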
Can there be multiple /dev/dxg devices? No, because the /dev/dxg device represents the driver itself, not a particular compute device.
I am not sure what design model you have in mind when you say there should be no global data. Could you please explain, keeping in mind the above requirements?
Thanks
Iouri