Re: [PATCH 1/4] drivers: hv: dxgkrnl: core code

From: Greg KH
Date: Fri Aug 28 2020 - 02:13:08 EST


On Thu, Aug 27, 2020 at 05:05:44PM -0700, Iouri Tarassov wrote:
>
> On 8/14/2020 6:04 AM, Greg KH wrote:
> > On Fri, Aug 14, 2020 at 08:38:53AM -0400, Sasha Levin wrote:
> > > Add support for a Hyper-V based vGPU implementation that exposes the
> > > DirectX API to Linux userspace.
> > > > Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> > > ---
> > > drivers/hv/dxgkrnl/Kconfig | 10 +
> > > drivers/hv/dxgkrnl/Makefile | 12 +
> > > drivers/hv/dxgkrnl/d3dkmthk.h | 1636 ++++++++++
> > > drivers/hv/dxgkrnl/dxgadapter.c | 1406 ++++++++
> > > drivers/hv/dxgkrnl/dxgkrnl.h | 927 ++++++
> > > drivers/hv/dxgkrnl/dxgmodule.c | 656 ++++
> > > drivers/hv/dxgkrnl/dxgprocess.c | 357 ++
> > > drivers/hv/dxgkrnl/dxgvmbus.c | 3084 ++++++++++++++++++
> > > drivers/hv/dxgkrnl/dxgvmbus.h | 873 +++++
> > > drivers/hv/dxgkrnl/hmgr.c | 604 ++++
> > > drivers/hv/dxgkrnl/hmgr.h | 112 +
> > > drivers/hv/dxgkrnl/ioctl.c | 5413 +++++++++++++++++++++++++++++++
> > > drivers/hv/dxgkrnl/misc.c | 279 ++
> > > drivers/hv/dxgkrnl/misc.h | 309 ++
> > > 14 files changed, 15678 insertions(+)
> >
> > It's almost impossible to review 15k lines at once, please break this up
> > into reviewable chunks next time.
>
> Sorry about this, but we had to replace a lot of typedefs, which are not
> allowed by the coding style.

Ok, nice work, but that has nothing to do with how you submit a patch to
us for review.

> We expect one more big patch, which cannot be split in my opinion.

I disagree with that opinion, and so do thousands of other Linux kernel
developers who have done this successfully in the past :)

Remember, it is your job to make this as simple and as easy as possible
for me to review your code, such that it is trivial for me to understand
and accept it. That takes more work on your side to do this, as we have
thousands of developers, but very few reviewers. We know we waste
engineering time on this type of thing, but the end result makes for
better reviews and consequentially, better reviews.

So don't ignore this advice, remember, you are wanting me to do
something for you, for free. Make it easy for me to do so.

> The VM
> vbus message format was changed to include additional header. As the result,
> every function in dxgvmbus.c needs to be changed to handle the new header. I
> do not see how this can be split to multiple patches so each patch produces
> a working driver.

It doesn't have to "work" fully, see many many examples of how to do
this every week submitted to us. It's not an impossible task at all.

> > > +++ b/drivers/hv/dxgkrnl/Kconfig
> > > @@ -0,0 +1,10 @@
> > > +#
> > > +# dxgkrnl configuration
> > > +#
> > > +
> > > +config DXGKRNL
> > > + tristate "Microsoft virtual GPU support"
> > > + depends on HYPERV
> > > + help
> > > + This driver supports Microsoft virtual GPU.
> > > +
> >
> > You need more text here, this isn't a staging driver submission :)
> Is the the proposed description good enough?

What proposed description?

> "This driver handles paravirtualized GPU devices exposed by Microsoft
> Hyper-V when Linux is running inside of a virtual machine hosted
> by Windows."

That's better, but really, when a tiny serial port driver has more text
than this huge thing, you might want to consider expanding on exactly
what you want people to understand...

> > > +struct d3dkmt_closeadapter {
> > > + struct d3dkmthandle adapter_handle;
> > > +};
> >
> > A "handle"? And that has to be one of the most difficult structure
> > names ever :)
> >
> > Why not just use the "handle" for the structure as obviously that's all
> > that is needed here.
> The structure definition matches the Windows D3DKMT interface. Some input
> structures to the interface functions have only one member. But there is
> possibility that new member could be added in the future. We prefer to have
> matching names between Windows and Linux to avoid confusion.

Don't write code because "it might change in the future". Write code
for what you have today. If it does change in the future, wonderful, go
and change the code. You have the full ability to do so then, no need
to hurt all of us today for that potential.

As for "matching names", why does that matter? Who sees both names at
the same time?

> > > +
> > > +struct d3dkmt_openadapterfromluid {
> > > + struct winluid adapter_luid;
> > > + struct d3dkmthandle adapter_handle;
> > > +};
> > > +
> > > +struct d3dddi_allocationlist {
> > > + struct d3dkmthandle allocation;
> > > + union {
> > > + struct {
> > > + uint write_operation :1;
> > > + uint do_not_retire_instance :1;
> > > + uint offer_priority :3;
> > > + uint reserved :27;
> >
> > endian issues?
> >
> > If not, why are these bit fields?
> This matches the definition on the Windows side. Windows only works on
> little endian platforms.

But Linux works on both, so you need to properly document/handle this somehow.

> >
> > > +struct d3dkmt_destroydevice {
> > > + struct d3dkmthandle device;
> > > +};
> >
> > Again, single entity structures?
> >
> > Are you trying to pass around "handles" and cast them backwards?
> >
> > If so, great, but then use the real kernel structures for that like
> > 'struct device' if these are actually devices.
> >
> Again. The structure matches the definition on the Windows side to avoid
> confusion.

Who is confused here? We accept naming conventions that do not match
the normal Linux style when they are referring to external sources of
the data. Examples of this are USB device field names, and other
hardware specifications that are public. You aren't sharing code with a
Windows system, so please follow the Linux coding style rules, as you
want Linux developers to be helping you maintain this code, not
developers who have ever read code from other operating systems.

So please follow the rule of, "unless these fields and structures are
publically defined somewhere, use Linux naming rules", like all of the
rest of us do.

> > > + ret = dxgglobal_getiospace(dxgglobal);
> > > + if (ret) {
> > > + pr_err("getiospace failed: %d\n", ret);
> > > + goto error;
> > > + }
> > > +
> > > + ret = dxgvmb_send_set_iospace_region(dxgglobal->mmiospace_base,
> > > + dxgglobal->mmiospace_size, 0);
> > > + if (ISERROR(ret)) {
> > > + pr_err("send_set_iospace_region failed");
> > > + goto error;
> >
> > You forgot to unwind from the things you initialized above :(
> The caller of dxgglobal_init_global_channel() checks the return value and
> calls dxgglobal_destroy_global_channel() in case of an error, which does the
> cleanup. If preferred the call to destroy the channel could be moved to the
> end of this function.

It is generally a good idea for a function to clean up after itself if
things go wrong as it is almost impossible for a reader of the code, or
automated tools, to determine that these resources are freed up by an
external call later on in the code path.

So yes, please fix this up.

> > > +static void dxgglobal_destroy_global_channel(void)
> > > +{
> > > + dxglockorder_acquire(DXGLOCK_GLOBAL_CHANNEL);
> > > + down_write(&dxgglobal->channel_lock);
> > > +
> > > + TRACE_DEBUG(1, "%s", __func__);
> >
> > ftrace is your friend :)
> I mentioned in other mail that these macros will be removed when we pick to
> final tracing technology for the driver.

Please pick now, no need to wait :)

thanks,

greg k-h