Re: linux-next: build failure after merge of the driver-core tree

From: Daniel Vetter
Date: Tue Apr 11 2023 - 14:21:34 EST


On Tue, Apr 11, 2023 at 11:18:29AM -0600, Jeffrey Hugo wrote:
> On 4/11/2023 10:31 AM, Daniel Vetter wrote:
> > On Tue, Apr 11, 2023 at 09:29:27AM -0600, Jeffrey Hugo wrote:
> > > On 4/11/2023 9:26 AM, Jeffrey Hugo wrote:
> > > > On 4/11/2023 9:13 AM, Greg KH wrote:
> > > > > On Tue, Apr 11, 2023 at 09:08:39AM -0600, Jeffrey Hugo wrote:
> > > > > > On 4/11/2023 9:01 AM, Daniel Vetter wrote:
> > > > > > > On Tue, Apr 11, 2023 at 12:40:28PM +0200, Greg KH wrote:
> > > > > > > > On Tue, Apr 11, 2023 at 11:55:20AM +0200, Daniel Vetter wrote:
> > > > > > > > > On Tue, Apr 11, 2023 at 02:38:12PM +1000, Stephen Rothwell wrote:
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > After merging the driver-core tree, today's linux-next build (x86_64
> > > > > > > > > > allmodconfig) failed like this:
> > > > > > > > > >
> > > > > > > > > > In file included from include/linux/linkage.h:7,
> > > > > > > > > >                    from include/linux/kernel.h:17,
> > > > > > > > > >                    from drivers/accel/qaic/mhi_qaic_ctrl.c:4:
> > > > > > > > > > drivers/accel/qaic/mhi_qaic_ctrl.c: In function
> > > > > > > > > > 'mhi_qaic_ctrl_init':
> > > > > > > > > > include/linux/export.h:27:22: error: passing
> > > > > > > > > > argument 1 of 'class_create' from incompatible
> > > > > > > > > > pointer type
> > > > > > > > > > [-Werror=incompatible-pointer-types]
> > > > > > > > > >      27 | #define THIS_MODULE (&__this_module)
> > > > > > > > > >         |                     ~^~~~~~~~~~~~~~~
> > > > > > > > > >         |                      |
> > > > > > > > > >         |                      struct module *
> > > > > > > > > > drivers/accel/qaic/mhi_qaic_ctrl.c:544:38: note:
> > > > > > > > > > in expansion of macro 'THIS_MODULE'
> > > > > > > > > >     544 |         mqc_dev_class =
> > > > > > > > > > class_create(THIS_MODULE,
> > > > > > > > > > MHI_QAIC_CTRL_DRIVER_NAME);
> > > > > > > > > >         |                                      ^~~~~~~~~~~
> > > > > > > > > > In file included from include/linux/device.h:31,
> > > > > > > > > >                    from include/linux/mhi.h:9,
> > > > > > > > > >                    from drivers/accel/qaic/mhi_qaic_ctrl.c:5:
> > > > > > > > > > include/linux/device/class.h:229:54: note:
> > > > > > > > > > expected 'const char *' but argument is of type
> > > > > > > > > > 'struct module *'
> > > > > > > > > >     229 | struct class * __must_check
> > > > > > > > > > class_create(const char *name);
> > > > > > > > > >         |                                          ~~~~~~~~~~~~^~~~
> > > > > > > > > > drivers/accel/qaic/mhi_qaic_ctrl.c:544:25:
> > > > > > > > > > error: too many arguments to function
> > > > > > > > > > 'class_create'
> > > > > > > > > >     544 |         mqc_dev_class =
> > > > > > > > > > class_create(THIS_MODULE,
> > > > > > > > > > MHI_QAIC_CTRL_DRIVER_NAME);
> > > > > > > > > >         |                         ^~~~~~~~~~~~
> > > > > > > > > > include/linux/device/class.h:229:29: note: declared here
> > > > > > > > > >     229 | struct class * __must_check
> > > > > > > > > > class_create(const char *name);
> > > > > > > > > >         |                             ^~~~~~~~~~~~
> > > > > > > > > >
> > > > > > > > > > Caused by commit
> > > > > > > > > >
> > > > > > > > > >     1aaba11da9aa ("driver core: class: remove
> > > > > > > > > > module * from class_create()")
> > > > > > > > > >
> > > > > > > > > > interacting with commit
> > > > > > > > > >
> > > > > > > > > >     566fc96198b4 ("accel/qaic: Add mhi_qaic_cntl")
> > > > > > > > > >
> > > > > > > > > > from the drm tree.
> > > > > > > > > >
> > > > > > > > > > I have applied the following merge fix patch for today.
> > > > > > > > > >
> > > > > > > > > > From: Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx>
> > > > > > > > > > Date: Tue, 11 Apr 2023 14:16:57 +1000
> > > > > > > > > > Subject: [PATCH] fixup for "driver core: class:
> > > > > > > > > > remove module * from class_create()"
> > > > > > > > > >
> > > > > > > > > > interacting with "accel/qaic: Add mhi_qaic_cntl"
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx>
> > > > > > > > >
> > > > > > > > > Thanks for the fixup. Since Dave is out I've made a
> > > > > > > > > note about this in my
> > > > > > > > > handover mail so it won't get lost in the drm-next
> > > > > > > > > merge window pull. I
> > > > > > > > > don't think we need any other coordination than
> > > > > > > > > mention it in each pull to
> > > > > > > > > Linus, topic tree seems overkill for this. Plus there's no way I can
> > > > > > > > > untangle the drm tree anyway :-).
> > > > > > > >
> > > > > > > > Want me to submit a patch for the drm tree that moves this to use
> > > > > > > > class_register() instead, which will make the
> > > > > > > > merge/build issue go away
> > > > > > > > for you?  That's my long-term goal here anyway, so converting this new
> > > > > > > > code to this api today would be something I have to do eventually :)
> > > > > > >
> > > > > > > We kinda closed drm-next for feature work mostly already (just pulling
> > > > > > > stuff in from subtrees), so won't really help for this merge window.
> > > > > > >
> > > > > > > For everything else I think this is up to Oded, I had no
> > > > > > > idea qaic needed
> > > > > > > it's entire own dev class and I don't want to dig into this
> > > > > > > for the risk I
> > > > > > > might freak out :-)
> > > > > > >
> > > > > > > Adding Oded.
> > > > > > >
> > > > > > > Cheers, Daniel
> > > > > >
> > > > > > Sorry for the mess.
> > > > > >
> > > > > > I made a note to update to class_register() once my drm-misc access is
> > > > > > sorted out.  Looks like we'll address the conflict in the merge
> > > > > > window, and
> > > > > > catch the update to the new API in the following release.
> > > > >
> > > > > Wait, I think the large question is, "why does this need a separate
> > > > > class"?  Why are you not using the accel char device and class?  That is
> > > > > what everything under accel/ should be using, otherwise why put it in
> > > > > there?
> > > > >
> > > > > And what exactly are you using that class for?  Just device nodes?  If
> > > > > so, how many?
> > > > >
> > > > > thanks,
> > > > >
> > > > > greg k-h
> > > >
> > > >
> > > > Remember MHI_UCI that then evolved into the WWAN subsystem?  I pointed
> > > > out at the time that AIC100/QAIC would need the same functionality.
> > > > You/Jakub told myself/Mani/Loic that a combined implementation is not
> > > > acceptable, and every area needs to implement their own version of
> > > > MHI_UCI.
> > > >
> > > > We took the WWAN subsystem and simplified it to meet our needs.
> > > >
> > > > The functionality is QAIC specific, so wedging it into the Accel node
> > > > seems to be a poor fit as it would subject Habana and iVPU to the same.
> > >
> > > Also, I forgot to mention. QAIC is sharing userspace components with WWAN,
> > > so we really cannot diverge from what WWAN has done and define a new API
> > > through the Accel node.
> >
> > So there is an accel/drm_device in the qaic driver, but there's also this
> > different class thing, which I don't get.
> >
> > And yeah if that's an entirely orthogonal thing then I guess that should
> > be in a different driver/subsystem, all supported with the aux bus to
> > multiplex the underlying device.
> >
> > I haven't found any explanation for what MHI is (or any of the other
> > acrynoms), so I'm entirely lost.
>
> MHI is documented at Documentation/mhi/
> It is also referenced in the QAIC documentation - Documentation/accel/qaic/
>
> It stands for "Modem Host Interface" (arguably a bad name now, but you can
> guess where it came from). It is a Qualcomm hardware block and associated
> software protocol that provides logical channels over a hardware link. Most
> commonly used for PCIe.
>
> Pretty much any modern Qualcomm PCIe device implements it. 4G modems, 5G
> modems, Wifi adapters, AIC100, etc. Instead of talking "PCIe", the host
> talks "MHI" to the devices in most cases.
>
> The core implementation for MHI exists in drivers/bus/mhi
>
> MHI_UCI is the MHI Userspace Character Interface. It looked like most buses
> (eg USB) provide some direct device access to userspace. MHI_UCI was
> formulated along those same lines - provide direct userspace access to a
> whitelist of channels. Qualcomm provides some fairly extensive userspace
> utilities, and various communities have developed open source alternatives
> using this mechanism.
>
> MHI_UCI was proposed to the community as the common driver (misc device) for
> all of the MHI devices. The Net folks came along, saw that it was used for
> 4G/5G modems (Wireless Wide Area Network devices or WWAN) and decided that
> they would not tolerate a common implementation. They NACK'd MHI_UCI and
> required that a WWAN specific subsystem be developed which would only
> service WWAN devices. The Net folks decreed that other subsystems which
> needed the same functionality need to have their own copy of the
> implementation.
>
> QAIC devices expose Sahara (a boot time protocol) which has an existing
> userspace that is also used with Modems, although it looks like WWAN doesn't
> currently support those generations of products today. QAIC devices also
> support DIAG, which is currently supported in WWAN. The intent was to add
> the QAIC support for DIAG at a later time since it is not required for the
> bare minimum viable driver.
>
> So, QAIC devices support the same services, would use the same userspace,
> but can't use a common implementation because Jakub(net) doesn't want to
> share and convinced Greg to go along. I'm not interested in pushing a cross
> tree fight (arguably already did that with MHI_UCI). If neither Greg nor
> Net will accept a common implementation that accelerators can use (QAIC),
> then the only place I can fit this is in the Accel area.
>
> Using aux bus seems to make little difference if QAIC is the only consumer
> of this. I'm willing to refactor the implementation with some feedback and
> guidence, but the uAPI seems set in stone due to the existing userspace and
> WWAN (char devs with open/close/read/write/poll).

Ok, so MHI _is_ the bus. Thanks for the explainer, I should have searched
a bit more in Documentation/

> What would make you less unhappy?

The MHI generic userspace driver interface needs to be in drivers/bus/mhi,
not in a random driver. I think we should revert 566fc96198b4
("accel/qaic: Add mhi_qaic_cntl") and re-land that through Greg's tree (or
wherever mhi patches go to). This of course assuming that the accel
userspace on top of the accel/drm_device does work stand-alone, and it's
just the tooling and other userspace that needs MHI_UCI. If we end with a
non-functional stack due to that, then I guess the entire driver is a bit
up for questions, because at least the accel runtime is supposed to just
run on top of the accel devnode and nothing else. Otherwise container
stuff gets really bad, among a lot of other things.

Cheers, Daniel



>
> > -Daniel
> >
> > >
> > > >
> > > > We need (eventually) 128 device nodes.  We have systems with 32 QAIC
> > > > devices, and each QAIC device uses 4 device nodes (32 * 4 = 128).  WWAN
> > > > subsystem would be similar.  Looks like each 5G modem is 6 nodes per
> > > > device, so if you had 22 5G modems on a system, you'd have 132 device
> > > > nodes.  I'm not aware of any such system, but it could exist.
> > > >
> > > > -Jeff
> > >
> >
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch