Re: [RFC PATCH 8/9] surface_aggregator: Add DebugFS interface

From: Maximilian Luz
Date: Wed Sep 23 2020 - 18:06:59 EST


On 9/23/20 8:29 PM, Greg Kroah-Hartman wrote:
On Wed, Sep 23, 2020 at 08:03:38PM +0200, Maximilian Luz wrote:
On 9/23/20 6:14 PM, Greg Kroah-Hartman wrote:

[...]

So the -EFAULT returned by put_user should have precedence? I was aiming
for "in case it fails, return with the first error".

-EFAULT trumps everything :)

Perfect, thanks!

Listen, I'm all for doing whatever you want in debugfs, but why are you
doing random ioctls here? Why not just read/write a file to do what you
need/want to do here instead?

Two reasons, mostly: First, the IOCTL allows me to execute requests in
parallel with just one open file descriptor and not having to maintain
some sort of back-buffer to wait around until the reader gets to reading
the thing. I've used that for stress-testing the EC communication in the
past, which had some issues (dropping bytes, invalid CRCs, ...) under
heavy(-ish) load. Second, I'm considering adding support for events to
this device in the future by having user-space receive events by reading
from the device. Events would also be enabled or disabled via an IOCTL.
That could be implemented in a second device though. Events were also my
main reason for adding a version to this interface: Discerning between
one that has event support and one that has not.

A misc device can also do this, much simpler, right? Why not use that?

Sorry to ask so many questions, just want to make sure I understand you
correctly:

- So you suggest I go with a misc device instead of putting this into
debugfs?

- And I keep the IOCTL?

- Can I still tell people to not use it and that it's not my fault if a
change in the interface breaks their tools if it's not in debugfs?

- Also load it via a separate module (module_misc_device, I assume)?

One reason why the platform_device approach is practical in this
scenario is that I can leverage the driver core to defer probing and
thus defer creating the device if the controller isn't there yet.
Similarly, the driver is automatically unbound if the controller goes
away and the device should be destroyed. All of this should currently be
handled via the device link created by ssam_client_bind() (unless I
really misunderstood those).

I should be able to handle that by having the device refuse to open the
file if the controller isn't there. Holding the state-lock during the
request execution should ensure that the controller doesn't get shut
down.

Oh, after fixing up the issues that Arnd pointed out of course :)

Sure :)

+static void ssam_dbg_device_release(struct device *dev)
+{
+ // nothing to do

That's a lie, and the old documentation would allow me to make fun of
you for trying to work around the kernel's error messages here.

But I'll be nice and just ask, why do you think it is ok to work around
a message that someone has spent a lot of time and energy to provide to
you, saying that you are doing something wrong, by ignoring that and
providing an empty function? Not kind...

Sorry about that, but may get a pointer to that particular message? This
setup has been pretty much copied from existing kernel drivers (see
/drivers/platform/x86/intel_pmc_core_pltdrv.c for one)

Oh wow, time to go yell at people, thanks for pointing that out...

and I thought
that I can get around having to dynamically allocate a platform device
since it's guaranteed to be only there once.

There was no workaround or unkindness of any sorts intended.

See device_release() in drivers/base/core.c for the error message you
would have gotten that this empty function "works around".

Ah, Documentation/core-api/kobject.rst does explain things a lot better,
thanks! I was only looking at the driver, device, and platform docs so
that's something I've missed completely...

+}
+
+static struct platform_device ssam_dbg_device = {
+ .name = SSAM_DBG_DEVICE_NAME,
+ .id = PLATFORM_DEVID_NONE,
+ .dev.release = ssam_dbg_device_release,
+};

Dynamic structures that are static are, well, wrong :)

I assume the correct way would be to allocate the device dynamically and
this holds for all devices?

Sorry if I'm asking such basic questions, but I have not found anything
regarding this in the documentation, although I have to confess that I
only skimmed over a larger part, so that's very likely my fault.

I appreciate the initiative by creating a fake platform device and
driver to bind to that device. But I don't think any of it is needed at
all, you have made your work a lot harder than you needed to here. This
whole file can be _much_ smaller and simpler and not abuse the kernel
apis so badly :)

So just tack it onto the core driver? My intention was to keep it a bit
more separate from the core, but adding it directly would indeed reduce
the amount of code.

A simple misc device would make it very simple and easy to do instead,
why not do that?

Again, I considered the probe deferring of the platform driver fairly
handy (in addition to having the implicit debugfs warning of "don't rely
on this"), but if you prefer me implementing this as misc device, I'll
do that.

Thanks,
Max