Re: [PATCH 3/3] EDAC: amd64_edac: decide if driver can load successfully early.

From: Tejun Heo
Date: Thu Mar 19 2015 - 13:47:50 EST


Hello, Borislav.

On Thu, Mar 19, 2015 at 06:27:10PM +0100, Borislav Petkov wrote:
> Ok, since when is a driver returning !0 from its init routine and thus
> not registering, wrong? And my "precious" driver, as you put it, is by
> far not the only one.

The synchronicity actually caused problems with certain storage
drivers which may take a very long time probing and userland timing
out. We've never been really clear about the two. Traditionally most
drivers have been synchronous and that worked fine as the hw
configurations have been mostly static. As things get more dynamic,
decoupling module loading and probing becomes necessary. Some
interactions aren't immediately obvious. e.g. because userland is now
more involved in handling hotplug event and thus driver / device
management, they get also more involved with managing modules which in
turn mmakes things like insmod taking five minutes problematic.

> And since when am I the bad guy for wanting to not confuse my users by
> simply not loading the driver if there's no need for it?
>
> [ And yes, I have had bug reports of people saying amd64_edac is loaded
> but why am I not getting any errors reported. ]

The same reason that loading e1000 doesn't give the user the matching
ethernet port unless the hardware is actually available. Please read
on.

> And yes, it is dumb to load drivers and leave them loaded even when
> there's no need for them/no use. Not from some layering/driver
> model/whatever technical perspective but from a simple usability
> standpoint.

That could be a valid point but you can't just go off and implement
that behavior specifically for your driver in a custom way. If you
think the current overall behavior is bad, please justify your
reasoning and work on a generic solution. The specific behavior isn't
the core problem here. The way your driver deviates in an one-off way
is.

> If I do lsmod and see a bunch of drivers but don't know what is used or
> not, then we have failed.

Please see above.

> And why are you wasting so much time with debating this?

Because I've spent so much time and effort hunting down one-off cases
like this all over the place while working on the driver model,
workqueue, freezer and so on. These things do add up and shut off
possibilities of a lot more meaningful possibilities. This is the
wrong trade off to make.

> If the driver init routine would do
>
> if (!ecc_supported)
> return -EINVAL;
>
> would be fine for you but if it did the same thing in a more involved
> manner, then that's a problem?

The way -probe() behaves across different drivers is kind of a
scattershot right now but at least the above would have been easy to
spot and understand unlike the current hack. Again, if you think this
is a problem worth solving and it might as well be, please do so in a
generic manner.

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/