Re: [PATCH v4 1/3] driver core: add probe_err log helper

From: Andrzej Hajda
Date: Mon Dec 24 2018 - 04:30:06 EST


On 21.12.2018 23:47, Rob Herring wrote:
> On Thu, Dec 20, 2018 at 5:38 AM Andrzej Hajda <a.hajda@xxxxxxxxxxx> wrote:
>> On 20.12.2018 12:14, Greg Kroah-Hartman wrote:
>>> On Thu, Dec 20, 2018 at 11:22:45AM +0100, Andrzej Hajda wrote:
>>>> During probe every time driver gets resource it should usually check for error
>>>> printk some message if it is not -EPROBE_DEFER and return the error. This
>>>> pattern is simple but requires adding few lines after any resource acquisition
>>>> code, as a result it is often omited or implemented only partially.
>>>> probe_err helps to replace such code sequences with simple call, so code:
>>>> if (err != -EPROBE_DEFER)
>>>> dev_err(dev, ...);
>>>> return err;
>>>> becomes:
>>>> return probe_err(dev, err, ...);
>>> Can you show a driver being converted to use this to show if it really
>>> will save a bunch of lines and make things simpler? Usually you are
>>> requesting lots of resources so you need to do more than just return,
>>> you need to clean stuff up first.
>>
>> I have posted sample conversion patch (generated by cocci) in previous
>> version of this patchset [1].
>>
>> I did not re-posted it again as it is quite big patch and it will not be
>> applied without prior splitting it per subsystem.
>>
>> Regarding stuff cleaning: devm_* usually makes it unnecessary, but also
>> even with necessary cleaning you can profit from probe_err, you just
>> calls it without leaving probe - you have still handled correctly probe
>> deferring.
>>
>> Here is sample usage (taken from beginning of the mentioned patch):
>>
>> ---
>> diff --git a/drivers/ata/libahci_platform.c b/drivers/ata/libahci_platform.c
>> index 4b900fc659f7..52e891fe1586 100644
>> --- a/drivers/ata/libahci_platform.c
>> +++ b/drivers/ata/libahci_platform.c
>> @@ -581,11 +581,8 @@ int ahci_platform_init_host(struct platform_device *pdev,
>> int i, irq, n_ports, rc;
>>
>> irq = platform_get_irq(pdev, 0);
>> - if (irq <= 0) {
>> - if (irq != -EPROBE_DEFER)
>> - dev_err(dev, "no irq\n");
>> - return irq;
>> - }
>> + if (irq <= 0)
>> + return probe_err(dev, irq, "no irq\n");
> Shouldn't platform_get_irq (or what it calls) print the error message
> (like we do for kmalloc), rather than every driver? We could get rid
> of lots of error strings that way. I guess there are cases where no
> irq is not an error and we wouldn't want to always print an error. In
> some cases like that, we have 2 versions of the function.


kmalloc prints an error and a stack trace because a failure there shows a
shortage of a common resource used by everyone, which is quite different
from an irq specific to a given device. Usually only the device driver knows
whether an error in acquiring the irq should be reported to the user, and how.

The example is for an irq, but the question of the best way to report
errors applies to all other resource acquisitions: gpios, regulators,
clocks, ...

Alternatives I see for now:

1. Do it in the consumer, as it is done now - in that case probe_err
seems to be a good helper.

2. Do it in the provider's framework; in that case the framework must know
whether the error should be printed:

 a) by calling special versions of all allocators,

 b) by adding an extra argument to all allocators,

 c) by adding an extra flag to struct device (it is passed to most allocators).

3. By creating a generic allocator for multiple resources, something
similar to what I proposed a few years ago in the "resource tracking"
framework [1]. For example:

    ret = devm_resources_get(dev,
            res_irq_desc(&ctx->irq, 0),
            res_clk_desc(&ctx->clk, "bus_clk"),
            res_gpio_desc(&ctx->enable_gpio, "enable", GPIOD_OUT_HIGH),
            ...
    );

   Error reporting would be performed in this universal allocator.


If we want to make bold changes I would opt for 3 - it is very
common to allocate multiple resources in probe, and compacting that into
one helper should significantly simplify the code.
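
To make the idea behind option 3 concrete, here is a rough userspace
sketch of a descriptor-driven allocator that reports errors in one place.
It is not the "resource tracking" code from [1]: res_desc_sketch,
resources_get_sketch and the fake_get_* getters are hypothetical names made
up for illustration, and the cleanup that devm_* would do in the kernel is
left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Kernel-internal errno; userspace headers do not define it. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical descriptor: one entry per resource to acquire. */
struct res_desc_sketch {
	const char *name;      /* used only in the error message */
	int (*get)(int *out);  /* stand-in for a real resource getter */
	int *out;              /* where the acquired handle is stored */
};

/* Fake getters standing in for irq/clk/gpio lookups. */
static int fake_get_ok(int *out)    { *out = 1; return 0; }
static int fake_get_defer(int *out) { (void)out; return -EPROBE_DEFER; }
static int fake_get_fail(int *out)  { (void)out; return -ENODEV; }

/* Counts printed messages so a caller can observe the reporting. */
static int resources_errors_printed;

/*
 * Acquire the resources in order and stop at the first failure. The error
 * is reported here, once, and nothing is printed for probe deferral.
 */
static int resources_get_sketch(const char *dev_name,
				const struct res_desc_sketch *descs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int ret = descs[i].get(descs[i].out);

		if (ret < 0) {
			if (ret != -EPROBE_DEFER) {
				fprintf(stderr, "%s: cannot get %s: %d\n",
					dev_name, descs[i].name, ret);
				resources_errors_printed++;
			}
			return ret;
		}
	}
	return 0;
}
```

A consumer would then fill one table of descriptors and check a single
return value, with no per-resource error strings of its own.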

Option 1 is the simplest one - we do not change existing practices - so
it is the best choice if we take a conservative approach.
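
For reference, the behaviour option 1 relies on can be sketched in plain
userspace C as below. This only mimics the pattern described in the patch
(print unless the error is -EPROBE_DEFER, then return the error);
device_sketch, probe_err_sketch and init_host_sketch are stand-in names, and
fprintf replaces dev_err.

```c
#include <stdarg.h>
#include <stdio.h>

/* Kernel-internal errno; userspace headers do not define it. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Stand-in for struct device; only a name is needed here. */
struct device_sketch {
	const char *name;
};

/* Counts printed messages so a caller can observe the reporting. */
static int probe_err_printed;

/*
 * Print the message unless the error is a probe deferral, then hand the
 * error code back so the caller can write "return probe_err_sketch(...)".
 */
static int probe_err_sketch(const struct device_sketch *dev, int err,
			    const char *fmt, ...)
{
	va_list args;

	if (err != -EPROBE_DEFER) {
		fprintf(stderr, "%s: ", dev->name);
		va_start(args, fmt);
		vfprintf(stderr, fmt, args);
		va_end(args);
		probe_err_printed++;
	}
	return err;
}

/* Usage mirroring the libahci_platform hunk quoted above. */
static int init_host_sketch(const struct device_sketch *dev, int irq)
{
	if (irq <= 0)
		return probe_err_sketch(dev, irq, "no irq\n");
	return 0;
}
```

The caller keeps a single-line error path, and deferral stays silent
without every driver having to remember the extra check.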

I have mixed feelings about 2c. Practically it looks quite tempting - we
get what we want with minimal effort - but I am not sure that polluting
struct device with a 'presentation' layer is a good solution.

I like neither 2a nor 2b - they are a choice between function namespace
pollution and argument list pollution.


[1]: https://lwn.net/Articles/625454/


> Not what you're addressing here exactly, but what I'd like to see is
> the ability to print the exact locations generating errors in the
> first place. That would require wrapping all the error code
> assignments and returns (or at least the common sources). If we're
> going to make tree wide changes, then that might be the better place
> to put the effort. If we had that, then maybe we'd need a lot fewer
> error messages in drivers. I did a prototype implementation and
> coccinelle script a while back that I could dust off if there's
> interest. It was helpful in finding the source of errors, but did have
> some false positives printed.


I guess that in the case of resource acquisition it is usually easy to
locate the place where the error was reported if the error message is
informative enough; an exact line number/function name seems like overkill
to me.


Regards

Andrzej



>
> Rob
>
>