Re: [PATCH v2 1/1] mux: mux-core: Add NULL check for dev->of_node

From: Peter Rosin
Date: Wed Jul 12 2017 - 05:15:05 EST


On 2017-07-10 23:36, sathyanarayanan kuppuswamy wrote:
> Hi,
>
>
> On 07/10/2017 12:40 AM, Peter Rosin wrote:
>> On 2017-07-09 09:35, Kuppuswamy, Sathyanarayanan wrote:
>>> Hi,
>>>
>>>
>>> On 7/9/2017 12:07 AM, Peter Rosin wrote:
>>>> On 2017-07-09 01:12, Kuppuswamy, Sathyanarayanan wrote:
>>>>> Hi Peter,
>>>>>
>>>>>
>>>>> On 7/8/2017 2:00 PM, Peter Rosin wrote:
>>>>>> On 2017-07-07 23:46, sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx wrote:
>>>>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
>>>>>>>
>>>>>>> If dev->of_node is NULL, then calling mux_control_get()
>>>>>>> function can lead to NULL pointer exception. So adding
>>>>>>> a NULL check for dev->of_node.
>>>>>>>
>>>>>>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
>>>>>> Do you have a driver that might call mux_control_get and not have any
>>>>>> of_node?
>>>>> For non-device tree drivers, this case is valid. I hit this issue when I
>>>>> was working on Intel USB MUX driver.
>>>>>> If not, I don't see the point of this check.
>>>>> Since this is an API for other consumers, I think its better to have
>>>>> some sanity checks.
>>>>>
>>>>> If a non device tree driver call this API , I think its better to fail
>>>>> with some error no instead of creating null pointer exception.
>>>> Is it? When authoring a new driver, and you make some error like this, why
>>>> is a "nice" error better than a big fat fail? If you get a null deref,
>>>> you will presumably also get a call stack etc, which will help you find
>>>> where you made the error, w/o adding a bunch of traces to find out exactly
>>>> what you did wrong.
>>> In this case, I think the error can happen even if there is any
>>> configuration mismatch in device tree blob. So its not only
>> No, it cannot happen for any of the existing consumers. And seeing crashes
>> if the DT blob is faulty isn't unexpected. I'm sure the ways to provoke a
>> crash in that situation are abundant.
> If we can prevent the crash and fail with some meaning full message, I think
> we should adapt to it. if you want to leave a strong message, my
> suggestion is to
> use BUG or WARN_ON.
>>
>>> about API usage in driver. If you think that you need to fail the system
>>> if the API is used without proper dt node configuration,
>>> then we should use something like BUG or WARN_ON to explicitly mention
>>> this dependency. But I think its better to
>> But this is something that *cannot* happen unless you add some new code
>> somewhere else. Why check at all?
> Its not just kernel code here, DT tables are also involved.

No, DT is not involved. See, *all* callers (or both, as it stands) already
check this before they make the call.

So no, this *cannot* happen unless you add some new code somewhere else.

>>
>>> leave this decision to the MUX consumers because there use cases where
>>> MUX control can be optional
>> This is only future-proofing for consumers that does not exist, and does
>> nothing for the existing code. And the future where this is needed might
>> never happen.
>>
>> Still skeptic...
> I agree with philosophy of not bloating the kernel by merging un-used
> code. But I think this concept should be limited to adding new APIs or
> drivers. But for bugs in existing code I am not sure whether we should
> follow the same principle.

There's no bug, and this patch is not a bug-fix. It's a belt-and-suspenders
patch.

In the future, it might turn into a bug-fix if someone has added a call
from a driver that does not require DT for something else. But then the
onus is on the author of that driver to test that it actually works without
DT. So, the question is how that author would like this to not work when
these tests are made. If the kernel crashes, it's bleeding obvious. If
you add a WARN, the results are pretty much the same. Sure, the kernel
continues, but the author simply cannot ship things in that state anyway,
and has to come up with some sort of fix. And that fix is some mix of:
- something like your patch, and falling back to something else on failure
- making the mux_control_get call optional
- adding some sort of implementation that provides some other way to look
up the intended mux-control.

(BTW, adding a BUG for this is not appropriate. And Linus hates those...)

> I have checked some use cases of dev->of_node in kernel, and most of
> them who uses it in
> kernel APIs have some form of protection check for it. For example, take
> a look at the
> following functions.
>
> int platform_get_irq(struct platform_device *dev, unsigned int num)
> bool device_dma_supported(struct device *dev)
> struct pwm_device *pwm_get(struct device *dev, const char *con_id)
>
> These APIs have checks like,
>
> if (IS_ENABLED(CONFIG_OF) && dev->of_node)

Those APIs are used a lot more, and have probably evolved from a time when
DT was not the norm for non-discoverable systems.

> If you are still skeptic, I am out of arguments -:)

I just get the feeling that this is something that is never going to
be needed. When a non-DT driver needs a mux, this will come naturally.
Until then, who cares?

Cheers,
Peter

>> Cheers,
>> peda
>>
>>>> So, I'm skeptic...
>>>>
>>>> Cheers,
>>>> peda
>>>>
>>>>>> Cheers,
>>>>>> peda
>>>>>>
>>>>>>> ---
>>>>>>> drivers/mux/mux-core.c | 3 +++
>>>>>>> 1 file changed, 3 insertions(+)
>>>>>>>
>>>>>>> Changes since v1:
>>>>>>> * Removed dummy new line.
>>>>>>>
>>>>>>> diff --git a/drivers/mux/mux-core.c b/drivers/mux/mux-core.c
>>>>>>> index 90b8995..924c983 100644
>>>>>>> --- a/drivers/mux/mux-core.c
>>>>>>> +++ b/drivers/mux/mux-core.c
>>>>>>> @@ -438,6 +438,9 @@ struct mux_control *mux_control_get(struct device *dev, const char *mux_name)
>>>>>>> int index = 0;
>>>>>>> int ret;
>>>>>>>
>>>>>>> + if (!np)
>>>>>>> + return ERR_PTR(-ENODEV);
>>>>>>> +
>>>>>>> if (mux_name) {
>>>>>>> index = of_property_match_string(np, "mux-control-names",
>>>>>>> mux_name);
>>>>>>>
>>
>