Re: general protection fault in devlink_info_serial_number_put

From: Vincent Mailhol
Date: Tue Feb 04 2025 - 09:59:47 EST


On 04/02/2025 at 23:41, Eric Dumazet wrote:
> On Tue, Feb 4, 2025 at 3:09 PM Vincent Mailhol
> <mailhol.vincent@xxxxxxxxxx> wrote:
>>
>> +To: Jiri Pirko
>> +To: Jakub Kicinski
>> +CC: David S. Miller
>> +CC: Eric Dumazet
>> +CC: Paolo Abeni
>> +CC: Simon Horman
>> +CC: netdev@xxxxxxxxxxxxxxx
>>
>> On 04/02/2025 at 16:44, YAN KANG wrote:
>>> Dear developers and maintainers,
>>>
>>> I found a new kernel NULL-Pointer-Dereference bug titiled "general protection fault in devlink_info_serial_number_put" while using modified syzkaller fuzzing tool. I Itested it on the latest Linux upstream version (6.13.0-rc7)related to ETAS ES58X CAN/USB DRIVER, and it was able to be triggered.
>>>
>>> The bug info is:
>>>
>>> kernel revision: v6.13-rc7
>>> OOPS message: general protection fault in devlink_info_serial_number_put
>>> reproducer:YES
>>>
>>> After preliminary analysis, The root casue may be :
>>> in the function: es58x_devlink_info_get drivers/net/can/usb/etas_es58x/es58x_devlink.c
>>> es58x_dev->udev->serial == NULL ,but no check for it.
>>>
>>> devlink_info_serial_number_put(req, es58x_dev->udev->serial) triggers NPD .
>>>
>>> Fix suggestion: Check es58x_dev->udev->serial before deference pointer.
>>
>> Thanks for the report. I acknowledge the issue: the serial number of a
>> USB device may be NULL and I forget to check this condition.
>>
>> @netdev and devlink maintainers
>>
>> I can of course fix this locally, but this that this kind of issue looks
>> like some nasty pitfall to me. So, I was wondering if it wouldn't be
>> safer to add the NULL check in the framework instead of in the device.
>> The netlink is not part of the hot path, so a NULL check should not have
>> performance impacts.
>>
>> I am thinking of:
>>
>> diff --git a/include/net/netlink.h b/include/net/netlink.h
>> index e015ffbed819..eaee9a1aa91f 100644
>> --- a/include/net/netlink.h
>> +++ b/include/net/netlink.h
>> @@ -1617,6 +1617,8 @@ static inline int nla_put_sint(struct sk_buff
>> *skb, int attrtype, s64 value)
>> static inline int nla_put_string(struct sk_buff *skb, int attrtype,
>> const char *str)
>> {
>> + if (!str)
>> + return 0;
>> return nla_put(skb, attrtype, strlen(str) + 1, str);
>> }
>>
>> Of course, it is also possible to do the check in
>> devlink_info_serial_number_put().
>>
>> What do you think?
>
> Please fix the caller.
>
> nla_put_string() is not supposed to be called with a NULL str.
>
> Next time, we will have someone adding a test about a NULL skb.
>
> Adding such tests could hide real bugs.

Understood. Thanks for the quick feedback, I will fix it locally.


Yours sincerely,
Vincent Mailhol