Re: BUG FIX: [PATCH RFC v3] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17

From: Mirsad Goran Todorovac
Date: Sat Apr 01 2023 - 06:38:39 EST


On 01. 04. 2023. 12:14, Greg KH wrote:
> On Sat, Apr 01, 2023 at 12:01:43PM +0200, Mirsad Goran Todorovac wrote:
>> On 01. 04. 2023. 11:52, Mirsad Goran Todorovac wrote:
>>> On 01. 04. 2023. 11:23, Greg KH wrote:
>>>> On Sat, Apr 01, 2023 at 11:18:19AM +0200, Greg KH wrote:
>>>>> On Sat, Apr 01, 2023 at 08:33:36AM +0200, Greg KH wrote:
>>>>>> On Sat, Apr 01, 2023 at 08:28:07AM +0200, Greg KH wrote:
>>>>>>> On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
>>>>>>>>> This patch is implying that anyone who calls "dev_set_name()" also has
>>>>>>>>> to do this hack, which shouldn't be the case at all.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> greg k-h
>>>>>>>>
>>>>>>>> This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
>>>>>>>> see a more sensible way to patch this up.
>>>>>>>
>>>>>>> In sleeping on this, I think this has to move to the driver core. I
>>>>>>> don't understand why we haven't seen this before, except maybe no one
>>>>>>> has really noticed before (i.e. we haven't had good leak detection tools
>>>>>>> that run with removable devices?)
>>>>>>>
>>>>>>> Anyway, let me see if I can come up with something this weekend, give me
>>>>>>> a chance...
>>>>>>
>>>>>> Wait, no, this already should be handled by the kobject core, look at
>>>>>> kobject_cleanup(), at the bottom. So your change should be merely
>>>>>> duplicating the logic there that already runs when the struct device is
>>>>>> freed, right?
>>>>>>
>>>>>> So I don't understand why your change works, odd. I need more coffee...
>>>>>
>>>>> I think you got half of the change correctly. This init code is a maze
>>>>> of twisty passages, let me take your patch and tweak it a bit into
>>>>> something that I think should work. This looks to be only a memstick
>>>>> issue, not a driver core issue (which makes me feel better.)
>>>>
>>>> Oops, forgot the patch. Can you try this change here and let me know if
>>>> that solves the problem or not? I have compile-tested it only, so I
>>>> have no idea if it works.
>>>>
>>>> If this does work, I'll make up a "real" function to replace the
>>>> horrible dev.kobj.name mess that a driver would have to do here as it
>>>> shouldn't be required that a driver author knows the internals of the
>>>> driver core that well...
>>>>
>>>> thanks,
>>>>
>>>> greg k-h
>>>>
>>>> --------------------
>>>>
>>>>
>>>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>>>> index bf7667845459..bbfaf6536903 100644
>>>> --- a/drivers/memstick/core/memstick.c
>>>> +++ b/drivers/memstick/core/memstick.c
>>>> @@ -410,6 +410,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>>> return card;
>>>> err_out:
>>>> host->card = old_card;
>>>> + kfree_const(card->dev.kobj.name);
>>>> kfree(card);
>>>> return NULL;
>>>> }
>>>> @@ -468,8 +469,10 @@ static void memstick_check(struct work_struct *work)
>>>> put_device(&card->dev);
>>>> host->card = NULL;
>>>> }
>>>> - } else
>>>> + } else {
>>>> + kfree_const(card->dev.kobj.name);
>>>> kfree(card);
>>>> + }
>>>> }
>>>>
>>>> out_power_off:
>>>
>>> I thought of this version, but I am not sure about tracking the device_register() and
>>> device_unregister() calls?
>>>
>>> put_device() calls put_kobject() which frees the const char *kobj.name ...
>>>
>>> I thought how host cannot just be kfree()d when host->card is still allocated.
>>> And it is a pointer. That also seems to me like a bug :-/
>>>
>>> Kind regards,
>>> Mirsad
>>>
>>> ---
>>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>>> index bf7667845459..46c7bda9715d 100644
>>> --- a/drivers/memstick/core/memstick.c
>>> +++ b/drivers/memstick/core/memstick.c
>>> @@ -179,6 +179,8 @@ static void memstick_free(struct device *dev)
>>> {
>>> struct memstick_host *host = container_of(dev, struct memstick_host,
>>> dev);
>>> + if (host->card && host->card->dev)
>>> + put_device(&host->card->dev);
>>> kfree(host);
>>> }
>>>
>>> @@ -410,7 +412,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>> return card;
>>> err_out:
>>> host->card = old_card;
>>> - kfree(card);
>>> + put_device(&card->dev);
>>> return NULL;
>>> }
>>>
>>> @@ -468,8 +470,9 @@ static void memstick_check(struct work_struct *work)
>>> put_device(&card->dev);
>>> host->card = NULL;
>>> }
>>> - } else
>>> - kfree(card);
>>> + } else {
>>> + put_device(&card->dev);
>>> + }
>>> }
>>>
>>> out_power_off:
>>
>> Thousand apologies, the previous version had a compilation error. I've sent the untested
>> version.
>>
>> I must have become over-confident. But they say that a mistake that makes you humbled
>> is better than success that makes you arrogant :-|
>>
>> I would like your opinion on the patch before I actually start the kernel, for I won't
>> be able to reboot clean that machine if it hangs in kernel until Tuesday :-(
>>
>> It seems that put_device() would call the release method of the device and kfree() in
>> it, but I cannot say anything about the side effects, for I do not know the source so
>> well ...
>>
>> Kind regards,
>> Mirsad
>>
>> ---
>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>> index bf7667845459..c63250322e26 100644
>> --- a/drivers/memstick/core/memstick.c
>> +++ b/drivers/memstick/core/memstick.c
>> @@ -179,6 +179,8 @@ static void memstick_free(struct device *dev)
>> {
>> struct memstick_host *host = container_of(dev, struct memstick_host,
>> dev);
>> + if (host->card)
>> + put_device(&host->card->dev);
>
> This isn't going to work as at this moment in time, the last reference
> count has already happened, causing this release callback to be called,
> so that the bus driver can free the memory for the device.
>
> So you would be calling put_device() on a device already has 0 for a
> reference count :)
>
>> kfree(host);
>> }
>>
>> @@ -410,7 +412,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>> return card;
>> err_out:
>> host->card = old_card;
>> - kfree(card);
>> + put_device(&card->dev);
>
> No, the device was not registered here yet, right? That would be
> required _IFF_ there was a call to device_register().
>
>> return NULL;
>> }
>>
>> @@ -468,8 +470,9 @@ static void memstick_check(struct work_struct *work)
>> put_device(&card->dev);
>> host->card = NULL;
>> }
>> - } else
>> - kfree(card);
>> + } else {
>> + put_device(&card->dev);
>
> Same here, unless I'm reading this wrong, device_register() had not been
> called yet, which is why the kfree was required (same for the above
> call).
>
> But hey, this driver really is a maze of twisty callbacks and workqueues
> and complexity, for no obvious reason to me (maybe because of some async
> requirement for memstick devices? Thankfully I no longer have this
> hardware...) So I might be totally wrong...
>
> I would recommend trying my version first, it "shouldn't" cause anything
> worse to happen from what you have today, but hey, that's just my guess.
>
> thanks,
>
> greg k-h

Hi Mr. Greg,

Thank you for the additional insight.

I will build your patch ASAP and give feedback.

Kind regards,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

"I see something approaching fast ... Will it be friends with me?"