Re: [PATCH] nfc: virtual_ncidev: Add variable to check if ndev is running

From: Krzysztof Kozlowski
Date: Mon Nov 20 2023 - 13:32:18 EST


On 20/11/2023 19:29, Krzysztof Kozlowski wrote:
> On 20/11/2023 19:23, Phi Nguyen wrote:
>> On 11/20/2023 6:45 PM, Krzysztof Kozlowski wrote:
>>> On 20/11/2023 11:39, Nguyen Dinh Phi wrote:
>>>>>>> mutex_lock(&vdev->mtx);
>>>>>>> kfree_skb(vdev->send_buff);
>>>>>>> vdev->send_buff = NULL;
>>>>>>> + vdev->running = false;
>>>>>>> mutex_unlock(&vdev->mtx);
>>>>>>>
>>>>>>> return 0;
>>>>>>> @@ -50,7 +55,7 @@ static int virtual_nci_send(struct nci_dev *ndev, struct sk_buff *skb)
>>>>>>> struct virtual_nci_dev *vdev = nci_get_drvdata(ndev);
>>>>>>>
>>>>>>> mutex_lock(&vdev->mtx);
>>>>>>> - if (vdev->send_buff) {
>>>>>>> + if (vdev->send_buff || !vdev->running) {
>>>>>>
>>>>>> Dear Krzysztof,
>>>>>>
>>>>>> I agree this defensive code.
>>>>>> But i think NFC submodule has to avoid this situation.(calling send function of closed nci_dev)
>>>>>> Could you check this?
>>>>>
>>>>> This code looks not effective. At this point vdev->send_buff is always
>>>>> false, so the additional check would not bring any value.
>>>>>
>>>>> I don't see this fixing anything. Syzbot also does not seem to agree.
>>>>>
>>>>> Nguyen, please test your patches against syzbot *before* sending them.
>>>>> If you claim this fixes the report, please provide me the link to syzbot
>>>>> test results confirming it is fixed.
>>>>>
>>>>> I looked at syzbot dashboard and do not see this issue fixed with this
>>>>> patch.
>>>>>
>>>>> Best regards,
>>>>> Krzysztof
>>>>>
>>>>
>>>> Hi Krzysztof,
>>>>
>>>> I've submitted it to syzbot, it is the test request that created at
>>>> [2023/11/20 09:39] in dashboard link
>>>> https://syzkaller.appspot.com/bug?extid=6eb09d75211863f15e3e
>>>
>>> ...and I see there two errors.
>>>
>> These are because I sent email wrongly and syzbot truncates the patch
>> and can not compile
>>
>>> I don't know, maybe I miss something obvious (our brains like to do it
>>> sometimes), but please explain me how this could fix anything?
>>>
>>> Best regards,
>>> Krzysztof
>>>
>>
>> The issue arises when an skb is added to the send_buff after invoking
>> ndev->ops->close() but before unregistering the device. In such cases,
>> the virtual device will generate a copy of skb, but with no consumer
>> thereafter. Consequently, this object persists indefinitely.
>>
>> This problem seems to stem from the existence of time gaps between
>> ops->close() and the destruction of the workqueue. During this interval,
>> incoming requests continue to trigger the send function.
>
> I asked how this could fix anything. Can you respond to my original comment?
>
> Look:
>
>>>>> This code looks not effective. At this point vdev->send_buff is always
>>>>> false, so the additional check would not bring any value.

Uh, now I see, I missed that's the if here is for send_buff != NULL, so
quite different case than virtual_nci_close().

Best regards,
Krzysztof