Re: [PATCH 4/6] usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

From: Tetsuo Handa
Date: Tue Mar 09 2021 - 19:04:27 EST


On 2021/03/10 8:52, Shuah Khan wrote:
> On 3/9/21 4:40 PM, Tetsuo Handa wrote:
>> On 2021/03/10 4:50, Shuah Khan wrote:
>>> On 3/9/21 4:04 AM, Tetsuo Handa wrote:
>>>> On 2021/03/09 1:27, Shuah Khan wrote:
>>>>> Yes. We might need synchronization between events, threads, and shutdown
>>>>> in usbip_host side and in connection polling and threads in vhci.
>>>>>
>>>>> I am also looking at the shutdown sequences closely as well since the
>>>>> local state is referenced without usbip_device lock in these paths.
>>>>>
>>>>> I am approaching these problems by "peeling the onion", as the expression goes, so
>>>>> we can limit the changes and take a spot fix approach. We have the
>>>>> goal to address these crashes and not introduce regressions.
>>>>
>>>> I think my [PATCH v4 01/12]-[PATCH v4 06/12] simplify your further changes
>>>> without introducing regressions. While ud->lock is held when checking ud->status,
>>>> current attach/detach code is racy about read/update of ud->status. I think we
>>>> can close the race in attach/detach code via a simple usbip_event_mutex serialization.
>>>>
>>>
>>> Do you mean patches 1,2,3,4,5,6?
>>
>> Yes, my 1,2,3,4,5,6.
>>
>> Since you think that usbip_prepare_threads() is not worth introducing, I'm fine with
>> replacing my 7,8,9,10,11,12 with your "[PATCH 0/6] usbip fixes to crashes found by syzbot".
>>
>
> Using event lock isn't the right approach to solve the race. It is a
> large grain lock. I am not looking to replace patches.

It is not a large grain lock. Since event_handler() is already executed exclusively, this lock
does _NOT_ block event_handler() unless attach/detach operations run concurrently.
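
To illustrate the kind of serialization meant here, below is a minimal userspace
sketch using pthreads. It is not the kernel code and not taken from any of the
patches: struct dev, dev_attach(), dev_handle_shutdown_event(), race_attaches()
and reattach_after_shutdown() are hypothetical names, and event_mutex only models
the usbip_event_mutex mentioned above. The point is that the status check and the
status update sit in one critical section, while the event handler takes the same
mutex and therefore waits only when an attach/detach is actually in flight.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for ud->status values; not the kernel's enum. */
enum dev_status { DEV_AVAILABLE, DEV_USED };

struct dev {
	enum dev_status status;
	int attach_wins;	/* number of attach attempts that succeeded */
};

/* One global mutex, modeled on the usbip_event_mutex named in this thread. */
static pthread_mutex_t event_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Attach: the status check and the status update form one critical section. */
static int dev_attach(struct dev *d)
{
	int ret = -1;

	pthread_mutex_lock(&event_mutex);
	if (d->status == DEV_AVAILABLE) {
		d->status = DEV_USED;
		d->attach_wins++;
		ret = 0;
	}
	pthread_mutex_unlock(&event_mutex);
	return ret;
}

/* Event handler: takes the same mutex, so it waits only for an in-flight attach. */
static void dev_handle_shutdown_event(struct dev *d)
{
	pthread_mutex_lock(&event_mutex);
	d->status = DEV_AVAILABLE;
	pthread_mutex_unlock(&event_mutex);
}

static void *attach_worker(void *arg)
{
	dev_attach(arg);
	return NULL;
}

#define MAX_WORKERS 16

/* Race nthreads attach attempts on one device; return how many succeeded, or -1 on error. */
int race_attaches(int nthreads)
{
	struct dev d = { DEV_AVAILABLE, 0 };
	pthread_t tid[MAX_WORKERS];
	int i, started = 0;

	if (nthreads < 1 || nthreads > MAX_WORKERS)
		return -1;
	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&tid[i], NULL, attach_worker, &d) != 0)
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	return started == nthreads ? d.attach_wins : -1;
}

/* Attach, process a shutdown event, then attach again; return the second attach's result. */
int reattach_after_shutdown(void)
{
	struct dev d = { DEV_AVAILABLE, 0 };

	assert(dev_attach(&d) == 0);
	dev_handle_shutdown_event(&d);
	return dev_attach(&d);
}
```

With the mutex held across check and update, racing attach attempts cannot both
observe the device as available, so race_attaches() should report a single winner
regardless of the thread count (build with -pthread).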

>
> I still haven't seen any response from you about whether you were able to
> verify that the fixes I sent in fix the problem you are seeing.

I won't be able to verify your fixes, because it is syzbot that is seeing the problem.
But I can see that your patch description is wrong, because you are ignoring my comments.

Global serialization had better come first. Your patch description depends on global serialization.