Re: Bluetooth broken for some people with 6.8.2 [Was: [PATCH 6.8 308/715] Bluetooth: hci_core: Cancel request on command timeout]

From: Linux regression tracking (Thorsten Leemhuis)
Date: Tue Apr 02 2024 - 02:42:55 EST


On 30.03.24 17:23, Greg KH wrote:
> On Sat, Mar 30, 2024 at 03:59:22PM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 24.03.24 23:28, Sasha Levin wrote:
>>> From: Luiz Augusto von Dentz <luiz.von.dentz@xxxxxxxxx>
>>>
>>> [ Upstream commit 63298d6e752fc0ec7f5093860af8bc9f047b30c8 ]
>>>
>>> If command has timed out call __hci_cmd_sync_cancel to notify the
>>> hci_req since it will inevitably cause a timeout.
> [...]
>>> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@xxxxxxxxx>
>>> Stable-dep-of: 2615fd9a7c25 ("Bluetooth: hci_sync: Fix overwriting request callback")
>>> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
>>
>> Hey stable team, I wonder if it might be wise to pick up 1c3366abdbe884
>> ("Bluetooth: hci_sync: Fix not checking error on hci_cmd_sync_cancel_sync") from next
>> (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=1c3366abdbe884)
>> for the next releases of all series that a few days ago received
>> 63298d6e752fc0 ("Bluetooth: hci_core: Cancel request on command timeout").
>>
>> The latter patch sadly on quite a few systems causes an Oops due to a
>> NULL pointer dereference and breaks Bluetooth. This was reported for
>> mainline here (yes, coincidentally it was reported by yours truly):
>> https://lore.kernel.org/all/08275279-7462-4f4a-a0ee-8aa015f829bc@xxxxxxxxxxxxx/
>>
>> Now that the patch landed in 6.8.2 it seems to happen there as well
>> (guess in 6.7 and others, too), as can be seen from this bug report
>> where multiple people already joined:
>> https://bugzilla.kernel.org/show_bug.cgi?id=218651
> [...]
> Now queued up, thanks for letting us know.

FWIW, at least one user reported additional BT problems in bugzilla that
might or might not be related to the backports. But I'm writing for a
different reason:

Luiz replied in bugzilla
(https://bugzilla.kernel.org/show_bug.cgi?id=218651#c20) and you might
want to know about it:

"""
Hmm, was the original change [63298d6e752fc0 ("Bluetooth: hci_core:
Cancel request on command timeout")] backported to stable kernels, afaik
I didn't mark it to Cc stable: [...]

I wonder why it got selected to be backported, in any case I don't think
it is a good idea to attempt to do backporting without having at least a
Fixes tag to begin with otherwise we risk having problems like this
widespread to people not really running the latest where this sort of
problem is sort of expected during the early rc phase, so instead of
having these 2 patches backported we could just remove the above from
the stable trees.
"""

Luiz: Sasha and Greg can speak for themselves, but the "Stable-dep-of:
2615fd9a7c25 ("Bluetooth: hci_sync: Fix overwriting request callback")"
tag above strongly indicates that 63298d6e752fc0 was backported as a
dependency of that fix.

Ciao, Thorsten (who now hopes the developers sort this out without him
as an accidental man-in-the-middle)