Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

From: Bastien Nocera
Date: Wed May 31 2023 - 04:47:00 EST


On Mon, 2023-05-22 at 11:23 -0700, Linus Torvalds wrote:
> On Mon, May 22, 2023 at 5:38 AM Linux regression tracking (Thorsten
> Leemhuis) <regressions@xxxxxxxxxxxxx> wrote:
> >
> > FWIW, in case anybody is interested in a status update: one
> > reporter
> > bisected the problem down to 586e8fede79 ("HID: logitech-hidpp:
> > Retry
> > commands when device is busy"); reverting that commit on-top of 6.3
> > fixes the problem for that reporter. For that reporter things also
> > work
> > on 6.4-rc; but for someone else that is affected that's not the
> > case.
>
> Hmm. It's likely timing-dependent.
>
> But that code is clearly buggy.
>
> If the wait_event_timeout() returns early, the device hasn't replied,
> but the code does
>
>                 if (!wait_event_timeout(hidpp->wait, hidpp-
> >answer_available,
>                                         5*HZ)) {
>                         dbg_hid("%s:timeout waiting for response\n",
> __func__);
>                         memset(response, 0, sizeof(struct
> hidpp_report));
>                         ret = -ETIMEDOUT;
>                 }
>
> and then continues to look at the response _anyway_.
>
> Now, depending on out hardening options, that response may have been
> initialized by the compiler, or may just be random stack contents.

It's kzalloc()'ed in the 2 places it's used, hidpp_send_message_sync().

> That bug is pre-existing (ie the problem was not introduced by that
> commit), but who knows if the retry makes things worse (ie if it then
> triggers on a retry, the response data will be the *previous*
> response).
>
> The whole "goto exit" games should be removed too, because we're in a
> for-loop, and instead of "goto exit" it should just do "break".
>
> IOW, something like this might be worth testing.
>
> That said, while I think the code is buggy, I doubt this is the
> actual
> cause of the problem people are reporting. But it would be lovely to
> hear if the attached patch makes any difference, and I think this is
> fixing a real - but unlikely - problem anyway.
>
> And obviously it might be helpful to actually enable those dbg_hid()
> messages, but I didn't look at what the magic config option to do so
> was.

Thomas Weißschuh's patch ("HID: use standard debug APIs") linked all
those debug calls to the dynamic debugging system, so something like
this will work after boot:
echo 'file hid-logitech-hidpp.c +p' > /sys/kernel/debug/dynamic_debug/control

Adding this to the kernel command-line to get some debug during boot
should work:
dyndbg="file hid-logitech-hidpp.c +p"

In both cases, check it's enabled and that the messages can be printed
with:
grep -i hidpp /sys/kernel/debug/dynamic_debug/control

> NOTE! Patch below *ENTIRELY* untested. I just looked at the code when
> that commit was mentioned, and went "that's not right"...

I sent a similar patch before seeing your version, in answer to a
separate report I was sent. It doesn't change the style of the code,
and just fixes that one omission:
https://patchwork.kernel.org/project/linux-input/patch/20230531082428.21763-1-hadess@xxxxxxxxxx/

Cheers

>
>                      Linus