Re: [regression] Since kernel 6.3.1 logitech unify receiver not working properly

From: Linus Torvalds
Date: Mon May 22 2023 - 14:24:25 EST


On Mon, May 22, 2023 at 5:38 AM Linux regression tracking (Thorsten
Leemhuis) <regressions@xxxxxxxxxxxxx> wrote:
>
> FWIW, in case anybody is interested in a status update: one reporter
> bisected the problem down to 586e8fede79 ("HID: logitech-hidpp: Retry
> commands when device is busy"); reverting that commit on-top of 6.3
> fixes the problem for that reporter. For that reporter things also work
> on 6.4-rc; but for someone else that is affected that's not the case.

Hmm. It's likely timing-dependent.

But that code is clearly buggy.

If the wait_event_timeout() returns early, the device hasn't replied,
but the code does

if (!wait_event_timeout(hidpp->wait, hidpp->answer_available,
5*HZ)) {
dbg_hid("%s:timeout waiting for response\n", __func__);
memset(response, 0, sizeof(struct hidpp_report));
ret = -ETIMEDOUT;
}

and then continues to look at the response _anyway_.

Now, depending on out hardening options, that response may have been
initialized by the compiler, or may just be random stack contents.

That bug is pre-existing (ie the problem was not introduced by that
commit), but who knows if the retry makes things worse (ie if it then
triggers on a retry, the response data will be the *previous*
response).

The whole "goto exit" games should be removed too, because we're in a
for-loop, and instead of "goto exit" it should just do "break".

IOW, something like this might be worth testing.

That said, while I think the code is buggy, I doubt this is the actual
cause of the problem people are reporting. But it would be lovely to
hear if the attached patch makes any difference, and I think this is
fixing a real - but unlikely - problem anyway.

And obviously it might be helpful to actually enable those dbg_hid()
messages, but I didn't look at what the magic config option to do so
was.

NOTE! Patch below *ENTIRELY* untested. I just looked at the code when
that commit was mentioned, and went "that's not right"...

Linus
drivers/hid/hid-logitech-hidpp.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c
index 0fcfd85fea0f..72bd62d2f984 100644
--- a/drivers/hid/hid-logitech-hidpp.c
+++ b/drivers/hid/hid-logitech-hidpp.c
@@ -306,7 +306,7 @@ static int hidpp_send_message_sync(struct hidpp_device *hidpp,
if (ret) {
dbg_hid("__hidpp_send_report returned err: %d\n", ret);
memset(response, 0, sizeof(struct hidpp_report));
- goto exit;
+ break;
}

if (!wait_event_timeout(hidpp->wait, hidpp->answer_available,
@@ -314,13 +314,14 @@ static int hidpp_send_message_sync(struct hidpp_device *hidpp,
dbg_hid("%s:timeout waiting for response\n", __func__);
memset(response, 0, sizeof(struct hidpp_report));
ret = -ETIMEDOUT;
+ break;
}

if (response->report_id == REPORT_ID_HIDPP_SHORT &&
response->rap.sub_id == HIDPP_ERROR) {
ret = response->rap.params[1];
dbg_hid("%s:got hidpp error %02X\n", __func__, ret);
- goto exit;
+ break;
}

if ((response->report_id == REPORT_ID_HIDPP_LONG ||
@@ -329,13 +330,12 @@ static int hidpp_send_message_sync(struct hidpp_device *hidpp,
ret = response->fap.params[1];
if (ret != HIDPP20_ERROR_BUSY) {
dbg_hid("%s:got hidpp 2.0 error %02X\n", __func__, ret);
- goto exit;
+ break;
}
dbg_hid("%s:got busy hidpp 2.0 error %02X, retrying\n", __func__, ret);
}
}

-exit:
mutex_unlock(&hidpp->send_mutex);
return ret;