Re: hid-related 5.2-rc1 boot hang

From: Hans de Goede
Date: Tue Jun 04 2019 - 04:08:59 EST


Hi,

On 04-06-19 09:51, Benjamin Tissoires wrote:
On Mon, Jun 3, 2019 at 4:17 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:

Hi,

On 03-06-19 15:55, Benjamin Tissoires wrote:
On Mon, Jun 3, 2019 at 11:51 AM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:

Hi Again,

On 03-06-19 11:11, Hans de Goede wrote:
<snip>

not sure about the rest of logitech issues yet) next week.

The main problem seems to be the request_module patches. Although I also

Can't we use request_module_nowait() instead, and set a reasonable
timeout that we detect only once to check if userspace is compatible:

In pseudo-code:

    if (!request_module_checked) {
        request_module_nowait(name);
        use_request_module = wait_event_timeout(wq,
                first_module_loaded, 10 seconds in jiffies);
        request_module_checked = true;
    } else if (use_request_module) {
        request_module(name);
    }

Well, looking at the just-attached dmesg, the modprobe,
when triggered by udev from userspace, succeeds in about
0.5 seconds, so it seems that the modprobe hang happens
when called from within the kernel rather than from within
userspace.

What I do not know is whether the hang is inside userspace, or
whether it happens when modprobe calls back into the kernel.
If the hang happens when modprobe calls back into the kernel,
then other modprobes (done from udev) will likely hang too,
since I think only 1 modprobe can happen at a time.

I really wish we knew what distinguished working systems
from non-working systems :|

I cannot find a common denominator, other than that the systems
are not running Fedora. So far we have reports from both Ubuntu 16.04
and Tumbleweed, so software-version-wise these 2 are wide apart.

I am trying to reproduce the lock locally, and installed openSUSE
Tumbleweed in a VM. When forwarding a Unifying receiver to the VM, I
do not see the lock with either my vanilla compiled kernel or the rpm
found in http://download.opensuse.org/repositories/Kernel:/HEAD/standard/x86_64/

Next step is to install Tumbleweed on bare metal, but I do not see how
this could introduce a difference (maybe USB2 vs 3).

Ok, thank you for looking into this.

have 2 reports of problems with hid-logitech-dj driving the 0xc52f product-id,
so we may need to drop that product-id from hid-logitech-dj, I'm working on
that one...

Besides the modprobe hanging issue, the only other issues
(from 2 reporters) all seem to be with 0xc52f receivers. We have a bug
open for this:

https://bugzilla.kernel.org/show_bug.cgi?id=203619

And I've asked the reporter of the second bug to add his logs
to that bug.

We should likely just remove c52f from the list of supported devices.
C52f receivers seem to have a different firmware as they are meant to
work with different devices than C534. So I guess it is safer to not
handle those right now and get the code in when it is ready.

Ack. Can you prepare a patch to drop the c52f id?

Yes. I have another revert, never submitted, that I need to push, so I
guess I can do a revert session today.

I think I'll also buy one device with hopefully the C52F receiver, as
the report descriptors attached in
https://bugzilla.kernel.org/show_bug.cgi?id=203619 seem different from
what I would have expected.

They are actually what I expected :)

The first USB interface is a mouse boot class device, since this is a mouse
only receiver. This means that the mouse report is unnumbered and we need to
extend the unnumbered mouse-report handling to handle this case. Also the
device is using the same highres mouse-reports as the gaming receiver is.
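
To illustrate the numbered vs. unnumbered distinction, here is a minimal
userspace C sketch. It is not the hid-logitech-dj code; split_report() and
struct report_view are hypothetical names made up for this example. It
only relies on the general HID rule that a report carries a leading
Report ID byte when the report descriptor declares Report IDs, and
carries none otherwise (ID 0 is reserved, so it is used here to mean
"unnumbered").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view onto a raw HID input report (illustration only). */
struct report_view {
	uint8_t id;          /* 0 when the report is unnumbered */
	const uint8_t *data; /* first payload byte */
	size_t len;          /* payload length in bytes */
};

/*
 * Split a raw HID input report into its ID and payload.
 *
 * numbered: true if the report descriptor declares Report IDs, in which
 * case the first byte of the buffer is the ID. An unnumbered report is
 * payload from byte 0 onwards.
 *
 * Returns false if a numbered buffer is too short to hold the ID.
 */
static bool split_report(const uint8_t *buf, size_t len, bool numbered,
			 struct report_view *out)
{
	assert(buf != NULL && out != NULL);

	if (numbered) {
		if (len < 1)
			return false;
		out->id = buf[0];
		out->data = buf + 1;
		out->len = len - 1;
	} else {
		out->id = 0;
		out->data = buf;
		out->len = len;
	}
	return true;
}
```

A handler written only for numbered reports would treat the first
payload byte of an unnumbered report as an ID, which is why the
unnumbered case needs its own path.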

I'm actually preparing a patch right now which should fix this. Still might
be better to do the revert for 5.2 and get proper support for the c52f
receiver into 5.3.

Regards,

Hans