Re: [PATCH v10 1/4] HID: hid-msi: Add MSI Claw configuration driver
From: Derek John Clark
Date: Fri May 29 2026 - 02:39:25 EST
On Wed, May 27, 2026 at 10:32 PM Dmitry Torokhov
<dmitry.torokhov@xxxxxxxxx> wrote:
>
> Hi Derek,
>
> On Wed, May 27, 2026 at 10:21:19PM +0000, Derek J. Clark wrote:
> > Adds configuration HID driver for the MSI Claw series of handheld PC's.
> > In this initial patch add the initial driver outline and attributes for
> > changing the gamepad mode, M-key behavior, and add a WO reset function.
> >
> > Sending the SWITCH_MODE and RESET commands causes a USB disconnect in
> > the device. The completion will therefore never get hit and would trigger
> > an -EIO. To avoid showing the user an error for every write to these
> > attrs a bypass for the completion handling is introduced when timeout ==
> > 0.
> >
> > The initial version of this patch was written by Denis Benato, which
> > contained the initial reverse-engineering and implementation for the
> > gamepad mode switching. This work was later expanded by Zhouwang Huang
> > to include more gamepad modes. Finally, I refactored the drivers data
> > in/out flow and overall format to conform to kernel driver best
> > practices and style guides. Claude was used as an initial reviewer of
> > this patch.
>
> I wonder why do you need to roll asynchronous probing and asynchronous
> resume by hand? This I think complicates the driver greatly and forces
> you to use a ton of works, spinlocks, and checks.
>
> Thanks.
>
Hi Dmitry,
I suppose being asked this means my cover letter and commit
descriptions need some additional context. The MCU in these Claw
devices is quite temperamental. There are a few specific issues that
cause the need for multiple work queues, a serialization mutex, and
subsequently spinlocks to prevent stale data reads.
1.) The MCU will halt function if it receives any output reports
before ~500MS after probe or resume. This can either manifest as the
device never responding to a command, or it can cause the entire
system to become unstable and reboot. This creates the need for
cfg_setup to query the MCU and then add the gamepad attrs, led_mc
device, and rgb attrs. As a side effect, because a system could
technically be suspended during that 500ms delay, there exists the
need to re-queue the work if it was never triggered, hence the resume
queue.
2.) The MCU will not always respond in order if two or more output
reports are sent within a few ms of each other. Since many of the
commands use a generic "ACK", or share an "ACK" type but don't provide
specific context about what sub-function called them, we could
potentially have cross talk where data is saved in the wrong attribute
or errors propagate because of a missed message. To get around this
serialization issue we hold a mutex through a completion triggered in
raw_event and, for most events, save a state machine on what command
is expected and what sub-command was the initiator. (I.E. profile
events handle the M1, M2, RGB, Left rumble, and right rumble). Since
the state machine is accessed on both sides, we need spinlocks
guarding the reads. This essentially serializes the data and makes it
predictable. Using this pattern I haven't had any issues reading from
or writing to the MCU.
3.) Some commands will never return their "ACK" while a completion is
held, so we have a workaround to basically ignore them and hope the
command worked. This is only needed for SYNC_TO_ROM, from which we
don't need to set anything on its "ACK", and switching the gamepad
mode, which causes USB disconnect/reconnect and the driver fully
reloads, so we'll never be able to read it anyway.
4.) The RGB work queue is used to free the userspace write while the
completion is held. I found that use without it could stall userspace
quite significantly if it has multiple writes back to back. I
experienced this using Steam's customization menu, which sends a
single write for every increment of its color and brightness sliders.
when traversing the full length of the slider it is possible to have
effects changing for nearly a minute after stopping. With the queue,
only the most recent write is eventually sent to the MCU. This issue
also affects the Go 2 driver as well, though not to the same extent,
but for which I'll be adding a similar de-bounce queue soon. Go S is
also technically affected by this bug, but that returns quickly enough
that it isn't really feasible to trigger the bug with much frequency.
I'll still fix that one as well though.
TBH I'm not "happy" with the complexity of the driver, but I don't see
a reasonable alternative. If you have any specific suggestions that I
could try that might simplify it, I'd be more than happy to give it a
shot. That being said, I'm not very optimistic about it. Development
on this device has been like wrestling a bear.
- Derek
> --
> Dmitry