Re: [PATCH] platform/chrome: cros_ec_proto: Lock device when updating MKBP version

From: Patryk Duda
Date: Mon Jul 29 2024 - 07:57:32 EST


On Mon, Jul 29, 2024 at 5:47 AM Tzung-Bi Shih <tzungbi@xxxxxxxxxx> wrote:
>
> On Thu, Jul 25, 2024 at 05:57:13PM +0000, Patryk Duda wrote:
> > The cros_ec_get_host_command_version_mask() function requires that the
> > caller must have ec_dev->lock mutex before calling it. This requirement
> > was not met and as a result it was possible that two commands were sent
> > to the device at the same time.
>
> To clarify:
> - What would happen if multiple cros_ec_get_host_command_version_mask() calls
> at the same time?
In the best case, the MCU receives both commands glued together and
ignores them, which results in a timeout in the kernel. In the worst
case, the request and/or response buffers are corrupted.

> - What are the callees? I'm trying to understand the source of parallelism.
This is a race between interrupt handling and an ioctl call from userspace.

Interrupt handling path:
cros_ec_irq_thread()
  cros_ec_handle_event()
    cros_ec_get_next_event() - Queries host command version without
                               taking ec_dev->lock mutex first
      cros_ec_get_host_command_version_mask()
        cros_ec_send_command()
          cros_ec_xfer_command()
            cros_ec_uart_pkt_xfer()

Command from userspace:
cros_ec_chardev_ioctl()
  cros_ec_chardev_ioctl_xcmd()
    cros_ec_cmd_xfer() - Locks ec_dev->lock mutex before sending command
      cros_ec_send_command()
        cros_ec_xfer_command()
          cros_ec_uart_pkt_xfer()
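
To illustrate why both paths have to take the same mutex, here is a
minimal userspace sketch using pthreads. It is not kernel code: every
name in it (fake_ec_device, fake_xfer, fake_locked_cmd, ...) is
hypothetical and only models the idea that the transport handles one
command at a time.

```c
#include <pthread.h>
#include <stdatomic.h>

#define FAKE_ITERATIONS 10000

/* Hypothetical stand-in for the EC device; only the lock matters here. */
struct fake_ec_device {
	pthread_mutex_t lock;	/* plays the role of ec_dev->lock */
	atomic_int in_flight;	/* commands currently "on the wire" */
	atomic_int overlaps;	/* times a command started while another was in flight */
	atomic_int completed;	/* total commands finished */
};

/* Models the transport. The caller is expected to hold dev->lock. */
static void fake_xfer(struct fake_ec_device *dev)
{
	/* If another command is already in flight, record the overlap. */
	if (atomic_fetch_add(&dev->in_flight, 1) != 0)
		atomic_fetch_add(&dev->overlaps, 1);
	/* A real transfer would write the request and wait for the response here. */
	atomic_fetch_sub(&dev->in_flight, 1);
	atomic_fetch_add(&dev->completed, 1);
}

/* Both the "interrupt" and the "ioctl" path go through this helper. */
static void fake_locked_cmd(struct fake_ec_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	fake_xfer(dev);
	pthread_mutex_unlock(&dev->lock);
}

/* Thread body shared by both simulated paths. */
static void *fake_path(void *arg)
{
	struct fake_ec_device *dev = arg;

	for (int i = 0; i < FAKE_ITERATIONS; i++)
		fake_locked_cmd(dev);
	return NULL;
}

/* Runs the two paths concurrently and returns the number of overlaps seen. */
static int fake_run_two_paths(struct fake_ec_device *dev)
{
	pthread_t irq_thread, ioctl_thread;

	pthread_mutex_init(&dev->lock, NULL);
	atomic_init(&dev->in_flight, 0);
	atomic_init(&dev->overlaps, 0);
	atomic_init(&dev->completed, 0);

	pthread_create(&irq_thread, NULL, fake_path, dev);
	pthread_create(&ioctl_thread, NULL, fake_path, dev);
	pthread_join(irq_thread, NULL);
	pthread_join(ioctl_thread, NULL);

	pthread_mutex_destroy(&dev->lock);
	return atomic_load(&dev->overlaps);
}
```

As long as every caller goes through fake_locked_cmd(), no overlap can
be recorded. Calling fake_xfer() directly from one of the threads would
correspond to the unlocked call in the interrupt path above.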

>
> Also, the patch also needs an unlock at [1].
>
> [1]: https://elixir.bootlin.com/linux/v6.10/source/drivers/platform/chrome/cros_ec_proto.c#L819

Yeah. I'll fix it in v2.
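
For reference, the usual shape of such a fix is a single unlock label
that every exit path passes through. The following is only a userspace
sketch with pthreads and made-up names (demo_dev, demo_query_version,
demo_update_version); it is not the v2 patch.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical device holding only the lock. */
struct demo_dev {
	pthread_mutex_t lock;
};

/* Hypothetical query: returns a version number, or a negative errno on failure. */
static int demo_query_version(int simulate_error)
{
	return simulate_error ? -EIO : 1;
}

/* Takes the lock, queries the version, and unlocks on every exit path. */
static int demo_update_version(struct demo_dev *dev, int simulate_error,
			       int *version)
{
	int ret;

	pthread_mutex_lock(&dev->lock);

	ret = demo_query_version(simulate_error);
	if (ret < 0)
		goto unlock;	/* error path must not return with the lock held */

	*version = ret;
	ret = 0;

unlock:
	pthread_mutex_unlock(&dev->lock);
	return ret;
}
```

An early "return ret;" in place of the goto would leave the mutex
locked, and the next caller would block forever.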

>
> > The problem was observed while using UART backend which doesn't use any
> > additional locks, unlike SPI backend which locks the controller until
> > response is received.
>
> Is it a general issue if multiple commands send to EC at a time? If yes, it
> should serialize that in the UART transportation.

Host Commands support only one command at a time. This is enforced by the
'lock' mutex in the cros_ec_device structure. We just need to use it properly.