Re: [PATCH v4 1/3] platform/chrome: cros_ec_spi: Move to real time priority for transfers
From: Enric Balletbo i Serra
Date: Tue May 21 2019 - 03:55:23 EST
Hi,
On 15/5/19 19:02, Guenter Roeck wrote:
> On Wed, May 15, 2019 at 9:48 AM Douglas Anderson <dianders@xxxxxxxxxxxx> wrote:
>>
>> In commit 37a186225a0c ("platform/chrome: cros_ec_spi: Transfer
>> messages at high priority") we moved transfers to a high priority
>> workqueue. This helped make them much more reliable.
>>
>> ...but, we still saw failures.
>>
>> We were actually finding ourselves competing for time with dm-crypt
>> which also scheduled work on HIGHPRI workqueues. While we can
>> consider reverting the change that made dm-crypt run its work at
>> HIGHPRI, the argument in commit a1b89132dc4f ("dm crypt: use
>> WQ_HIGHPRI for the IO and crypt workqueues") is somewhat compelling.
>> It does make sense for IO to be scheduled at a priority that's higher
>> than the default user priority. It also turns out that dm-crypt isn't
>> alone in using high priority like this. loop_prepare_queue() does
>> something similar for loopback devices.
>>
>> Looking in more detail, it can be seen that the high priority
>> workqueue isn't actually that high of a priority. It runs at MIN_NICE
>> which is _fairly_ high priority but still below all real time
>> priority.
>>
>> Should we move cros_ec_spi to real time priority to fix our problems,
>> or is this just escalating a priority war? I'll argue here that
>> cros_ec_spi _does_ belong at real time priority. Specifically
>> cros_ec_spi actually needs to run quickly for correctness. As I
>> understand it, this is exactly what real time priority is for.
>>
>> There currently doesn't appear to be any way to use the standard
>> workqueue APIs with a real time priority, so we'll switch over to
>> using a kthread worker. We'll match the priority that the SPI
>> core uses when it wants to do things on a realtime thread and just use
>> "MAX_RT_PRIO - 1".
>>
>> This commit plus the patch ("platform/chrome: cros_ec_spi: Request the
>> SPI thread be realtime") are enough to get communications very close
>> to 100% reliable (the only known problem left is when serial console
>> is turned on, which isn't something that happens in shipping devices).
>> Specifically this test case now passes (tested on rk3288-veyron-jerry):
>>
>>   dd if=/dev/zero of=/var/log/foo.txt bs=4M count=512&
>>   while true; do
>>     ectool version > /dev/null;
>>   done
>>
>> It should be noted that "/var/log" is encrypted (and goes through
>> dm-crypt) and also passes through a loopback device.
>>
>> Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
>
> Reviewed-by: Guenter Roeck <groeck@xxxxxxxxxxxx>
>
Added to the for-next branch for the autobuilders to play with. If all goes
well, this will be queued in chrome-platform-5.3.
Thanks,
Enric