Re: chrome-platform/for-kernelci bisection: baseline.bootrr.rockchip-dp-probed on rk3399-gru-kevin

From: Guillaume Tucker
Date: Thu Jul 09 2020 - 07:05:19 EST


On 09/07/2020 10:17, Enric Balletbo i Serra wrote:
> Hi,
>
> On 8/7/20 22:32, Guenter Roeck wrote:
>> On 7/8/20 11:59 AM, kernelci.org bot wrote:
>>> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
>>> * This automated bisection report was sent to you on the basis *
>>> * that you may be involved with the breaking commit it has *
>>> * found. No manual investigation has been done to verify it, *
>>> * and the root cause of the problem may be somewhere else. *
>>> * *
>>> * If you do send a fix, please include this trailer: *
>>> * Reported-by: "kernelci.org bot" <bot@xxxxxxxxxxxx> *
>>> * *
>>> * Hope this helps! *
>>> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
>>>
>>> chrome-platform/for-kernelci bisection: baseline.bootrr.rockchip-dp-probed on rk3399-gru-kevin
>>>
>>> Summary:
>>> Start: 154353417996 KERNELCI: x86_64_defconfig: Enable support for Chromebooks devices
>>> Plain log: https://storage.kernelci.org/chrome-platform/for-kernelci/v5.8-rc1-20-g154353417996/arm64/defconfig/gcc-8/lab-collabora/baseline-rk3399-gru-kevin.txt
>>> HTML log: https://storage.kernelci.org/chrome-platform/for-kernelci/v5.8-rc1-20-g154353417996/arm64/defconfig/gcc-8/lab-collabora/baseline-rk3399-gru-kevin.html
>>> Result: 8c9a6ef40bf4 platform/chrome: cros_ec_proto: Convert EC error codes to Linux error codes
>>>
>>> Checks:
>>> revert: PASS
>>> verify: PASS
>>>
>>> Parameters:
>>> Tree: chrome-platform
>>> URL: https://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux.git
>>> Branch: for-kernelci
>>> Target: rk3399-gru-kevin
>>> CPU arch: arm64
>>> Lab: lab-collabora
>>> Compiler: gcc-8
>>> Config: defconfig
>>> Test case: baseline.bootrr.rockchip-dp-probed
>>>
>>> Breaking commit found:
>>>
>>> -------------------------------------------------------------------------------
>>> commit 8c9a6ef40bf400c64c9907031bd32b59f9d4aea2
>>> Author: Guenter Roeck <linux@xxxxxxxxxxxx>
>>> Date: Sat Jul 4 07:26:07 2020 -0700
>>>
>>> platform/chrome: cros_ec_proto: Convert EC error codes to Linux error codes
>>>
>>> The EC reports a variety of error codes. Most of those, with the exception
>>> of EC_RES_INVALID_VERSION, are converted to -EPROTO. As result, the actual
>>> error code gets lost. Convert all EC errors to Linux error codes to report
>>> a more meaningful error to the caller to aid debugging.
>>>
>>> Cc: Yu-Hsuan Hsu <yuhsuan@xxxxxxxxxxxx>
>>> Cc: Prashant Malani <pmalani@xxxxxxxxxxxx>
>>> Signed-off-by: Guenter Roeck <linux@xxxxxxxxxxxx>
>>> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@xxxxxxxxxxxxx>
>>>
>
> So, as Guenter pointed I dropped this patch now.
>
>>
>> So, just FTR, turns out that there are callers which specifically check for
>> -EPROTO and examine the EC error code if it is returned, or just accept
>> -EPROTO as generic failure (but nothing else). Example is drivers/pwm/pwm-cros-ec.c:
>> cros_ec_num_pwms(). Such commands now fail, in this case because
>> EC_RES_INVALID_PARAM is now returned as -EINVAL and cros_ec_num_pwms()
>> doesn't expect that.
>>
>
> Right, that's interesting, and I'll take in consideration for future reworks of
> the above patch and also take a deeper look at those specific cases reported.

This bisection is probably one of the most interesting ones
indeed. I should mention it when I finally get round to making
a "KernelCI bisections hall of fame" blog post.

> BTW, Guillaume, I queued that patch to give a try and test 3 days ago. Is the
> bisection job expected to take that time to run? In this case I think it also
> took some time to receive the build test, so probably is just a matter of having
> lot of jobs in the queue?
>
> I am not complaining at all, just curious, and just want to know to improve my
> maintainer workflow.

I think what you're doing is perfectly fine, there were some
issues with Jenkins and one build server that caused some
KernelCI builds to not be run this week. Also there is an
intermittent bug in LAVA that causes tests to not run, so I think
the delay was due to an unfortunate combination of infrastructure
issues.

We now have a rather fast build server dedicated to bisections,
so for a maintainer branch like yours where it just takes a
handful of iterations I would expect this kind of report to be
sent at most 6h after a git push. Bisecting linux-next can take
a few extra hours with typically 10~15 iterations.

Best wishes,
Guillaume

>> drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c has a similar problem;
>> it only accepts -EPROTO as "valid" error, but nothing else. I didn't check
>> for others.
>>
>> Guenter
>>