Re: [PATCH v2] kselftest/alsa: Increase kselftest timeout
From: Nícolas F. R. A. Prado
Date: Mon Apr 03 2023 - 17:30:13 EST
On Wed, Dec 14, 2022 at 06:15:22PM +0000, Mark Brown wrote:
> On Wed, Dec 14, 2022 at 09:40:02AM -0700, Shuah Khan wrote:
> > On 12/14/22 06:03, Nícolas F. R. A. Prado wrote:
>
> > > The default timeout for kselftests is 45 seconds, but that isn't enough
> > > time to run pcm-test when there are many PCMs on the device, nor for
> > > mixer-test when slower control buses and fancier CODECs are present.
> > >
> > > As data points, running pcm-test on mt8192-asurada-spherion takes about
> > > 1m15s, and mixer-test on rk3399-gru-kevin takes about 2m.
> > >
> > > Set the timeout to 4 minutes to allow both pcm-test and mixer-test to
> > > run to completion with some slack.
>
> > What I have in mind is that the default run can be limited scope.
> > Run it on a few controllers and in the report mention that a full
> > test can be run as needed.
>
> > There are a couple of examples of default vs. full test runs - cpu
> > and memory hot-lug tests.
>
> For pcm-test it's probably more sensible to refactor things to run
> multiple PCMs (or at least cards, though that's less relevant in an
> embedded context) in parallel rather than cut down the test coverage,
> it's already very limited coverage as things stand. There is some risk
> there could be false positives from cross talk between the PCMs but it's
> probably worth it.
>
> With mixer-test if it's actually taking a long time to run generally
> this is just identifying that the driver could use some work,
> implementing runtime power management and a register cache will probably
> resolve most issues.
Hi Shuah and Mark,
sorry for the delay, but I'd still like to move forward with this.
Shuah, I've checked the tests you mentioned that have limited scope by default,
and we could do the same for the alsa kselftest, but I'm not sure I understand
how this solves the problem here. The fact is that the current timeout is too
short for a full run of the alsa kselftest on some machines, so we need to
increase the timeout in any case regardless of there being a limited scope run
by default or not. Mark implemented the parallelization he mentioned in the
meantime, but it doesn't help every hardware. The only other option I see is
reducing the time the PCM is tested for (currently 4 seconds). But I assume that
number was chosen for a reason.
I'd also like to better understand why we have an arbitrary (45 seconds) default
timeout. If there are users who want to limit the test duration to an arbitrary
time even if that means not getting full test coverage, shouldn't such arbitrary
time be supplied by the users themselves?
And is there any guidance on what are the acceptable exceptions to having a
longer timeout? Because there seem to be many kselftests which override the
default timeout with a longer one, even ones that disable it altogether.
I can see the value of having a timeout as the worst case scenario of how long
the test takes to run, to avoid hanging indefinitely, which is what the tests
with overriden timeout setting do. For the alsa kselftests that's
hardware-dependent (number of kcontrols for mixer-test; and number of PCMs and
working configurations for pcm-test), so we'd either need to guess a high enough
value for the timeout that all known hardware fits, or allow the timeout to be
set dynamically during runtime.
Thanks,
Nícolas