Re: RFC - kernel selftest result documentation (KTAP)

From: Frank Rowand
Date: Fri Jun 19 2020 - 14:33:24 EST


On 2020-06-16 11:42, Bird, Tim wrote:
>
>
>> -----Original Message-----
>> From: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>>
>> On 15/06/20 21:07, Bird, Tim wrote:
>>>> Note: making the plan line required differs from TAP13 and TAP14. I
>>>> think it's the right choice, but we should be clear.
>>
>> As an aside, where is TAP14?
> By TAP14, I was referring to the current, undocumented, KUnit
> conventions.

When "TAP14" is used in this discussion, let's please use the current
proposed TAP14 spec which Brendan has provided a link to. If you
want to describe current KUnit conventions, please say current
KUnit conventions.

>
>>
>>> With regards to making it optional or not, I don't have a strong
>>> preference. The extra info seems helpful in some circumstances.
>>> I don't know if it's too onerous to make it a requirement or not.
>>> I'd prefer if it was always there (either at the beginning or the end),
>>> but if there is some situation where it's quite difficult to calculate,
>>> then it would be best not to mandate it. I can't think of any impossible
>>> situations at the moment.
>>
>> I think making the plan mandatory is a good idea. "Late plans" work
>> very well for cases where you cannot know in advance the number of tests
>> (for example in filters that produce TAP from other output), and provide
>> an additional safety net.
>>
>>>> "Bail out!" to be moved to "optional" elements, since it may not appear.
>>>> And we should clarify TAP13 and TAP14's language to say it should only
>>>> appear when the test is aborting without running later tests -- for this
>>>> reason, I think the optional "description" following "Bail out!" should
>>>> be made required. I.e. it must be: "Bail out! $reason"
>>>
>>> I'll make sure this is listed as optional.
>>> I like adding a mandatory reason.
>>
>> +1.
>>
>>>> TAP13/14 makes description optional, are we making it required (I think
>>>> we should). There seems to be a TAP13/14 "convention" of starting
>>>> <description> with "- ", which I'm on the fence about. It does make
>>>> parsing maybe a little easier.
>>>
>>> I would like the description to be required.
>>> I don't have a strong opinion on the dash. I'm OK with either one (dash
>>> or no dash), but we should make kselftest and KUnit consistent.
>>
>> I think no mandatory dash is better (or even mandatory no-dash!). We
>> can suggest removing it when formatting TAP output.
>
> My personal preference is to have the dash. I think it's more human readable.
> I note that the TAP spec has examples of result lines both with and without
> the dash, so even the spec is ambiguous on this. I think not mandating it
> either way is probably best. For regex parsers, it's easy to ignore with '[-]?'
> outside the pattern groups that grab the number and description.
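For illustration, a minimal parser sketch of the approach Tim describes (hypothetical Python, not taken from any existing framework): the optional dash is consumed outside the capture groups, so lines with and without it parse identically.

```python
import re

# Result line: "ok"/"not ok", test number, optional " - ", description.
# The "(?:- )?" group swallows the dash convention without capturing it.
RESULT_RE = re.compile(r"^(ok|not ok) (\d+) ?(?:- )?(.*)$")

def parse_result(line):
    """Return (passed, number, description) or None if not a result line."""
    m = RESULT_RE.match(line)
    if not m:
        return None
    result, num, desc = m.groups()
    return result == "ok", int(num), desc

# Dashed and dash-less forms yield the same parsed tuple:
print(parse_result("ok 1 - test_foo"))  # (True, 1, 'test_foo')
print(parse_result("ok 1 test_foo"))    # (True, 1, 'test_foo')
```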
>
>>
>>>>> Finally, it is possible to use a test directive to indicate another
>>>>> possible outcome for a test: that it was skipped. To report that
>>>>> a test case was skipped, the result line should start with the
>>>>> result "not ok", and the directive "# SKIP" should be placed after
>>>>> the test description. (Note that this deviates from the TAP13
>>>>> specification).
>>
>> How so? The description comes first, but there can be a description of
>> the directive.
> None of the examples of skips in the TAP13 spec have a test description before
> the '# SKIP' directive. But maybe I read too much into the examples. There is a
> format example, and a list of items in a result line that both have the test description
> before the directive. So maybe I read this wrong.

Yes, I think you read too much into the examples. I think the TAP spec
is very hard to read in the current form (v13 and proposed v14). If
we create a KTAP spec, I can do editing to clean up some of the issues
or give review comments on the clarity issues I see and how
to fix them.

I read the spec as saying that the description is optional, but if the
description exists in a test line that contains "# TODO explanation"
then the description will precede the "#".

(Yes, you are seeing "I volunteer" because the current spec is so
frustrating to me.)

>
>>
>> not ok 4 - Summarized correctly # TODO Not written yet
>>
>>>>> It is usually helpful if a diagnostic message is emitted to explain
>>>>> the reasons for the skip. If the message is a single line and is
>>>>> short, the diagnostic message may be placed after the '# SKIP'
>>>>> directive on the same line as the test result. However, if it is
>>>>> not on the test result line, it should precede the test line (see
>>>>> diagnostic data, next).
>>>>>
>>>>> Bail out!
>>>>> ---------
>>>>> If a line in the test output starts with 'Bail out!', it indicates
>>>>> that the test was aborted for some reason: the test is unable to
>>>>> proceed, and no additional tests will be performed.
>>>>>
>>>>> This can be used at the very beginning of a test, or anywhere in the
>>>>> middle of the test, to indicate that the test cannot continue.
>>>>
>>>> I think the required syntax should be:
>>>>
>>>> Bail out! <reason>
>>>>
>>>> And to make it clear that this is optionally used to indicate an early
>>>> abort. (Though with a leading plan line, a parser should be able to
>>>> determine this on its own.)
>>
>> True. However, "Bail out!" allows distinguishing issues with the harness
>> (such as ENOSPC) from test aborts.
>>
>>>>> - TODO directive
>>>>
>>>> Agreed: SKIP should cover everything TODO does.
>>
>> XFAIL/XPASS are different from SKIP. I personally don't have a need for
>> them, but kselftests includes XFAIL/XPASS exit codes and they aren't
>> reflected into selftests/kselftest/runner.sh.
>>
>> Likewise, kselftest.h has ksft_inc_xfail_cnt but not
>> ksft_test_result_xfail/ksft_test_result_xpass.
>>
>> It's important to notice in the spec that the TODO directive inverts the
>> direction of ok/not-ok (i.e. XFAIL, the "good" result, is represented by
>> "not ok # TODO").
>
> The TAP13 spec is not explicit about the result for TODO (and only provides
> one example), but the text *does* say a TODO can represent a bug to be fixed.
> This makes it the equivalent of XFAIL. I hadn't noticed this before. Thanks.

TAP 13 spec:

"Note that if the TODO has an explanation it must be separated from TODO by a
space. These tests represent a feature to be implemented or a bug to be fixed
and act as something of an executable "things to do" list. They are not expected
to succeed. Should a todo test point begin succeeding, the harness should report
it as a bonus. This indicates that whatever you were supposed to do has been done
and you should promote this to a normal test point."

That seems very clear and explicit to me about what the intent and usage of TODO is.
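As a sketch of the inversion Paolo points out (hypothetical Python, not from any existing parser): under TODO, "not ok" is the expected outcome and "ok" is the bonus.

```python
def interpret(ok, directive):
    """Map a raw TAP result plus directive to a harness-level outcome.

    Per the TAP 13 TODO semantics quoted above: "not ok # TODO" is the
    expected outcome (XFAIL), while "ok # TODO" is a bonus (XPASS) that
    should be promoted to a normal test point.
    """
    if directive == "SKIP":
        return "SKIP"
    if directive == "TODO":
        return "XPASS" if ok else "XFAIL"   # direction of ok/not-ok inverted
    return "PASS" if ok else "FAIL"

print(interpret(False, "TODO"))  # XFAIL: the "good" result for a TODO test
print(interpret(True, "TODO"))   # XPASS: bonus, promote to a normal test
```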


>
>>
>>>>> - test identifier
>>>>> - multiple parts, separated by ':'

I don't find "identifier" when I grep for it in KUnit related places. And
it has been a while since I've read through the KUnit code. So I am a
little lost about what a "test identifier" is. Is this part of the test
line "Description"? Any pointers to what this is?

-Frank


>>>>
>>>> This is interesting... is the goal to be able to report test status over
>>>> time by name?
>>
>> What about "/" instead?
> In my experience, / is used in a lot of test descriptions (when quoting
> file paths) (not in kselftest, but in lots of other tests). Both Fuego
> and KernelCI use colons, and that's what kselftest already uses,
> so it seems like a good choice.
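To illustrate the colon convention (hypothetical Python; the example identifier below mimics kselftest-style output and is an assumption, not taken from the spec under discussion):

```python
def split_identifier(ident):
    """Split a colon-separated test identifier into its parts,
    trimming the whitespace that typically follows each colon."""
    return [part.strip() for part in ident.split(":")]

# Example identifier in the kselftest-like "suite: group: test" shape:
print(split_identifier("selftests: timers: posix_timers"))
```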
>
>>
>>>>> Finally,
>>>>> - Should a SKIP result be 'ok' (TAP13 spec) or 'not ok' (current kselftest practice)?
>>>>> See https://testanything.org/tap-version-13-specification.html
>>>>
>>>> Oh! I totally missed this. Uhm. I think "not ok" makes sense to me "it
>>>> did not run successfully". ... but ... Uhhh ... how do XFAIL and SKIP
>>>> relate? Neither SKIP nor XFAIL count toward failure, though, so both
>>>> should be "ok"? I guess we should change it to "ok".
>>
>> See above for XFAIL.
>>
>> I initially raised the issue with "SKIP" because I have a lot of tests
>> that depend on hardware availability---for example, a test that does not
>> run on some processor kinds (e.g. on AMD, or old Intel)---and for those
>> SKIP should be considered a success.
>>
>> Paolo
>>
>>> I have the same initial impression. In my mind, a skip is "not ok", since
>>> the test didn't run. However, a SKIP should be treated differently
>>> from either "ok" or "not ok" by the results interpreter, so I don't think it
>>> matters. Originally I was averse to changing the SKIP result to "ok"
>>> (as suggested by Paolo Bonzini [1]), but I checked and it's pretty trivial to
>>> change the parser in Fuego, and it would make the kernel results format
>>> match the TAP13 spec. I don't see a strong reason for us to be different
>>> from TAP13 here.
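A parser that keys on the SKIP directive rather than the ok/not-ok prefix is unaffected by this change, e.g. (hypothetical Python sketch, not Fuego's actual parser):

```python
def outcome(line):
    """Classify a single TAP result line.

    Treat any result carrying "# SKIP" as a skip, regardless of whether
    the line says "ok" (TAP13 spec) or "not ok" (old kselftest practice),
    so the parser works with both conventions.
    """
    if "# SKIP" in line:
        return "SKIP"
    return "PASS" if line.startswith("ok") else "FAIL"

print(outcome("ok 3 # SKIP no hardware"))       # SKIP
print(outcome("not ok 3 # SKIP no hardware"))   # SKIP
```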
>>>
>>> I raised this issue on our automated testing conference call last week
>>> (which includes people from the CKI, Fuego, KernelCI and LKFT projects), and
>>> so people should be chiming in if their parser will have a problem with this change.
>>>
>>> [1] https://lkml.kernel.org/lkml/20200610154447.15826-1-pbonzini@xxxxxxxxxx/T/
>>>
>>> Thanks very much for the feedback.
>>> -- Tim
>>>
>