Re: seccomp ptrace selftest failures with 4.4-stable [Was: Re: LTS testing with latest kselftests - some failures]

From: Shuah Khan
Date: Thu Jun 22 2017 - 16:24:06 EST


Hi Tom,

On 06/22/2017 01:48 PM, Tom Gall wrote:
> Hi
>
> On Thu, Jun 22, 2017 at 2:06 PM, Shuah Khan <shuah@xxxxxxxxxx> wrote:
>> On 06/22/2017 11:50 AM, Kees Cook wrote:
>>> On Thu, Jun 22, 2017 at 10:49 AM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>>> On Thu, Jun 22, 2017 at 10:09 AM, Shuah Khan <shuah@xxxxxxxxxx> wrote:
>>>>> On 06/22/2017 10:53 AM, Kees Cook wrote:
>>>>>> On Thu, Jun 22, 2017 at 9:18 AM, Sumit Semwal <sumit.semwal@xxxxxxxxxx> wrote:
>>>>>>> Hi Kees, Andy,
>>>>>>>
>>>>>>> On 15 June 2017 at 23:26, Sumit Semwal <sumit.semwal@xxxxxxxxxx> wrote:
>>>>>>>> 3. 'seccomp ptrace hole closure' patches got added in 4.7 [3] -
>>>>>>>> feature and test together.
>>>>>>>> - This one also seems like a security hole being closed, and the
>>>>>>>> 'feature' could be a candidate for stable backports, but Arnd tried
>>>>>>>> that, and it was quite non-trivial. So perhaps we'll need some help
>>>>>>>> from the subsystem developers here.
>>>>>>>
>>>>>>> Could you please help us sort this out? Our goal is to help Greg with
>>>>>>> testing stable kernels, and currently the seccomp tests fail due to
>>>>>>> missing feature (seccomp ptrace hole closure) getting tested via
>>>>>>> latest kselftest.
>>>>>>>
>>>>>>> If you feel the feature isn't a stable candidate, then could you
>>>>>>> please help make the test degrade gracefully in its absence?
>>
>> In some cases, it is not easy to degrade gracefully and/or check for a
>> feature. Several security features likely fall into this bucket.
>>
>>>>>>
>>>>>> I don't really want to have that change be a backport -- it's quite
>>>>>> invasive across multiple architectures.
>>
>> Agreed. The same rule that applies to the kernel applies to tests as
>> well. If a kernel feature can't be back-ported, the test for that feature
>> falls into the same bucket: it shouldn't be back-ported either.
>>
>>>>>>
>>>>>> I would say just add a kernel version check to the test. This is
>>>>>> probably not the only selftest that will need such things. :)
>>>>>
>>>>> Adding release checks to selftests is going to be problematic for maintenance.
>>>>> Tests should fail gracefully if a feature isn't supported in older kernels.
>>>>>
>>>>> Several tests do that now; please find a way to check for dependencies
>>>>> and feature availability and fail the test gracefully. If there is a test
>>>>> that can't do that for some reason, we can discuss it, but as a general
>>>>> rule, I don't want to see kselftest patches that check the release.
>>>>
>>>> If a future kernel inadvertently loses the new feature and degrades to
>>>> the behavior of old kernels, that would be a serious bug and should be
>>>> caught.
>>
>> Agreed. If I understand you correctly, by not testing stable kernels
>> with their own selftests, some serious bugs could go undetected.
>
> Personally I'm a bit skeptical. I think the reasoning is more that the
> latest selftests provide more coverage, and therefore should be better
> tests, even on older kernels.

The assumption that "the latest selftests provide more coverage, and
therefore should be better tests, even on older kernels" is incorrect.

Selftests in general track kernel features. In some cases, new tests
are added that provide better coverage on older kernels; however, it is
more likely that new tests are added to cover new kernel features or
enhancements to existing features. Because of the second case,
enhancements to existing features, it is more important to test newer
kernels with older selftests than the reverse. This does happen during
kernel integration cycles in development.

As a general rule, testing stable kernels with their own selftests will
yield the best results.

>
>>>
>>> Right. I really think stable kernels should be tested with their own
>>> selftests. If some test is needed in a stable kernel it should be
>>> backported to that stable kernel.
>>
>> Correct. This is always a safe option. There might be cases that even
>> prevent tests from being built, especially if a new feature adds new
>> fields to an existing structure.
>>
>> It appears that in some cases, users want to run newer tests on older
>> kernels. Some tests can clearly detect feature support from module
>> presence and/or whether a Kconfig option is enabled. These conditions
>> arise even on a kernel that supports the new module or config option:
>> the kernel the test is running on might not have the feature enabled,
>> or the module might not be present. In these cases, it is easier to
>> detect the condition and skip the test.
>>
>> However, some features aren't so easy to detect. For example:
>>
>> - a new flag is added to a syscall, and a new test is added for it. It
>> might not be easy to detect that the flag is unsupported.
>> - Some tests might not be able to detect the missing feature and skip.
>>
>> Based on this discussion, it is probably accurate to say:
>>
>> 1. It is recommended that a kernel be tested with selftests from the
>> same release.
>> 2. Selftests from newer kernels will run on older kernels, but users
>> should understand the risks: some tests might fail, and some might not
>> detect feature-degradation bugs.
>> 3. Selftests will fail gracefully on older releases if at all possible.
>
> How about gracefully being skipped instead of failing?

Yes. That is the goal, and that is what tests do: they detect
dependencies on features, modules, and config options and decide to
skip the test. If a test doesn't do that, it gets fixed.
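
For example, here is a minimal sketch of the kind of dependency probe
a test can use: check /proc/modules for a required module and skip,
rather than fail, when it is absent. The module name below is a
placeholder, not from any actual selftest; exit code 4 is the
kselftest convention for a skipped test.

/*
 * Sketch: skip, don't fail, when a required module is absent.
 * "test_module" is a placeholder name.
 */
#include <stdio.h>
#include <string.h>

#define KSFT_SKIP 4	/* kselftest exit code for "skipped" */

static int module_loaded(const char *name)
{
	size_t len = strlen(name);
	char line[256];
	FILE *f = fopen("/proc/modules", "r");

	if (!f)
		return 0;
	while (fgets(line, sizeof(line), f)) {
		/* /proc/modules lines start with "name size ..." */
		if (!strncmp(line, name, len) && line[len] == ' ') {
			fclose(f);
			return 1;
		}
	}
	fclose(f);
	return 0;
}

int main(void)
{
	if (!module_loaded("test_module")) {
		printf("test_module not loaded, skipping\n");
		return KSFT_SKIP;
	}
	/* ... exercise the feature here ... */
	return 0;
}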

>
> The latter suggests the test case in some situations can detect that it's
> pointless to run something and say as much, instead of emitting a
> failure that would be a waste of time to look into.

Right. Please see above. However, correctly detecting dependencies
isn't possible in all cases. In some cases, failing is all a test can do.
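
To illustrate the syscall-flag case: one pattern, along the lines of
what the seccomp selftests do for filter flags, is to invoke the
syscall with the new flag and deliberately invalid arguments, then
infer support from the errno. This is only a sketch, assuming headers
new enough to define __NR_seccomp; it is not the test under discussion
here.

/*
 * Sketch: probe for a syscall flag at run time. Passing a NULL
 * filter means nothing is actually installed. EINVAL suggests the
 * kernel doesn't know the flag (skip); EFAULT suggests the flag was
 * accepted and argument validation ran (feature present).
 */
#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/seccomp.h>

#ifndef SECCOMP_FILTER_FLAG_TSYNC
#define SECCOMP_FILTER_FLAG_TSYNC 1	/* from newer <linux/seccomp.h> */
#endif

int main(void)
{
	long ret = syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER,
			   SECCOMP_FILTER_FLAG_TSYNC, NULL);

	if (ret == -1 && errno == EINVAL) {
		printf("flag not recognized by this kernel, skipping\n");
		return 4;	/* kselftest skip exit code */
	}
	/* EFAULT (or success) implies the kernel knows the flag. */
	return 0;
}

Even then, a pure behavioral change like the seccomp ptrace hole
closure exposes no flag to probe, which is why some tests can only fail.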

>
> As another example take tools/testing/selftests/net/psock_fanout.c
> On 4.9 it'll fail to compile (using master's selftests) because
> PACKET_FANOUT_FLAG_UNIQUEID isn't defined. Add a simple #ifdef for
> that symbol and the psock_fanout test will compile and run just fine.
>
>> Sumit!
>>
>> 1. What are the reasons for testing an older kernel with selftests from
>> newer kernels? What are the benefits you see in doing so?
>
> I think the presumption is that the latest, greatest collection of
> selftests is the best and most complete.

Not necessarily the case.

>
>> I am looking to understand the need/reasons for this use case. In our
>> previous discussion on this subject, I did say you should be able to
>> do so, with some exceptions.
>>
>> 2. Do you test kernels with the selftests from the same release?
>
> We have the ability to do either. The new shiny .... it calls.

If the only reason is "shiny", I would say you might not be getting
the best results possible.

>
>> 3. Do you find testing with newer selftests to be useful?
>
> I think it comes down to coverage, and again the current perception
> that latest-greatest is better. Quantitatively we haven't collected
> data to support that position, though it would be interesting to compare,
> say, a 4.4-lts tree and its selftests directory to mainline, see how much
> is new, and then find out how many of those new selftests actually
> work on the older 4.4-lts.
>

As I explained above, the assumption/perception that "the latest selftests
provide more coverage, and therefore should be better tests, even on older
kernels" is incorrect.

Collecting data to see whether newer selftests provide better coverage
might or might not be a worthwhile exercise. Some releases include tests
for existing features and some don't; the mix varies from release to
release. As a general rule, "selftests are intended to track, and do
track, the features of their release" is a good assumption.

Fixing tests from newer releases so they "never fail" on older releases
might not give us the best ROI as a whole. These need to be evaluated on
a case-by-case basis.
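
For instance, the psock_fanout fix Tom describes above could look
roughly like the sketch below: guard the sub-test at compile time so
old uapi headers don't break the build. This is illustrative, not the
actual patch.

#include <stdio.h>
#include <linux/if_packet.h>

int main(void)
{
#ifdef PACKET_FANOUT_FLAG_UNIQUEID
	/* ... exercise PACKET_FANOUT_FLAG_UNIQUEID here ... */
#else
	printf("PACKET_FANOUT_FLAG_UNIQUEID not in headers, skipping\n");
#endif
	return 0;
}

Note that this guards against old headers at build time; whether the
running kernel supports the flag still has to be checked at run time,
as discussed above.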

Based on this discussion, and now that we understand that an incorrect
assumption and/or misperception is the basis for choosing to test stable
kernels with selftests from new releases, I would recommend the following
approach:

1. Testing stable kernels with their own selftests will yield the best
results.
2. Testing stable kernels with newer selftests could be done if the user
finds that it provides better coverage, knowing that there is no guarantee
that it will.

thanks,
-- Shuah