Re: regression in 4.14-rc2 caused by apparmor: add base infastructure for socket mediation

From: Linus Torvalds
Date: Thu Oct 26 2017 - 14:13:17 EST


On Thu, Oct 26, 2017 at 11:11 AM, Thorsten Leemhuis
<regressions@xxxxxxxxxxxxx> wrote:
>
> All that afaics doesn't matter. If a new kernel breaks things for people
> (that especially includes people that do *not* update their userland)
> then it's a kernel regression, even if the root of the problem is in
> userland. Linus (CCed) said that often enough (I really should sit down
> and collect his mails on this from the web and put them in one
> document).

Thorsten is very much correct.

People should basically always feel like they can update their kernel
and simply not have to worry about it.

I refuse to introduce "you can only update the kernel if you also
update that other program" kind of limitations. If the kernel used to
work for you, the rule is that it continues to work for you.

There have been exceptions, but they are few and far between, and they
generally have some major and fundamental reasons for having happened,
that were basically entirely unavoidable, and people _tried_hard_ to
avoid them. Maybe we can't practically support the hardware any more
after it is decades old and nobody uses it with modern kernels any
more. Maybe there's a serious security issue with how we did things,
and people actually depended on that fundamentally broken model. Maybe
there was some fundamental other breakage that just _had_ to have a
flag day for very core and fundamental reasons.

And notice that this is very much about *breaking* people's environments.

Behavioral changes happen, and maybe we don't even support some
feature any more. There's a number of fields in /proc/<pid>/stat that
are printed out as zeroes, simply because they don't even *exist* in
the kernel any more, or because showing them was a mistake (typically
an information leak). But the numbers got replaced by zeroes, so that
the code that used to parse the fields still works. The user might not
see everything they used to see, and so behavior is clearly different,
but things still _work_, even if they might no longer show sensitive
(or no longer relevant) information.
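
As a rough illustration of why zeroing a field is safer than removing
it, here is a minimal sketch of a positional parser for a stat-style
line. The sample line, its values, and the helper name parse_stat are
made up for the example; only the field positions follow proc(5).

```python
# Hypothetical sample in the layout of /proc/<pid>/stat; all values are made up.
# Field 21 (itrealvalue in proc(5)) is printed as 0, field 22 is starttime.
SAMPLE = ("1234 (my prog) S 1 1234 1234 0 -1 4194304 100 0 0 0 "
          "7 3 0 0 20 0 1 0 5000 1000000 250")


def parse_stat(line):
    """Split a stat-style line into a list of string fields.

    The comm field is wrapped in parentheses and may contain spaces,
    so split around the last ')' rather than on whitespace alone.
    """
    head, _, tail = line.rpartition(")")
    pid, _, comm = head.partition(" (")
    return [pid, comm] + tail.split()


STARTTIME = 21  # zero-based index of field 22 (starttime)

# Retired field kept as a zero: later fields stay where parsers expect them.
zeroed = parse_stat(SAMPLE)

# Same data with the retired field removed outright: everything after it shifts.
dropped = zeroed[:20] + zeroed[21:]

print(zeroed[STARTTIME])   # the real starttime value
print(dropped[STARTTIME])  # now the next field (vsize), silently wrong
```

An old program that reads starttime by position keeps working against
the zeroed layout, while the layout with the field removed hands it
the wrong number without any error.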

But if something actually breaks, then the change must get fixed or
reverted. And it gets fixed in the *kernel*. Not by saying "well, fix
your user space then". It was a kernel change that exposed the
problem, it needs to be the kernel that corrects for it, because we
have an "upgrade in place" model. We don't have an "upgrade with new
user space".

And I seriously will refuse to take code from people who do not
understand and honor this very simple rule.

This rule is also not going to change.

And yes, I realize that the kernel is "special" in this respect. I'm
proud of it.

I have seen, and can point to, lots of projects that go "We need to
break that use case in order to make progress" or "you relied on
undocumented behavior, it sucks to be you" or "there's a better way to
do what you want to do, and you have to change to that new better
way", and I simply don't think that's acceptable outside of very early
alpha releases that have experimental users that know what they signed
up for. The kernel hasn't been in that situation for the last two
decades.

We do API breakage _inside_ the kernel all the time. We will fix
internal problems by saying "you now need to do XYZ", but then it's
about internal kernel APIs, and the people who do that then also
obviously have to fix up all the in-kernel users of that API. Nobody
can say "I now broke the API you used, and now _you_ need to fix it
up". Whoever broke something gets to fix it too.

And we simply do not break user space.

Linus