Re: [ANNOUNCE] linux-stable security tree

From: Willy Tarreau
Date: Tue Apr 12 2016 - 02:22:51 EST


On Mon, Apr 11, 2016 at 06:48:32PM -0400, Sasha Levin wrote:
> As this suggests, we don't only fix bugs in stable trees, we also introduce
> a fair amount of new ones.

Sure, and the older the kernel, the higher the number of bugs reintroduced
due to a faulty backport (often relying on a missing patch). I tell every
possible user that the quality improves over time until a point where more
bugs get added than fixed, which is where it's time to stop. I almost reached
that point in 2.4 and I started to replace fixes with documentation for existing
bugs that would not be fixed :-/

> >> Look at the opposite side of this question: why would anyone take a commit
> >> that fixes a bug he doesn't care about? Are the benefits really worth it
> >> considering the risks?
> >
> > That's exactly what most people do. I don't update to each and every kernel.
> > When I see xen, lvm, drm and audio changes I don't need them in my products.
> > But when I'm seeing network fixes I study them and often decide that it's
> > worth upgrading. Sometimes I pick a single fix from the queue because I can't
> > wait for next release. Many of Greg's kernels more or less focus on certain
> > topics, probably due to the way he deals with his mailbox and patch storms,
> > so it's often easy to quickly decide if you're going to need to update or
> > not.
>
> I fully agree with you here, if you can hand pick the commits you want and
> study every commit that goes in I can't really argue with that.

I really don't study them, I look at the list and say "OK nothing interesting
here, let's skip it". I only pick a fix when I have recently been notified of
a bug that I want fixed in our products.

> But the target audience here is different. I'm thinking about projects where
> the kernel isn't even close to being center stage, and it doesn't interest
> anyone as long as things work. There is no chance in hell that they would
> start picking commits unless you hold a gun to their hand.

Definitely, I agree. But the problem I'm having is that often they don't
care at all about bugs, they *believe* that security bugs are the only
ones to care about. You can force them to upgrade by reminding them that
the kernel they're running is full of CVEs that need to be fixed. Once they
have the option to fix CVEs only and ignore real bugs, you won't be able
to force them to upgrade to get their bugs fixed.

> My thinking here is that it's easier to convince them to take a handful of
> patches in rather than hundreds.

I tend to think that the term "security" in the tree's name is misleading
the users into thinking that what they find there is enough for them. Most
incompetent admins think they need only "security" fixes. Maybe you should
use the term "minimal" or something like this to indicate that this is the
smallest set of fixes that should absolutely be running on any system.
Because quite frankly a filesystem corruption has much more impact on an
NFS server than a local info leak (which is often tagged as security).

> Right. I try to pick those, but agree that the whole "security commit" term
> is (very) vaguely defined.

Absolutely. For me, a bug becomes a security bug if it impacts the system's
stability, integrity or confidentiality in a way that can be triggered by a
non-authorized user more often than what would otherwise happen. Thus it
solely depends on the surrounding environment. For example in our products
we don't use DNS at all but regardless we fixed the recent glibc getaddrinfo
bug because it took less energy to apply the fix than to explain to customers
why they weren't at risk.

> The plan is to have only a subset of the stable tree, you won't see any
> new commits here.

OK.

> I agree it would be useful to have a better way of identifying security
> commits and making sure that they all get applied, but unfortunately I'm
> in the same boat as the rest of the maintainers and don't have a surefire
> way of getting it done.
>
> Maybe we can spin off a lts-security@ mailing list just for patches, and
> make sure that every patch we need to apply gets sent there by someone
> on that list?

I'm not sure it would be that useful. I guess that if we had a central
list somewhere of the upstream commits that should be tagged as security
fixes, in an easily searchable form (i.e. by scripts), it would be more useful.
We could even amend the commit messages to mention the CVE id once it's
assigned, and sometimes "relies on this preliminary patch". That's
tough work though; I don't know who would be interested in working on
this.
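
To make the "searchable by scripts" idea a bit more concrete, here is a rough
Python sketch. Everything in it is an assumption made for illustration: no
such list exists, the one-line-per-commit format (commit id, optional CVE id,
optional "requires=" prerequisites) is invented, and the commit and CVE ids
are placeholders. It only shows how a script could tell which listed fixes,
and which patches they rely on, are still missing from a tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Entry:
    commit: str                      # upstream commit id
    cve: Optional[str] = None        # CVE id, once one has been assigned
    requires: List[str] = field(default_factory=list)  # preliminary patches


def parse_list(text: str) -> List[Entry]:
    """Parse the hypothetical list format, one commit per line:

        <commit> [CVE-id] [requires=<commit>,<commit>...]

    Anything after '#' is a comment; blank lines are ignored.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        entry = Entry(commit=parts[0])
        for tok in parts[1:]:
            if tok.startswith("CVE-"):
                entry.cve = tok
            elif tok.startswith("requires="):
                deps = tok[len("requires="):].split(",")
                entry.requires = [d for d in deps if d]
        entries.append(entry)
    return entries


def missing_commits(entries: List[Entry], applied: Set[str]) -> List[str]:
    """Return listed commits (prerequisites first) not yet in 'applied'."""
    missing: List[str] = []
    for entry in entries:
        for commit in entry.requires + [entry.commit]:
            if commit not in applied and commit not in missing:
                missing.append(commit)
    return missing


# Made-up sample data; the ids below are placeholders, not real commits/CVEs.
SAMPLE = """
# commit  cve            prerequisites
aaa111    CVE-0000-0001  requires=bbb222
ccc333
ddd444    CVE-0000-0002
"""

if __name__ == "__main__":
    todo = missing_commits(parse_list(SAMPLE), applied={"aaa111", "ccc333"})
    print("\n".join(todo))
```

In practice the "applied" set would have to come from the backport's own
record of which upstream commit each patch corresponds to, which is itself
part of the tough work mentioned above.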

> > (...)
> >> This is actually what happens now; projects get to the point they don't
> >> want to update their whole kernel tree anymore so that just freezes because
> >> they don't want to re-validate the whole thing over and over, but they
> >> still cherry pick upstream and out-of-tree commits that they care about.
> >>
> >> If they added a handful of security commits to cherry pick and carefully
> >> review their security will be much better than what happens now.
> >
> > Actually I do think that for end users it's a regression. People will
> > start reusing outdated kernels which only contain the most critical fixes
> > known, but will still suffer from memory leaks, deadlocks, kernel panics,
> > data corruption etc. Every single bug that doesn't have a CVE attached to
> > it in fact, which means 99% of the bugs that bring a system down in
> > production. It makes me think about people who only pick security fixes
> > from openssl and not the regular batch of missing null checks, and who
> > complain all the time that their systems are unstable while they simply
> > don't apply fixes.
>
> Right, but once the system is up and running it's hard to convince someone
> to take in ~100 commits that may or may not fix or introduce a bug.

The problem works both ways:
- if a subsystem is not relevant to some users, they won't be interested
in a fix for this subsystem, just like they won't be affected by a risk
of regression;

- if this subsystem is relevant to them, sure they risk a regression but
they also need to apply fixes otherwise they *know* they're running
bogus code.

I think we may have more of an issue educating our users than an issue
with the code we distribute.

Willy