Re: [ANNOUNCE] linux-stable security tree
From: Sasha Levin
Date: Mon Apr 11 2016 - 18:49:08 EST
On 04/11/2016 05:17 PM, Willy Tarreau wrote:
> Hi Sasha,
>
> On Mon, Apr 11, 2016 at 04:38:17PM -0400, Sasha Levin wrote:
>>> How are you
>>> going to judge which driver fixes to take and which not to? Why not
>>> take them all if they fix bugs?
>>
>> Because some fixes introduce bugs of their own? Take a look at how many
>> commits in the stable tree have a "Fixes:" tag that points to a commit
>> that's also in the stable tree.
>
> I'm using stable trees myself in the load balancing products we ship at
> work. I've met a single bug during the whole 3.10 lifetime and it was
> caused by one of our out-of-tree patches that applied at the wrong place
> after an update. I'd generally say that -stable quality is very good,
> if not excellent. Several people review the patches before they get
> merged, several ones build and even boot them. It's not that random.
> Look, one patch was just dropped from 3.14.64 because it failed a build
> test in one environment. This one will never hit end users.
Bugs in the stable tree are created not only because of maintainer errors or
lack of reviews, but also because fixes themselves have bugs.
While maintaining 3.18, I noticed that I had missed one commit because
its "Fixes:" field had a commit hash that wasn't in the tree, only to find later
that the referenced commit had been pulled in by a previous stable release and
had a different hash in the local tree.
I've since automated that lookup, and that case gets hit quite often.
As this suggests, we don't only fix bugs in stable trees, we also introduce
a fair amount of new ones.
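
For illustration only, here is a minimal sketch of the kind of "Fixes:" lookup
described above. It is not the actual tooling. It assumes that backported
commits record their upstream hash in a line such as "commit <sha> upstream."
or "[ Upstream commit <sha> ]", and the function names (build_upstream_map,
resolve_fixes) are hypothetical. It works on (hash, message) pairs so that it
does not depend on any particular way of reading the git log.

```python
import re

# Assumed conventions for recording the upstream hash in a backport.
UPSTREAM_RE = re.compile(
    r"^[ \t]*(?:\[\s*Upstream commit ([0-9a-f]{12,40})\s*\]"
    r"|commit ([0-9a-f]{12,40}) upstream\.?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

# "Fixes: <abbreviated sha> ("subject")" tag in a commit message.
FIXES_RE = re.compile(r"^Fixes:\s*([0-9a-f]{7,40})\b", re.IGNORECASE | re.MULTILINE)


def build_upstream_map(local_commits):
    """Map upstream hash -> local hash for every backport in the local tree.

    local_commits is an iterable of (local_hash, commit_message) pairs.
    """
    mapping = {}
    for local_hash, message in local_commits:
        for m in UPSTREAM_RE.finditer(message):
            upstream = (m.group(1) or m.group(2)).lower()
            mapping[upstream] = local_hash
    return mapping


def resolve_fixes(message, upstream_map):
    """Return local hashes of the commits that `message` claims to fix.

    The hash in a Fixes: tag is usually abbreviated, so compare by prefix
    in both directions.
    """
    hits = []
    for m in FIXES_RE.finditer(message):
        abbrev = m.group(1).lower()
        for upstream, local_hash in upstream_map.items():
            if upstream.startswith(abbrev) or abbrev.startswith(upstream):
                hits.append(local_hash)
    return hits
```

A candidate fix whose "Fixes:" tag resolves to a non-empty list targets a
commit that is present in the local tree under a different hash, which is
exactly the case that a plain hash comparison misses.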
>> Look at the opposite side of this question: why would anyone take a commit
>> that fixes a bug he doesn't care about? Are the benefits really worth it
>> considering the risks?
>
> That's exactly what most people do. I don't update to each and every kernel.
> When I see xen, lvm, drm and audio changes I don't need them in my products.
> But when I'm seeing network fixes I study them and often decide that it's
> worth upgrading. Sometimes I pick a single fix from the queue because I can't
> wait for next release. Many of Greg's kernels more or less focus on certain
> topics, probably due to the way he deals with his mailbox and patch storms,
> so it's often easy to quickly decide if you're going to need to update or
> not.
I fully agree with you here: if you can hand pick the commits you want and
study every commit that goes in, I can't really argue with that.
But the target audience here is different. I'm thinking about projects where
the kernel isn't even close to being center stage, and it doesn't interest
anyone as long as things work. There is no chance in hell that they would
start picking commits unless you hold a gun to their head.
My thinking here is that it's easier to convince them to take a handful of
patches in rather than hundreds.
Also, I'm not trying to replace the stable tree. I'm just trying to present
a stopgap solution that projects can use in between complete stable kernel
updates (or changes in kernel version). This is one of the reasons I'm not
adding any new commits in here that aren't in the stable tree already.
>> [snip]
>>
>>>>> Define "important". Now go and look at the tty bug we fixed that people
>>>>> only realized was "important" 1 1/2 years later and explain if you
>>>>> would, or would not have, taken that patch in this tree.
>>>>
>>>> Probably not, but I would have taken it after it received a CVE number.
>>>>
>>>> Same applies to quite a few commits that end up in stable - no one thinks
>>>> they're stable material at first until someone points out it's crashing
>>>> his production boxes for the past few months.
>>>
>>> Yes, but those are rare, what you are doing here is suddenly having to
>>> judge if a bug is a "security" issue or not. You are now in the
>>> position of trying to determine "can this be exploited or not", for
>>> every commit, and that's a very hard call, as is seen by this specific
>>> issue.
>
> Especially for networking stuff or things related to local resource usage
> where some people consider it represents a local DoS risk and others
> consider that it's just irrelevant to their servers since they have no
> local users.
Right. I try to pick those, but agree that the whole "security commit" term
is (very) vaguely defined.
>> The stable stuff isn't rare as you might think, even more: the amount of
>> actual CVE fixes that are not in the stable tree might surprise you.
>
> I would personally not be surprised since Ben used to feed me with a lot
> of fixes I had never seen previously. What is unclear to me is if your
> tree will contain only a selection of patches that are already in the
> respective branches, or a backport of security fixes that we can pick
> from to feed our stable branches and limit the risk of missing them.
> *This* actually could be useful to everyone, starting from our users.
The plan is to have only a subset of the stable tree; you won't see any
new commits here.
I agree it would be useful to have a better way of identifying security
commits and making sure that they all get applied, but unfortunately I'm
in the same boat as the rest of the maintainers and don't have a surefire
way of getting it done.
Maybe we can spin off an lts-security@ mailing list just for patches, and
make sure that every patch we need to apply gets sent there by someone
on that list?
> (...)
>> This is actually what happens now; projects get to the point they don't
>> want to update their whole kernel tree anymore so that just freezes because
>> they don't want to re-validate the whole thing over and over, but they
>> still cherry pick upstream and out-of-tree commits that they care about.
>>
>> If they added a handful of security commits to cherry pick and carefully
>> review, their security would be much better than what happens now.
>
> Actually I do think that for end users it's a regression. People will
> start reusing outdated kernels which only contain the most critical fixes
> known, but will still suffer from memory leaks, deadlocks, kernel panics,
> data corruption etc. Every single bug that doesn't have a CVE attached to
> it in fact, which means 99% of the bugs that bring a system down in
> production. It makes me think about people who only pick security fixes
> from openssl and not the regular batch of missing null checks, and who
> complain all the time that their systems are unstable while they simply
> don't apply fixes.
Right, but once the system is up and running it's hard to convince someone
to take in ~100 commits that may or may not fix or introduce a bug.
If they are capable of cherry picking exactly the commits they want, then
it's perfect. Otherwise, the kernel doesn't go anywhere.
> You see, when I started with the "hotfix" tree 11 years ago for kernel
> 2.4, I intended to only pick the most critical fixes, they would fit
> in just a README and were counted on one hand. One year later there
> were 150 just because everything becomes critical for *some* workloads.
>
> I *do* think that having a central reference for fixes that come with
> a reproducer (hence many security fixes) can be useful as it would
> offer an opportunity for better testing backports when they become
> tricky : it often takes much more time to try to set up a test with
> a reproducer than it takes to backport and adjust the fix (not always
> true). But when it comes to security issues often the reporter cares
> about the quality of the backport and helps there.
Agreed, but I'm not sure how we can pull it off.
Thanks,
Sasha