Re: bug-introducing patches (or: -rc cycles suck)
From: Sasha Levin
Date: Tue May 01 2018 - 12:19:44 EST

On Mon, Apr 30, 2018 at 09:09:18PM +0200, Willy Tarreau wrote:
>Hi Sasha,
>
>On Mon, Apr 30, 2018 at 05:58:30PM +0000, Sasha Levin wrote:
>> - For some reason, the odds of a -rc commit being targeted for -stable are
>> over 20%, while for merge window commits it's about 3%. I can't quite
>> explain why that happens, but this would suggest that -rc commits end up
>> hurting -stable pretty badly.
>
>Often, the merge window collects work that has been done during the previous
>cycle and which is prepared to target this merge window. Fixes that happen
>during this period very likely tend to either be remerged with the patches
>before they are submitted if they concern the code to be submitted, or are
>delayed to after the work gets merged. As a result few of the pre-rc1 patches
>get backported while the next ones mostly contain fixes. By the way, you
>probably also noticed it when backporting patches to your stable releases,
>the mainline commit almost never comes from a merge window.

I'm not sure I understand/agree with this explanation. You're saying
that commits that fix issues in newly introduced features got folded
into the feature before it was sent during the merge window, so there
was no need for them to be tagged for stable?

This would also be true for -rc cycle patches if they fix a commit that
was introduced in that merge window: patches that fix a feature that got
in during that same merge window don't need to be tagged for stable
either, since the feature didn't exist in a previous release.

The way I see it, a -stable commit fixes a bug that was introduced in a
feature that exists in a kernel that was already released. At that
point, the fix can come in at any time, whether it was created during
the merge window or during an -rc cycle.

It also appears that pretty much the same ratio of commits is tagged
for -stable across all -rc cycles, so there are no spikes at any point
during the cycle. That seems to suggest that there is no particular
relationship between when a -stable commit is created and the stage of
the current kernel's release cycle.

>> 2. Maintainers need to stop writing patches, committing them, and pushing them
>> in without reviews. In -rc cycles there is quite a large number of commits
>> that were written by maintainers, committed, and merged upstream on the
>> same day. These patches are very likely to introduce a new bug.
>
>Developers are humans before anything else. We probably all address most
>bug reports the same way: "ah, of course, stupid me, now that's fixed".
>Keep in mind that for the developer, the pressure has lowered now that
>the code got merged, and that mentally the fix is "on top" of the initial
>work and no longer part of it. It often means a narrower mental image of
>how the fix fits in the whole code.
>
>I think that you'll also notice that fixes that address bugs introduced
>during the merge window of the same version will more often introduce
>bugs than the ones which address 6-months old bugs which require some
>deeper thinking. In short it indicates that we tend to believe we are
>better than we really are, especially very late at night.

I very much agree. I also think that "upper-level" maintainers, and
Linus in particular, have to stop this behavior. Yes, folks who do these
patches are often very familiar with the subsystem, but this doesn't
mean that they don't make mistakes.

It's as if during -rc cycles all rules are void and bug fixes are now
to be collected and merged in as fast as humanly possible without any
regard to how well these fixes were tested.

With merge window stuff, Linus will make lots of noise if commits didn't
spend any time in -next (see https://lkml.org/lkml/2017/2/23/611, for
example). But it seems that -rc commits don't have that requirement.

>> I don't really have a proposal beyond "tighten up -rc cycles", but I think
>> it's a discussion worth having. We have enough data to show what parts of
>> kernel development work, and what parts are just hurting us.
>
>I'm inclined to believe that making individuals aware of their own
>mistakes can help. I personally like to try to understand how I managed
>to introduce a bug, it's always useful. Very often it's around "I was
>pretty sure it didn't require testing, the change was so obvious". We
>all know this feeling when you write 100 lines in a new file, you
>compile, and it builds without any warning and apparently works, and
>suddenly you think "uh oh, what did I do wrong?" and you have no idea
>where to start to look for possible mistakes.
>
>Some statistics on mistake classifications, and maybe on some affected
>subsystems (if that doesn't blame anyone), could probably be useful.