Re: bug-introducing patches (or: -rc cycles suck)

From: Sasha Levin
Date: Tue May 01 2018 - 13:21:38 EST

Next message: Florian Fainelli: "Re: [RFC net-next 0/5] Support for PHY test modes"
Previous message: Michael S. Tsirkin: "Re: [PATCH] vhost: make msg padding explicit"
In reply to: Willy Tarreau: "Re: bug-introducing patches (or: -rc cycles suck)"
Next in thread: Sasha Levin: "Re: bug-introducing patches (or: -rc cycles suck)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, May 01, 2018 at 06:50:51PM +0200, Willy Tarreau wrote:
>On Tue, May 01, 2018 at 04:19:35PM +0000, Sasha Levin wrote:
>> On Mon, Apr 30, 2018 at 09:09:18PM +0200, Willy Tarreau wrote:
>> >Hi Sasha,
>> >
>> >On Mon, Apr 30, 2018 at 05:58:30PM +0000, Sasha Levin wrote:
>> >> - For some reason, the odds of a -rc commit to be targetted for -stable is
>> >> over 20%, while for merge window commits it's about 3%. I can't quite
>> >> explain why that happens, but this would suggest that -rc commits end up
>> >> hurting -stable pretty badly.
>> >
>> >Often, merge window collects work that has been done during the previous
>> >cycle and which is prepared to target this merge window. Fixes that happen
>> >during this period very likely tend to either be remerged with the patches
>> >before they are submitted if they concern the code to be submitted, or are
>> >delayed to after the work gets merged. As a result few of the pre-rc1 patches
>> >get backported while the next ones mostly contain fixes. By the way, you
>> >probably also noticed it when backporting patches to your stable releases,
>> >the mainline commit almost never comes from a merge window.
>>
>> I'm not sure I understand/agree with this explanation. You're saying
>> that commits that fix issues in newly introduced features got folded in
>> the feature before it was sent during the merge window, so then there
>> was no need for them to be tagged for stable?
>
>No, what I mean is that often a developer is either in development mode
>or in bug-fixing mode but it's often quite hard to quickly switch between
>the two. So when you're finishing what you were doing to meet the merge
>window deadline and you receive bug fixes, it's natural to hold on a few
>fixes because it's hard to switch to the review mode. However, if some
>fixes concern the code you're about to submit, it's not bug fixing but
>fixes for your development in progress and that doesn't require as much
>effort, so these updates can often be remerged before being submitted.

I see. But then, wouldn't there be a spike of -stable patches for -rc1
and -rc2?

[snip]
>> It also appears that pretty much the same ratio of commits are tagged
>> for -stable accross all -rc cycles, so there are no spikes at any point
>> during the cycle, which seems to suggest that there is no particular
>> relationship between when a -stable commit is created to the stage in a
>> release cycle of the current kernel.
>
>Not much surprising to me. After all, -rc are "let's observe and fix",
>and it's expected that bugs are randomly met and fixed during that
>period.

So for bugs found and fixed during -rc6+, those fixes should be merged
around the next merge window (time for reviews + time to bake in
stable), but this doesn't seem to happen. Maybe the lack of -stable
commits during merge windows is a symptom of the problem?

>> >I think that you'll also notice that fixes that address bugs introduced
>> >during the merge window of the same version will more often introduce
>> >bugs than the ones which address 6-months old bugs which require some
>> >deeper thinking. In short it indicates that we tend to believe we are
>> >better than we really are, especially very late at night.
>>
>> I very much agree. I also think that "upper-level" maintainers, and
>> Linus in particular have to stop this behavior.
>
>Well it's easier said than done. You don't really choose when you can
>become creative or efficient. For some people it's when everyone else
>is asleep, for others it's when they can have 8 uninterrupted hours
>in front of them to work on a complex bug. I think it's more efficient
>to let people be aware of their limits than to try to overcome them.
>The typical thought "I'm too stupid now, let's go to bed" followed the
>next morning with a review starting to think "what did I break last
>night" is already quite profitable provided people are humble enough
>to think like this.

I'm not saying that patches should be rejected, they should just be told
to spend more time in -next gathering reviews and tests.

Linus would basically say "resend this once the patch has been in -next
for 21 days". That's all.

Heck, we could automate this and check pull requests send to Linus and
warn about "new" patches.

>> Yes, folks who do these
>> patches are often very familiar with the subsystem, but this doesn't
>> mean that they don't make mistakes.
>
>But we all do mistakes all the time. And quite frankly I find that the
>recent kernels quality in the early stages after the release is much
>better than what it used to be. Kernels build fine, boot fine on most
>hardware, and after a few stable versions you can really start to
>forget to update them because you don't meet the crashes anymore. Just
>a simple example (please don't reproduce, I'm not proud of it), when
>I replaced my PC, it came with 4.4.6. I thought "I'll have to upgrade
>next week". But I had so many trouble with its crappy bogus BIOS that
>I was afraid to reboot it. Then I had hundreds of xterms spread over
>multiple displays and it was never the best moment to reboot. Finally
>it happened 550 days later. Yes, the 6th maintenance release of 4.4
>lasted 550 days on a developers machine doing all sort of stuff without
>even a scary message in dmesg. Of course in terms of security it's
>terrible. But we didn't see this level of stability in 2.6.x nor in
>the early 3.x versions.
>
>> It's as if during -rc cycles all rules are void and bug fixes are now
>> no be collected and merged in as fast as humanly possible without any
>> regard to how well these fixes were tested.
>
>These stages are supposed to serve to collect fixes, and fixes are
>supposed to be tested. Often it's worse to let a fix rot somewhere
>than to get it. At the very least by merging it you expose it more
>quickly and you have more chances to know if you missed anything.

Linus's tree isn't a testing tree anymore. In reality, it's just a
cadence/sync point for kernel devs.

Integration and testing happen in -next. The various bots we have run on
-next, most folks are doing their custom testing on -next (we do...).

Linus's tree is was "demoted" as a result of the significant improvement
in testing automation and the capacity to be able to test a fast
changing tree such as -next.

So keeping patches out of Linus's tree isn't really equals to "letting
them rot", it just means "let them get more testing".

>I remember in the past some people arguing that we shouldn't backport
>fixes that haven't experienced a release yet, but that would make the
>situation even worse, with no stable fix for the 3 months following a
>release. The overall amount of reverts in stable kernels remains very
>low, which indicates to me that the overall quality is quite good,
>eventhough the process causes gray hair to the involved people (well
>for those still having hair).

Right, the statistics didn't support the policy change. The -stable
kernel is better at not introducing bugs because (IMO) the commits get
even more reviews.

Countless times my requests for reviews of -stable commits have
uncovered a bug in mainline.

>That's overall why I think that your work can be useful to raise
>awareness of what behaviours decrease the level of quality so that
>everyone can try to improve a bit, but I don't think there is that
>much to squeeze without hitting the wall of what a human brain can
>reasonably deal with. And extra process is a mental pressure just
>like dealing with bugs, so comes a point where process competes
>with quality.

I'm trying to come up with a way where, similar to AUTOSEL, humans won't
need to do much more work.

I'm also not advocating for *more* process, I'm advocating for a
*different* process.

Linus, as he already stated himself, is looking at how long a patch
spent in -next before he pulls it. I'm suggesting to improve and build
on that. Look at how long a patch was in -next, how many people reviewed
it, how much mailing list discussion it triggered, etc.

What I want to end up with is a tool to make maintainer's life easier by
highlighting "dangerous" patches that require more careful review. It's
much more time efficient to keep bugs out than deal with them later.

Next message: Florian Fainelli: "Re: [RFC net-next 0/5] Support for PHY test modes"
Previous message: Michael S. Tsirkin: "Re: [PATCH] vhost: make msg padding explicit"
In reply to: Willy Tarreau: "Re: bug-introducing patches (or: -rc cycles suck)"
Next in thread: Sasha Levin: "Re: bug-introducing patches (or: -rc cycles suck)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]