Re: [PATCH AUTOSEL for 4.14 015/161] printk: Add console owner and waiter logic to load balance console writes
From: Sasha Levin
Date: Tue Apr 17 2018 - 10:55:44 EST
On Tue, Apr 17, 2018 at 04:36:31PM +0200, Michal Hocko wrote:
>On Tue 17-04-18 14:04:36, Sasha Levin wrote:
>> On Tue, Apr 17, 2018 at 01:07:17PM +0200, Michal Hocko wrote:
>> >On Tue 17-04-18 12:39:36, Greg KH wrote:
>> >> On Mon, Apr 16, 2018 at 11:28:44PM +0200, Jiri Kosina wrote:
>> >> > On Mon, 16 Apr 2018, Sasha Levin wrote:
>> >> >
>> >> > > I agree that as an enterprise distro taking everything from -stable
>> >> > > isn't the best idea. Ideally you'd want to be close to the first
>> >> > > extreme you've mentioned and only take commits if customers are asking
>> >> > > you to do so.
>> >> > >
>> >> > > I think that the rule we're trying to agree upon is the "It must fix
>> >> > > a real bug that bothers people".
>> >> > >
>> >> > > I think that we can agree that it's impossible to expect every single
>> >> > > Linux user to go on LKML and complain about a bug he encountered, so the
>> >> > > rule quickly becomes "It must fix a real bug that can bother people".
>> >> >
>> >> > So is there a reason why stable couldn't become some hybrid-form union of
>> >> >
>> >> > - really critical issues (data corruption, boot issues, severe security
>> >> > issues) taken from bleeding edge upstream
>> >> > - [reviewed] cherry-picks of functional fixes from major distro kernels
>> >> > (based on that very -stable release), as that's apparently what people
>> >> > are hitting in the real world with that particular kernel
>> >>
>> >> It already is that :)
>> >>
>> >> The problem Sasha is trying to solve here is that for many subsystems,
>> >> maintainers do not mark patches for stable at all.
>> >
>> >The way he is trying to do that is just wrong. Generate a pressure on
>> >those subsystems by referring to bug reports and unhappy users and I am
>> >pretty sure they will try harder... You cannot solve the problem by
>> >bypassing them without having deep understanding of the specific
>> >subsystem. Once you have it, just make sure you are part of the review
>> >process and make sure to mark patches before they are merged.
>>
>> I think we just don't agree on how we should "pressure".
>>
>> Look at the discussion I had with the XFS folks who just don't want to
>> deal with this -stable thing because they have too much work upstream.
>
>So do you really think that you or any script decide without them? My
>recollection from that discussion was quite opposite. Dave was quite
>clear that most of fixes are quite hard to evaluate and most of them
>are simply not worth risking the backport.

No, *some* fixes are hard, not most.

I'm not trying to decide for them, I'm trying to help them decide.

>> There wasn't a single patch in -stable coming from XFS for the past 6+
>> months. I'm aware of more than one way to corrupt an XFS volume for any
>> distro that uses a kernel older than 4.15.
>
>Then try to poke/bribe somebody to have it fixed. But applying
>_something_ is just not a solution. You should also evaluate whether "I
>am able to corrupt" is something that "people see in the wild". Sure
>there are zillions of bugs hidden in the large code base like the
>kernel. People just do not tend to hit them and this will likely not
>change very much in the future.

We can't ignore bugs just because people don't notice.

Data corruption bugs in particular are a pain to report as well: the
corruption might have happened months before, and there's not much to
report at that point.

There are quite a few bug classes like that.

>> Sure, please buy them a beer at LSF/MM (I'll pay) and ask them to be
>> better about it, but I don't see this changing.
>
>I can surely have one or two and discuss this. I am pretty sure xfs guys
>are not going to pretend older kernels do not exist.
>
>> The solution to this, in my opinion, is to automate the whole selection
>> and review process. We do selection using AI, and we run every possible
>> test that's relevant to that subsystem.
>>
>> At which point, the amount of work a human needs to do to review a patch
>> shrinks into something far more manageable for some maintainers.
>
>I really disagree. I am pretty sure maintainers are very well aware of
>how the patch is important. Some do not care about stable and I agree you
>should poke those. But some have really good reasons to not throw many
>patches that direction because they do not feel the patch is important
>enough.
>
>Remember this is not about numbers. The more is not always better.

So what is "important"? Look at the XFS issues: they were important
enough to get fixed upstream, and to have an appropriate test added to
xfstests.

Why didn't they go back to -stable?

>> >> So real bugfixes
>> >> that do hit people are not getting to those kernels, which force the
>> >> distros to do extra work to triage a bug, dig through upstream kernels,
>> >> find and apply the patch.
>> >
>> >I would say that this is the primary role of the distro. To hide the
>> >jungle of the upstream work and provide the additional of bug filtering
>> >and forwarding them the right direction.
>>
>> More often than triaging, you'll just be asked to upgrade to the latest
>> version. What sort of user experience does that provide?
>>
>> [snip]
>>
>> >> So nothing "new" is happening here, EXCEPT we are actually starting to
>> >> get a better kernel-wide coverage for stable fixes, which we have not
>> >> had in the past. That's a good thing! The number of patches applied to
>> >> stable is still a very very very tiny % compared to mainline, so nothing
>> >> new is happening here.
>> >
>> >yes I do agree, the stable process is not very much different from the
>> >past and I would tend both processes broken because they explicitly try
>> >to avoid maintainers which is just wrong.
>>
>> Avoid maintainers?! We send so much "spam" trying to get maintainers
>> more involved in the process. How is that avoiding them?
>
>Just read what you wrote again. I am pretty sure AUTOSEL is on filter
>list on many people. We have a good volume of email traffic already and
>seeing more automatic one just doesn't help. At all!
>
>> If you're a maintainer who has specific requirements for the -stable
>> flow, or you have any automated testing you'd like to be run on these
>> commits, or you want these mails to come in a different format, or
>> pretty much anything else at all just shoot me a mail!
>>
>> It's been almost impossible to get maintainers involved in this process.
>
>The whole stable history was that about not bothering maintainers and
>here is the result.
>
>> We don't sneak anything past maintainers, there are multiple mails over
>> multiple weeks for each commit that would go in. You don't have to
>> review it right away either, just reply with "please don't merge until
>> I'm done reviewing" and it'll get removed from the queue.
>
>I am not talking about sneaking or pushing behind the backs. I am just
>saying that you cannot do this without direct involvement of
>maintainers. If they do not respond to bug reports shout at them and I
>am pretty sure that those subsystems will get a bigger pressure to find
>their way to select _important_ fixes to users who are not running the
>bleeding edge because those users _matter_ as well (maybe even more
>because they are a much larger group).
>
>> >> Oh, and if you do want to complain about huge new features being
>> >> backported, look at the mess that Spectre and Meltdown has caused in the
>> >> stable trees. I don't see anyone complaining about those massive
>> >> changes :)
>> >
>> >Are you serious? Are you going to compare the biggest PITA that the
>> >community had to undergo because of HW issues with random pattern
>> >matching in changelog/diffs? Come on!
>>
>> HW Issues are irrelevant here. You had a bug that allowed arbitrary
>> kernel memory access. I can easily list quite a few commits, that are
>> not tagged for stable, that fix exactly the same thing.
>
>Those are important fixes and if you are aware of them then you should
>be involving the respective maintainer. I haven't heard about _any_
>maintainer who would refuse to help.

Let's do it this way: let's assume my AUTOSEL project is bad and I'll
get rid of it tomorrow.

How do I get the XFS folks to send their stuff to -stable? (we have
quite a few customers who use XFS)

How do I get the KVM folks to be more consistent about tagging patches
for -stable? (we support nested KVM!)

How do I get people who are not aware of how the -stable project works
to tag their commits properly? (there's quite a long tail of authors
sending 1 important bugfix and disappearing forever)

We can agree that just asking them nicely doesn't work: Greg has been
poking maintainers for years, the -stable project got a bunch of
publicity, and the instructions for including a patch in -stable are
pretty straightforward.

You're saying that AUTOSEL doesn't work, so let's ignore that too.

How should we proceed?