Re: [PATCH 0/3] x86: make pat and mtrr independent from each other

From: Chuck Zmudzinski
Date: Mon Aug 15 2022 - 12:56:51 EST


Hi Thorsten,

I am forwarding this to you to help you cut through the noise. Unfortunately
the discussion of fixes for this regression has degenerated into ad hominem
attacks. I admit that I started complaining about the response of the
maintainers to this regression and now they are attacking me. I do apologize,
but I do not want to over-apologize. I do not apologize for trying to get
the fix for this regression rolling again. After all, it has been over three months
since the regression was first reported. I don't think I should be accused of
doing anything wrong just for asking the maintainers who are responsible for
and working on a fix for this regression for some transparency, honesty, and a
realistic estimate of how long it will take before a fix is committed. I do want
you to provide some feedback here on the public mailing lists.

I present the following message which cuts out the noise and I think describes
fairly completely the problems that are preventing a fix for this regression from
getting merged into the mainline kernel. Can you weigh in with your opinion
about what should be done now?

Best regards,

Chuck

On 8/14/2022 11:23 PM, Chuck Zmudzinski wrote:
> On 8/14/22 4:08 AM, Juergen Gross wrote:
> > > On 8/13/2022 12:56 PM, Chuck Zmudzinski wrote:
> > >
> > > This is a fairly long message but I think what I need to say
> > > here is important for the future success of Linux and open
> > > source software, so here goes....
> > >
> > > Update: I accept Boris Petkov's response to me yesterday as reasonable
> > > and acceptable if within two weeks he at least explains on the public
> > > mailing lists how he and Juergen have privately agreed to fix this regression
> > > "soon" if he does not actually fix the regression by then with a commit,
> > > patch set, or merge. The two-week time frame is from here:
> > >
> > > https://www.kernel.org/doc/html/latest/process/handling-regressions.html
> > >
> > > where developers and maintainers are exhorted as follows: "Try to fix
> > > regressions quickly once the culprit has been identified; fixes for most
> > > regressions should be merged within two weeks, but some need to be
> > > resolved within two or three days."
> >
> > And some more citations from the same document:
> >
> > "Prioritize work on handling regression reports and fixing regression over all
> > other Linux kernel work, unless the latter concerns acute security issues or
> > bugs causing data loss or damage."
> >
> > First thing to note here: "over all Linux kernel work". I'm not only working
> > on the kernel, but I have other responsibilities e.g. in the Xen community,
> > where I was sending patches for fixing a regression and where I'm quite busy
> > doing security related work. Apart from that I'm of course responsible to
> > handle SUSE customers' bug reports at a rather high priority. So please stop
> > accusing me of ignoring the responses to these patches. This is just not really
> > motivating me to continue interacting with you.
>
> You are busy, and that is always true for someone with your responsibilities.
> That is an acceptable reason to delay your responses for a time.
>
> >
> > "Always consider reverting the culprit commits and reapplying them later
> > together with necessary fixes, as this might be the least dangerous and quickest
> > way to fix a regression."
> >
> > I didn't introduce the regression, nor was it introduced in my area of
> > maintainership. It just happened to hit Xen. So I stepped up after Jan's patches
> > were not deemed to be the way to go, and I wrote the patches in spite of me
> > having other urgent work to do. If you feel so strongly about the fix
> > of the regression, why don't you ask for the patch introducing it to be reverted
> > instead?
>
> I have asked for this on more than one occasion, but I was either
> ignored or shot down every time. The fact is, among the persons
> who have the power to actually commit a fix, only you and Boris
> are currently indicating any willingness to actually fix the regression.
> I will say the greater responsibility for this falls on Boris because
> he is an x86 maintainer, and you have every right to walk away
> and say "I will not work on a fix," and I would not blame you or accuse
> you of doing anything wrong if you did that. You are under no obligation
> to fix this. Boris is the one who must fix it, or the Intel developers,
> by reverting the commit that was originally identified as the bad
> commit.
>
> If it is any consolation to you, Juergen, I think the greatest problem
> is the silence of the drm/i915 maintainers, and Thorsten also expressed
> some dissatisfaction because of that, but since there is also some
> consensus that the fix should be done in x86 or x86/pat instead of
> in drm/i915, another problem is the lack of initiative by the x86
> developers to fix it. If they do not know how to fix it and need to
> rely on someone with Xen expertise, they should be giving you
> more assistance and feedback than they currently are. So far, only
> Boris shows any interest, and now my only critique of your behavior
> is that in your message, you chose to engage in an ad hominem attack
> against me instead of taking the same amount of time to at least
> briefly answer the questions Boris raised about your patch set over
> three weeks ago. Your decision to attack me instead of working on
> the fix was, IMHO, neither helpful nor constructive.
>
> > Accusing me and Boris is not acceptable at all!
>
> OK, I understand, now we are even. I have said it is unacceptable to
> not give greater priority to the regression fix or at least keep interested
> persons informed if there is a reason to continue to delay a fix, which
> ordinarily should only take two weeks, but now we are at more than
> three months. Now, you are saying it is unacceptable for me to accuse
> you and Boris. OK, so we are even. We each think the other is acting
> in an unacceptable way. I still think it is unacceptable to not work on
> the fix and instead engage in ad hominem attacks. Maybe I am wrong.
> Maybe maintainers are supposed to attack persons who are not
> maintainers when such outsiders try to help and encourage better
> cooperation and end the hostile silence by the maintainers who are
> responsible to fix this. But that does not make sense to me. It makes
> sense to hold accountable those persons who are responsible for fixing
> this (and you, Juergen, are not the one that needs to be held accountable).
> AFAICT, that is not being done and instead I am being attacked for trying
> to get work towards a fix rolling again.
>
> >
> > > I also think there is a private agreement between Juergen and Boris to
> > > fix this regression because AFAICT there is no evidence in the public
> > > mailing lists that such an agreement has been reached, yet Boris yesterday
> > > told me on the public mailing lists in this thread to be "patient" and that
> > > "we will fix this soon." Unless I am missing something, and I hope I am,
> > > the only way that a fix could be coming "soon" would be to presume
> > > that Juergen and Boris have agreed to a fix for the regression in private.
> > >
> > > However, AFAICT, keeping their solution private would be a violation of
> > > netiquette as described here:
> > >
> > > https://people.kernel.org/tglx/notes-about-netiquette
> > >
> > > where a whole section is devoted to the importance of keeping the
> > > discussion of changes to the kernel in public, with private discussions
> > > being a violation of the netiquette that governs the discussions that
> > > take place between persons interested in the Linux kernel project and
> > > other open source projects.
> >
> > Another uncalled for attack.
>
> I am just asking for some transparency and an indication that
> a fix is really and truly in sight. It would only take you a few
> minutes to fulfill what I am asking you to do now. The fact is,
> Boris commented on your patches over three weeks ago and
> asked you if you accepted the approach he outlined and you
> have remained silent. That does not indicate you and Boris
> are close to coming to a fix even though Boris stated that a fix
> is coming soon. Based on what has been said on the mailing
> lists, I just don't see the fix coming soon. That's all I can say
> about it now.
>
> >
> > After sending the patches I just told Boris via IRC that I wouldn't react
> > to any responses soon, as I was about to start my vacation.
>
> That is certainly a valid reason to delay work on this - you were on
> vacation. I hope you enjoyed yourself and had a good time. But I
> had no way of knowing this because I was not part of the IRC
> communication, so I cannot be blamed for not knowing this.
>
> > I will continue with the patches as soon as I find time to do so.
>
> I am willing to wait patiently for you to get back to these patches,
> and I hope you can agree that you should find a few minutes
> to confirm or deny Boris' statement that a fix is coming "soon"
> by posting a public message to this thread within the next two
> weeks, given that this regression has not been fixed for over three
> months. I will not be upset if you say something like: "it looks like
> it might take a while for Boris and me to work out the details of a fix,
> it might take until the end of the year," and briefly explain why there
> will be a delay. Boris might not like that because it would contradict
> his statement that a fix is coming "soon" but I would rather be told
> the truth - that the fix is going to be delayed, than be told a lie - that
> a fix is coming soon.
>
> Thanks for all the work you do.
>
> Best regards,
>
> Chuck