Re: [PATCH 4.4 095/108] Bluetooth: btusb: Restore QCA Rome suspend/resume fix with a "rewritten" version

From: Brian Norris
Date: Wed Feb 28 2018 - 14:40:10 EST


Hi Greg,

On Sat, Feb 17, 2018 at 7:24 AM, Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Sat, Feb 17, 2018 at 07:12:17AM -0800, Guenter Roeck wrote:
>> On 02/17/2018 05:43 AM, Greg Kroah-Hartman wrote:
>> > On Fri, Feb 16, 2018 at 10:52:20AM -0800, Guenter Roeck wrote:
>> > > On Fri, Feb 16, 2018 at 10:10:44AM -0800, Brian Norris wrote:
>> > > > On Fri, Feb 16, 2018 at 07:48:50AM +0100, Greg Kroah-Hartman wrote:
>> > > > > On Thu, Feb 15, 2018 at 06:31:48PM -0800, Brian Norris wrote:
>> > > > > > On Thu, Feb 15, 2018 at 04:17:32PM +0100, Greg Kroah-Hartman wrote:
>> > > > > > > 4.4-stable review patch. If anyone has any objections, please let me know.
>> > > > > >
>> > > > > > Consider this an objection:
>> > > > > >
>> > > > > > I'm currently arguing that this is unnecessarily regressing power
>> > > > > > consumption here:
>> > > > > >
>> > > > > > https://patchwork.kernel.org/patch/10149195/
>> > > > > >
>> > > > > > I'll leave it up to you what to do with this, but if this ends up in
>> > > > > > Chromium OS kernels, I'm likely to revert it there...
...
>> > Thanks, but I've now released all of these with this patch committed, so
>> > we are now "bug compatible" :)

So, is that to say that the boilerplate above about objections is
meaningless? This is the second time that this same "feature" has been
pushed (degrading the quality of my systems) despite my objections,
under the banner of "bug compatibility" [1]. The first attempt to
revert was back around Dec 20 of last year, but I see that there were
10 "stable" 4.4 kernels released in the meantime [2] where that
original bug was still present. (Commit fd865802c66b "Bluetooth:
btusb: fix QCA Rome suspend/resume" was proven undeniably buggy.)

Next: we see this current valiant attempt at a less buggy fix, by
Hans. It's an OK solution, but it still wastes power for me. I
objected above, but instead of delaying applying it, you applied it in
the same release as you finally fixed the original crap (v4.4.116). So
all-in-all, my system (if using 4.4.x directly) hasn't had decent
Bluetooth since v4.4.99.

At least things are still moving forward here, and maybe in another
month, I can expect a v4.4.x stable kernel that works well. But the
hilarious current state of things is that we're basically going back
to a no-op for the time being:

https://marc.info/?l=linux-bluetooth&m=151981547905651&w=2
https://marc.info/?l=linux-bluetooth&m=151981548105654&w=2
[PATCH] Bluetooth: btusb: Remove Yoga 920 from the
btusb_needs_reset_resume_table

(I know others are looking at properly identifying a DMI match list
still, so this won't stay a no-op.)

>> FWIW, seems to me that trying to be "bug compatible" with -rc1 upstream
>> kernels may not really be a good idea for stable releases.

I couldn't agree more.

> It's a tough trade-off. If I dropped this patch, the normal mode of
> operation would be for it to get merged into device kernels and then
> forgotten about. Only if/when the user with the problem moves to a
> newer release a long time later would the regression normally appear
> again, and everyone would have to remember what happened and try to
> piece it all together again as to what commit caused the issue.

Note that I didn't suggest we have to completely drop the patch. And I
also don't suspect you need to delay all -rc1 bugfixes. I'd just
suggest delaying the patch for a few weeks, when there are objections
raised. (Or, reverting and scheduling to re-queue in a few weeks if no
progress...or something like that.) Is that not something that could
work, in order to keep "stable" releases *actually* stable? In most
software release processes, buggy patches are reverted as quickly as
possible while alternatives are worked out. Not all fixes are security
fixes that need to be out the door as soon as they see the light of
day...

> By you adding the revert to your device kernel now, you have a record of
> this being a problem, how upstream isn't fixing the issue, and when/if
> you do move to a newer kernel, that bugfix will still be there in your
> patch stack to forward port.

So, you rely entirely on device kernels to manage the pain that your
release process causes? We're actively trying to stay much closer to
upstream these days, and would essentially like to eliminate the
concept of "device" kernels, at least for Chrom{e,ium} OS, if
possible. But it's crap like this that proves that we can't.

> Yeah, you all are normally better than that, and I trust that you will
> push to get this resolved, hopefully soon. But for the most part, this
> method works best overall for the majority of the cases like this as not
> all bug reporters are persistent, and if not, the maintainer usually
> forgets about it as no one is saying anything and they have other things
> to work on.
>
> Well, bluetooth is known to not have responsive maintainers, so who am I
> kidding here, odds are it's only going to get fixed as Hans is
> involved, despite the bluetooth maintainers :)

You can't pin this completely on the bluetooth maintainers. *You*
maintain the -stable trees, yet you effectively ignored both of my
objections, forcing me to rely on said original maintainers to queue
up alternatives. Yes, yes, I know the "forcing me [and/or Hans] to
work" is basically working as intended for you, but the hard facts
show that *your* release was broken for far too long.

Brian

[1] BTW, I've had multiple people laugh at me when I mentioned this
phrase in explaining our predicament to people.
[2] v4.4.108 to v4.4.116