RE: Regression on Dell XPS13 (was: [char-misc for 4.10-rc4 V2] mei: bus: enable OS version only for SPT and newer)

From: Mario.Limonciello
Date: Tue Jan 17 2017 - 13:49:59 EST


> -----Original Message-----
> From: Greg KH [mailto:gregkh@xxxxxxxxxxxxxxxxxxx]
> Sent: Tuesday, January 17, 2017 12:24 PM
> To: Limonciello, Mario <Mario_Limonciello@xxxxxxxx>
> Cc: pmenzel@xxxxxxxxxxxxx; rafael.j.wysocki@xxxxxxxxx;
> linux@xxxxxxxxxxxxx; tomas.winkler@xxxxxxxxx; jan@xxxxxxxxxx;
> alexander.usyskin@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> yu.c.chen@xxxxxxxxx; tomi.p.sarvela@xxxxxxxxx; daniel@xxxxxxxxx;
> len.brown@xxxxxxxxx; linux-pm@xxxxxxxxxxxxxxx
> Subject: Re: Regression on Dell XPS13 (was: [char-misc for 4.10-rc4 V2] mei:
> bus: enable OS version only for SPT and newer)
>
> On Tue, Jan 17, 2017 at 04:57:49PM +0000, Mario.Limonciello@xxxxxxxx
> wrote:
> > So in the <6s scenario, the intel-hid driver is responsible to receive
> > the ACPI event and process accordingly. The maintainer has a patch
> > ready for the intel-hid portion of this work, but it's currently being
> > reviewed by Intel to ensure it can be legally submitted into the kernel.
>
> Who at Intel do I need to go kick to make this mythical legal review happen
> faster so we can see the code?
>
> Len and Rafael, what is going on here?
>

Len and Darren are both in the loop on the discussion around this patch.
I don't know if they'll have any (public) comments they can add on the matter
yet however.

We haven't broached the topic of the ACPI subsystem being frozen with Rafael
as it doesn't affect helping this problem until the intel-hid patch is added.

Rafael, since this is in process now, what are your thoughts?

> > > 406e79385f3223d82272cf2be86bc95cd000a258 is the first bad commit
> > > commit
> > > 406e79385f3223d82272cf2be86bc95cd000a258
> > > Author: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > Date: Mon Nov 21 22:45:40 2016 +0100
> > >
> > > PM / sleep: System sleep state selection interface rework
> > >
> > > There are systems in which the platform doesn't support any special
> > > sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
> > > available system sleep state. However, some user space frameworks
> > > only use the "mem" and (sometimes) "standby" sleep state labels, so
> > > the users of those systems need to modify user space in order to be
> > > able to use system suspend at all and that may be a pain in practice.
> > >
> > > Commit 0399d4db3edf (PM / sleep: Introduce command line argument
> for
> > > sleep state enumeration) attempted to address this problem by adding
> > > a command line argument to change the meaning of the "mem" string
> in
> > > /sys/power/state to make it trigger suspend-to-idle (instead of
> > > suspend-to-RAM).
> > >
> > > However, there also are systems in which the platform does support
> > > special sleep states, but suspend-to-idle is the preferred one anyway
> > > (it even may save more energy than the platform-provided sleep
> states
> > > in some cases) and the above commit doesn't help in those cases.
> > >
> > > For this reason, rework the system sleep state selection interface
> > > again (but preserve backwards compatibiliby). Namely, add a new
> > > sysfs file, /sys/power/mem_sleep, that will control the system
> > > suspend mode triggered by writing "mem" to /sys/power/state (in
> > > analogy with what /sys/power/disk does for hibernation). Make it
> > > select suspend-to-RAM ("deep" sleep) by default (if supported) and
> > > fall back to suspend-to-idle ("s2idle") otherwise and add a new
> > > command line argument, mem_sleep_default, allowing that default to
> > > be overridden if need be.
> > >
> > > At the same time, drop the relative_sleep_states command line
> > > argument that doesn't make sense any more.
> > >
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > Tested-by: Mario Limonciello <mario.limonciello@xxxxxxxx>
> > >
> > > :040000 040000 a5770fe0413cbe7794eab28f72dfe8ede1f090c2
> > > a2882c77659517aa7137c930e0e7f178bc76bfbd M Documentation
> > > :040000 040000 2594b1a87815173e97199ef9d9c5918fec22fcfd
> > > fe0b69953be653644c5366ac631131fdbbdb9bcc M kernel
> > > $ git tag --contains 406e79385f3223d82272cf2be86bc95cd000a258
> > > v4.10-rc1
> > > v4.10-rc2
> > > v4.10-rc3
> > > ```
> > >
> > > Please find the config and the bisection log attached. Sometimes I
> > > had to cherry-pick build fix commits for PIE enabled GCC on top.
> > >
> > > Rafael, do you want me to open a bug report for that? Mario, what
> > > systems did you actually test this on? (Why isnât that listed in the
> > > commit message?) Mario, do you have access to Dell XPS13 (9360)
> > > devices to help getting this fixed?
> > >
> >
> > I tested this on several systems and ensured that the kernel was doing
> > the right thing in terms of choosing the correct state to go into.
> >
> > As I mentioned above, those behaviors are currently expected until
> > these types issues are identified and fixed in the proper subsystems.
> > If they end up being troublesome to resolve, it's possible to quirk
> > individual systems, to disable this behavior and return to traditional
> > S3 but I would prefer to actually identify and fix the various problems so
> that we can push the Linux stack forward.
>
> If the machine stops working on a newer kernel when it used to work
> before, then you need to either revert the change, or provide a fix for it.
>

So in my mind there isn't exactly a clear cut definition of "stops working",
particularly when it's an intentional change in behavior.

If the desire is to go back to S3 on this system, I'm happy to submit a patch
that will quirk this back to S3 on this system until we can get ACPI to not
freeze and Intel-HID to pick up the event it needs.

> I think I might have access to a newer Dell XPS13 now, I'll try to set it up
> tomorrow to test 4.10-rc4 out...
>

If you don't, please contact me privately and I'm happy to try to help get
you access to one.