Re: [PATCH] ACPI / LPSS: Don't skip late system PM ops for hibernate on BYT/CHT
From: Robert R. Howell
Date: Sat May 25 2019 - 01:49:32 EST
On 5/16/19 5:11 AM, Rafael J. Wysocki wrote:
>
> On Thursday, April 25, 2019 6:38:34 PM CEST Robert R. Howell wrote:
>> On 4/24/19 1:20 AM, Rafael J. Wysocki wrote:
>>
>>> On Tue, Apr 23, 2019 at 10:03 PM Robert R. Howell <RHowell@xxxxxxxx> wrote:
>>>>
>>>> On 4/23/19 2:07 AM, Rafael J. Wysocki wrote:
>>>>>
>>>>> On Sat, Apr 20, 2019 at 12:44 AM Robert R. Howell <RHowell@xxxxxxxx> wrote:
>>>>>>
>>>>>> On 4/18/19 5:42 AM, Hans de Goede wrote:
>>>>>>
>>>>>>>> On 4/8/19 2:16 AM, Hans de Goede wrote:>
>>>>>>>>>
>>>>>>>>> Hmm, interesting so you have hibernation working on a T100TA
>>>>>>>>> (with 5.0 + 02e45646d53b reverted), right ?
>>>>>>>>>
>>>>>>
>>>>>>
>>>>>> I've managed to find a way around the i2c_designware timeout issues
>>>>>> on the T100TA's. The key is to NOT set DPM_FLAG_SMART_SUSPEND,
>>>>>> which was added in the 02e45646d53b commit.
>>>>>>
>>>>>> To test that I've started with a 5.1-rc5 kernel, applied your recent patch
>>>>>> to acpi_lpss.c, then apply the following patch of mine, removing
>>>>>> DPM_FLAG_SMART_SUSPEND. (For the T100 hardware I need to apply some
>>>>>> other patches as well but those are not related to the i2c-designware or
>>>>>> acpi issues addressed here.)
>>>>>>
>>>>>> On a resume from hibernation I still see one error:
>>>>>> "i2c_designware 80860F41:00: Error i2c_dw_xfer called while suspended"
>>>>>> but I no longer get the i2c_designware timeouts, and audio does now work
>>>>>> after the resume.
>>>>>>
>>>>>> Removing DPM_FLAG_SMART_SUSPEND may not be what you want for other
>>>>>> hardware, but perhaps this will give you a clue as to what is going
>>>>>> wrong with hibernate/resume on the T100TA's.
>>>>>
>>>>> What if you drop DPM_FLAG_LEAVE_SUSPENDED alone instead?
>>>>>
>>>>
>>>> I did try dropping just DPM_FLAG_LEAVE_SUSPENDED, dropping just
>>>> DPM_FLAG_SMART_SUSPEND, and dropping both flags. When I just drop
>>>> DPM_FLAG_LEAVE_SUSPENDED I still get the i2c_designware timeouts
>>>> after the resume. If I drop just DPM_FLAG_SMART_SUSPEND or drop both,
>>>> then the timeouts go away.
>>>
>>> OK, thanks!
>>>
>>> Is non-hibernation system suspend affected too?
>>
>> I just ran some tests on a T100TA, using the 5.1-rc5 code with Hans' patch applied
>> but without any changes to i2c-designware-platdrv.c, so the
>> DPM_FLAG_SMART_PREPARE, DPM_FLAG_SMART_SUSPEND, and DPM_FLAG_LEAVE_SUSPENDED flags
>> are all set.
>>
>> Suspend does work OK, and after resume I do NOT get any of the crippling
>> i2c_designware timeout errors which cause sound to fail after hibernate. I DO see one
>> "i2c_designware 80860F41:00: Error i2c_dw_xfer call while suspended"
>> error on resume, just as I do on hibernate. I've attached a portion of dmesg below.
>> The "asus_wmi: Unknown key 79 pressed" error is a glitch which occurs
>> intermittently on these machines, but doesn't seem related to the other issues.
>> I had one test run when it was absent but the rest of the messages were the
>> same -- but then kept getting that unknown key error on all my later tries.
>>
>> I did notice the "2sidle" in the following rather than "shallow" or "deep". A
>> cat of /sys/power/state shows "freeze mem disk" but a
>> cat of /sys/power/mem_sleep" shows only "[s2idle] so it looks like shallow and deep
>> are not enabled for this system. I did check the input power (or really current)
>> as it went into suspend and the micro-usb power input drops from about
>> 0.5 amps to 0.05 amps. But clearly a lot of devices are still active, as movement
>> of a bluetooth mouse (the MX Anywhere 2) will wake it from suspend. That presumably is
>> why suspend doesn't trigger the same i2c_designware problems as hibernate.
>>
>> Let me know if I can do any other tests.
>
> Can you please check if the appended patch makes the hibernate issue go away for you, without any other changes?
>
> ---
> drivers/pci/pci-driver.c | 36 ++++++++++--------------------------
> 1 file changed, 10 insertions(+), 26 deletions(-)
>
> Index: linux-pm/drivers/pci/pci-driver.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-driver.c
> +++ linux-pm/drivers/pci/pci-driver.c
> @@ -957,15 +957,14 @@ static int pci_pm_freeze(struct device *
> }
>
> /*
> - * This used to be done in pci_pm_prepare() for all devices and some
> - * drivers may depend on it, so do it here. Ideally, runtime-suspended
> - * devices should not be touched during freeze/thaw transitions,
> - * however.
> + * Resume all runtime-suspended devices before creating a snapshot
> + * image of system memory, because the restore kernel generally cannot
> + * be expected to always handle them consistently and pci_pm_restore()
> + * always leaves them as "active", so ensure that the state saved in the
> + * image will always be consistent with that.
> */
> - if (!dev_pm_smart_suspend_and_suspended(dev)) {
> - pm_runtime_resume(dev);
> - pci_dev->state_saved = false;
> - }
> + pm_runtime_resume(dev);
> + pci_dev->state_saved = false;
>
> if (pm->freeze) {
> int error;
> @@ -992,9 +991,6 @@ static int pci_pm_freeze_noirq(struct de
> struct pci_dev *pci_dev = to_pci_dev(dev);
> struct device_driver *drv = dev->driver;
>
> - if (dev_pm_smart_suspend_and_suspended(dev))
> - return 0;
> -
> if (pci_has_legacy_pm_support(pci_dev))
> return pci_legacy_suspend_late(dev, PMSG_FREEZE);
>
> @@ -1024,16 +1020,6 @@ static int pci_pm_thaw_noirq(struct devi
> struct device_driver *drv = dev->driver;
> int error = 0;
>
> - /*
> - * If the device is in runtime suspend, the code below may not work
> - * correctly with it, so skip that code and make the PM core skip all of
> - * the subsequent "thaw" callbacks for the device.
> - */
> - if (dev_pm_smart_suspend_and_suspended(dev)) {
> - dev_pm_skip_next_resume_phases(dev);
> - return 0;
> - }
> -
> if (pcibios_pm_ops.thaw_noirq) {
> error = pcibios_pm_ops.thaw_noirq(dev);
> if (error)
> @@ -1093,8 +1079,10 @@ static int pci_pm_poweroff(struct device
>
> /* The reason to do that is the same as in pci_pm_suspend(). */
> if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) ||
> - !pci_dev_keep_suspended(pci_dev))
> + !pci_dev_keep_suspended(pci_dev)) {
> pm_runtime_resume(dev);
> + pci_dev->state_saved = false;
> + }
>
> pci_dev->state_saved = false;
> if (pm->poweroff) {
> @@ -1168,10 +1156,6 @@ static int pci_pm_restore_noirq(struct d
> struct device_driver *drv = dev->driver;
> int error = 0;
>
> - /* This is analogous to the pci_pm_resume_noirq() case. */
> - if (dev_pm_smart_suspend_and_suspended(dev))
> - pm_runtime_set_active(dev);
> -
> if (pcibios_pm_ops.restore_noirq) {
> error = pcibios_pm_ops.restore_noirq(dev);
> if (error)
>
>
>
I've finally managed to complete a reasonable set of tests on my T100TA using your
2nd patch from above, and on a 5.1.4 based kernel with ONLY this patch applied I can
successfully suspend and hibernate the system. On resume the i2c devices (in particular the sound)
do still work OK. On the first resume from suspend or hibernate I DO still see
the "i2c_designware 80860F41:00: Transfer while suspended" error, followed by a
stack trace. However as before that error doesn't seem to cause any permanent problems.
and with your patch applied I do NOT get the subsequent i2c timeout errors
that prevent further use of the sound system. I don't see any other reported errors.
As usual I need to have a systemd script which removes the brcmfmac and hci_uart drivers
before hibernate, then installs them on resume, or those drivers fail to work after resume.
I had initially tried testing your patch with the 5.1.0 kernel that I had been
using for testing a few weeks ago, and also the just released 5.2-rc1 kernel.
I think your patch is also fixing the i2c timeout errors with those, but with both of those
other kernels I have a lot of other issues which have confused the results. I've been trying
to sort out exactly what causes or avoids those issues over the last several days.
However that has turned into enough of a can of worms that unless you really want
to know those details, I'll avoid that "noise" here except a VERY brief summary.
With 5.1.0 and ONLY your patch, resume from hibernate does seem to work, and I DO NOT
get i2c_timeout errors, but I do see a number of errors and stack traces reported
from other i2c devices such as the ATML1000 touchscreen controller. However the devices
all do seem to work once the resume is completed. The just released 5.2-rc1 kernel
seems to have introduced problems with at least one other device, brcmfmac, with
or without your patch. I have to blacklist it or the system hangs on either suspend
or hibernate. (And working wifi never did come up even on the initial boot.)
But with brcmfmac blacklisted your patch seems to fix the i2c_designware SOUND timeouts.
Let me know if I can do any further testing.
Bob Howell