Re: 5.17-rc regression: rmi4 clients cannot deal with asynchronous suspend? (was: X1 Carbon touchpad not resumed)

From: Rajat Jain
Date: Mon Feb 07 2022 - 16:42:16 EST


+linux-input@xxxxxxxxxxxxxxx

On Mon, Feb 7, 2022 at 1:09 PM Rajat Jain <rajatja@xxxxxxxxxx> wrote:
>
> +Rafael (for any inputs on asynchronous suspend / resume)
> +Dmitry Torokhov (since no other maintainer of rmi4 in MAINTAINERS file)
> +loic.poulain@xxxxxxxxxx (who fixed RMI device hierarchy recently)
> + Some Synaptics folks (from recent commits - Vincent Huang, Andrew
> Duggan, Cheiny)
>
> On Mon, Feb 7, 2022 at 12:23 PM Wolfram Sang <wsa@xxxxxxxxxx> wrote:
> >
> > Hello Hugh,
> >
> > > Bisection led to 172d931910e1db800f4e71e8ed92281b6f8c6ee2
> > > ("i2c: enable async suspend/resume on i2c client devices")
> > > and reverting that fixes it for me.
> >
> > Thank you for the report plus bisection and sorry for the regression!
>
> +1, thanks for the bisection, and apologies for the inconvenience.
>
> The problem here seems to be that some devices (all connected to the
> rmi4 adapter) fail to resume, but only when asynchronous suspend is
> enabled (by 172d931910e1):
>
> [ 79.221064] rmi4_smbus 6-002c: failed to get SMBus version number!
> [ 79.265074] rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed
> to read current IRQ mask.
> [ 79.308330] rmi4_f01 rmi4-00.fn01: Failed to restore normal operation: -6.
> [ 79.308335] rmi4_f01 rmi4-00.fn01: Resume failed with code -6.
> [ 79.308339] rmi4_physical rmi4-00: Failed to suspend functions: -6
> [ 79.308342] rmi4_smbus 6-002c: Failed to resume device: -6
> [ 79.351967] rmi4_physical rmi4-00: Failed to read irqs, code=-6
>
> A resume failure that shows up only during asynchronous resume
> typically means that the device depends on some other device
> resuming first, but this dependency is NOT expressed as a parent-child
> relationship (which is wrong and needs to be fixed, perhaps using
> device_link_add()). Thus the kernel may be resuming these devices
> without first resuming the device they depend on.
>
> TBH, I'm not sure how to fix this. The only hint I see is that all of
> these devices seem to be attached to the rmi4 device, so perhaps
> something there? I see 6e4860410b828f recently fixed the device
> hierarchy for rmi4, and so seemingly should have fixed this very issue
> (as its commit message also suggests)?
>
> >
> > I will wait a few days if people come up with a fix. If not, I will
> > revert the offending commit.
>
> I'll be sad, because a revert means no i2c client can resume in
> parallel, which increases resume latency by a *LOT* (hundreds of ms
> on all Linux systems), but I understand that it needs to be done
> unless someone comes up with a fix.
>
> Can you confirm that the following patches will stay?
>
> d320ec7acc83 i2c: enable async suspend/resume for i2c adapters
> 7c5b3c158b38 i2c: designware: Enable async suspend / resume of
> designware devices
>
> Thanks & Best Regards,
>
> Rajat
>
>
> >
> > All the best,
> >
> > Wolfram
> >