Re: [Nouveau] [PATCH 1/5] drm/nouveau: Prevent RPM callback recursion in suspend/resume paths
From: Rafael J. Wysocki
Date: Wed Jul 18 2018 - 17:49:11 EST
On Wed, Jul 18, 2018 at 10:11 PM, Lyude Paul <lyude@xxxxxxxxxx> wrote:
> On Wed, 2018-07-18 at 10:36 +0200, Lukas Wunner wrote:
>> On Wed, Jul 18, 2018 at 10:25:05AM +0200, Lukas Wunner wrote:
>> > The GPU contains an i2c subdevice for each connector with DDC lines.
>> > I believe those are modelled as children of the GPU's PCI device as
>> > they're accessed via mmio of the PCI device.
>> >
>> > The problem here is that when the GPU's PCI device runtime suspends,
>> > its i2c child device needs to be runtime active to suspend the MST
>> > topology. Catch-22.
>> >
>> > I don't know whether or not it's necessary to suspend the MST topology.
>> > I'm not an expert on DisplayPort Multi-Stream Transport.
>> >
>> > BTW Lyude, in patch 4 and 5 of this series, you're runtime resuming
>> > pad->i2c->subdev.device->dev. Is this the PCI device or is it the i2c
>> > device? I'm always confused by nouveau's structs. In nvkm_i2c_bus_ctor()
>> > I can see that the device you're runtime resuming is the parent of the
>> > i2c_adapter:
>> >
>> > struct nvkm_device *device = pad->i2c->subdev.device;
>> > [...]
>> > bus->i2c.dev.parent = device->dev;
>> >
>> > If the i2c_adapter is a child of the PCI device, it's sufficient
>> > to runtime resume the i2c_adapter, i.e. bus->i2c.dev, and this will
>> > implicitly runtime resume its parent.
>>
>> Actually, having written all this I just remembered that we have this
>> in the documentation:
>>
>> 8. "No-Callback" Devices
>>
>> Some "devices" are only logical sub-devices of their parent and cannot be
>> power-managed on their own. [...]
>>
>> Subsystems can tell the PM core about these devices by calling
>> pm_runtime_no_callbacks().
>>
>> So it might actually be sufficient to just call pm_runtime_no_callbacks()
>
> I would have hoped so, but unfortunately it seems that
> pm_runtime_no_callbacks() is already called by default for i2c adapters in
> i2c_register_adapter(). Unfortunately this really can't fix the problem
> though, because it will still try to runtime resume the parent device of the
> i2c adapter, which still leads to deadlocking in the runtime suspend/resume
> path.
Well, there has to be a way to suspend the whole thing without
recursion or similar.

If the adapter has no callbacks, then how is it possible for those
callbacks to invoke any runtime PM helpers for any other devices?
> Additionally, I did play around with ignore_children, but unfortunately this
> isn't good enough either as it just means that our i2c devices won't wake the
> GPU up on access.
So on the one hand you want them to stay active over a suspend of the
parent and on the other hand you want the parent to resume before
them. Are these requirements really consistent with each other?
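
To make the tension between those two requirements concrete, here is a rough userspace sketch. It is a toy model only: struct toy_dev, toy_resume(), toy_suspend() and the scenario helpers are hypothetical names invented for illustration, not the kernel's runtime PM core, and resume callbacks are left out for brevity. It assumes, as described above, that the parent's suspend callback needs the i2c child to be active, and that resuming a child resumes its parent first unless the parent ignores its children.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model. Hypothetical types, NOT the kernel's struct device. */
struct toy_dev {
	struct toy_dev *parent;
	struct toy_dev *helper;    /* device the suspend callback talks through */
	bool active;
	bool in_transition;        /* a suspend of this device is in progress */
	bool no_callbacks;         /* models pm_runtime_no_callbacks() */
	bool ignore_children;      /* models the parent's ignore_children flag */
	int (*suspend_cb)(struct toy_dev *dev);
};

/* Returns 0 on success, or -1 where a real system would block forever
 * waiting for a transition that is already running in this context. */
int toy_resume(struct toy_dev *dev)
{
	assert(dev != NULL);
	if (dev->in_transition)
		return -1;
	if (dev->active)
		return 0;
	/* A child pulls its parent up first, callbacks or not. */
	if (dev->parent && !dev->parent->ignore_children &&
	    toy_resume(dev->parent) != 0)
		return -1;
	dev->active = true;
	return 0;
}

int toy_suspend(struct toy_dev *dev)
{
	int ret = 0;

	assert(dev != NULL);
	if (dev->in_transition)
		return -1;
	if (!dev->active)
		return 0;
	dev->in_transition = true;
	if (!dev->no_callbacks && dev->suspend_cb)
		ret = dev->suspend_cb(dev);
	dev->in_transition = false;
	if (ret == 0)
		dev->active = false;
	return ret;
}

/* The parent's suspend callback needs the child (e.g. to reach a sink). */
static int toy_gpu_suspend_cb(struct toy_dev *gpu)
{
	if (toy_resume(gpu->helper) != 0)
		return -1;             /* child tried to resume the suspending parent */
	return toy_suspend(gpu->helper);
}

/* Result of suspending the parent while the child starts out suspended. */
int toy_gpu_suspend_result(bool ignore_children)
{
	struct toy_dev gpu = { .active = true, .ignore_children = ignore_children,
			       .suspend_cb = toy_gpu_suspend_cb };
	struct toy_dev i2c = { .parent = &gpu, .no_callbacks = true };

	gpu.helper = &i2c;
	return toy_suspend(&gpu);
}

/* Does touching the child wake a suspended parent? */
bool toy_child_access_wakes_gpu(bool ignore_children)
{
	struct toy_dev gpu = { .ignore_children = ignore_children };
	struct toy_dev i2c = { .parent = &gpu, .no_callbacks = true };

	toy_resume(&i2c);
	return gpu.active;
}
```

In this model, suspending the parent fails without ignore_children because the child's resume recurses into the parent that is mid-suspend, while with ignore_children the suspend goes through but a later access to the child no longer wakes the parent. Neither setting satisfies both requirements at once.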
> I'm pretty stumped here on trying to figure out any clean way to handle this
> in the PM core if recursive resume calls are off the table. The only possible
> solution I could see to this is if we could disable execution of runtime
> callbacks in the context of a certain task (while all other tasks have to
> honor the runtime PM callbacks), do what we need to do in suspend, then
> re-enable them
>> for the i2c devices...
This sounds completely broken to me, sorry.