Re: [PATCH v4] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

From: Mika Westerberg
Date: Wed Nov 27 2019 - 06:49:02 EST


On Tue, Nov 26, 2019 at 06:10:36PM -0500, Lyude Paul wrote:
> Hey-this is almost certainly not the right place in this thread to respond,
> but this thread has gotten so deep evolution can't push the subject further to
> the right, heh. So I'll just respond here.

:)

> I've been following this and helping out Karol with testing here and there.
> They had me test Bjorn's PCI branch on the X1 Extreme 2nd generation, which
> has a turing GPU and 8086:1901 PCI bridge.
>
> I was about to say "the patch fixed things, hooray!" but it seems that after
> trying runtime suspend/resume a couple times things fall apart again:

You mean $subject patch, no?

> [ 686.883247] nouveau 0000:01:00.0: DRM: suspending object tree...
> [ 752.866484] ACPI Error: Aborting method \_SB.PCI0.PEG0.PEGP.NVPO due to previous error (AE_AML_LOOP_TIMEOUT) (20190816/psparse-529)
> [ 752.866508] ACPI Error: Aborting method \_SB.PCI0.PGON due to previous error (AE_AML_LOOP_TIMEOUT) (20190816/psparse-529)
> [ 752.866521] ACPI Error: Aborting method \_SB.PCI0.PEG0.PG00._ON due to previous error (AE_AML_LOOP_TIMEOUT) (20190816/psparse-529)

This is probably the culprit. The same AML code fails to properly turn
on the device.

Is acpidump from this system available somewhere?