Re: [PATCH 4/5] PCI / PM: Check for error when reading Power State

From: Bjorn Helgaas
Date: Tue Aug 13 2019 - 21:08:22 EST


On Wed, Aug 14, 2019 at 12:59:26AM +0200, Rafael J. Wysocki wrote:
> On Saturday, August 10, 2019 12:01:16 AM CEST Bjorn Helgaas wrote:
> > On Mon, Aug 05, 2019 at 11:09:19PM +0200, Rafael J. Wysocki wrote:
> > > On Mon, Aug 5, 2019 at 10:52 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:

> > > > @@ -942,7 +942,7 @@ void pci_update_current_state(struct pci_dev *dev, pci_power_t state)
> > > >  	u16 pmcsr;
> > > >
> > > >  	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> > > > -	dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
> > > > +	dev->current_state = pci_power_state(pmcsr);
> > >
> > > The if () branch above should cover the D3cold case, shouldn't it?
> >
> > You mean the "if (platform_pci_get_power_state(dev) == PCI_D3cold)"
> > test?
>
> Not exactly.
>
> I mean "if (platform_pci_get_power_state(dev) == PCI_D3cold ||
> !pci_device_is_present(dev))".

I don't see what you mean. The !pci_device_is_present(dev) test tells
us something about the state of the device at some time in the past,
but it says nothing about whether a later read of PCI_PM_CTRL will
succeed, e.g.,

  # dev is present and in D0
  platform_pci_get_power_state(dev) == PCI_D3cold   # currently false
  !pci_device_is_present(dev)                       # currently false
  # dev is surprise hot-removed or put in D3cold
  pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr)
  # pmcsr == ~0 (error response)

(Maybe going to D3cold is impossible, but it's pretty hard to prove
that. The hot-remove is definitely possible.)

> > platform_pci_get_power_state() returns PCI_UNKNOWN in some cases.
> > When that happens, might we not read PCI_PM_CTRL of a device in
> > D3cold? I think this also has the same hotplug question as above.
>
> Surprise hot-removal can take place at any time, in particular after setting
> current_state, so adding extra checks here doesn't prevent the value of
> it from becoming stale at least sometimes anyway.

Definitely. The point is not to prevent current_state from becoming
stale; it's to prevent us from interpreting ~0 data (known to be
invalid) as though it were a valid register value.
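
To make the concern concrete: masking an all-ones read with
PCI_PM_CTRL_STATE_MASK (0x0003) yields 3, which decodes as D3hot. Below
is a minimal userspace sketch of the kind of check I mean. Every
sketch_* name is hypothetical and only stands in for the kernel
definitions, and mapping ~0 to D3cold is an assumption made for
illustration, not necessarily what the helper in this patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's pci_power_t values. */
enum sketch_power_state {
	SKETCH_D0 = 0,
	SKETCH_D1 = 1,
	SKETCH_D2 = 2,
	SKETCH_D3HOT = 3,
	SKETCH_D3COLD = 4,
};

/* Same value as PCI_PM_CTRL_STATE_MASK: low two bits of PMCSR. */
#define SKETCH_PM_CTRL_STATE_MASK 0x0003u

/* A config read that gets no response from the device returns all ones. */
static bool sketch_pmcsr_is_error(uint16_t pmcsr)
{
	return pmcsr == (uint16_t)~0u;
}

/*
 * Decode the power state from a PMCSR value, refusing to treat the
 * all-ones error response as a valid register value.
 * Assumption: an unreadable device is reported as D3cold.
 */
static enum sketch_power_state sketch_power_state(uint16_t pmcsr)
{
	if (sketch_pmcsr_is_error(pmcsr))
		return SKETCH_D3COLD;

	return (enum sketch_power_state)(pmcsr & SKETCH_PM_CTRL_STATE_MASK);
}
```

With this, sketch_power_state(0xFFFF) reports D3cold, whereas the bare
mask would have reported D3hot for a device that never answered.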

Bjorn