Re: [PATCH] ACPI: PCI: Fix device reference counting in acpi_get_pci_dev()

From: Ville Syrjälä
Date: Wed Oct 19 2022 - 13:41:04 EST


On Wed, Oct 19, 2022 at 11:53:26AM -0500, Bjorn Helgaas wrote:
> On Wed, Oct 19, 2022 at 11:54:42AM +0300, Ville Syrjälä wrote:
> > On Tue, Oct 18, 2022 at 07:34:03PM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > >
> > > Commit 63f534b8bad9 ("ACPI: PCI: Rework acpi_get_pci_dev()") failed
> > > to reference count the device returned by acpi_get_pci_dev() as
> > > expected by its callers which in some cases may cause device objects
> > > to be dropped prematurely.
> > >
> > > Add the missing get_device() to acpi_get_pci_dev().
> > >
> > > Fixes: 63f534b8bad9 ("ACPI: PCI: Rework acpi_get_pci_dev()")
> >
> > FYI this (and the rtc-cmos regression discussed in
> > https://lore.kernel.org/linux-acpi/5887691.lOV4Wx5bFT@kreacher/)
> > took down the entire Intel gfx CI.
>
> From 1000 miles away and zero background with the gfx CI, this sounds
> like "our CI system, whose purpose is to find bugs, found one", which
> is a good thing.

Mostly. It's certainly better than it going entirely undetected.

Sadly, we only found it after rc1 because no one was really looking at
linux-next results. That is something we need to improve.

But ideally it would have been found by some other CI system
whose primary job is to prevent bugs in those subsystems, rather
than the one whose primary job is to prevent bugs in gfx drivers.
Also ideally it wouldn't have been me bisecting this :P

The biggest downside of bugs reaching our CI via rc1/etc. is that
it pretty much stops everyone from getting premerge results for
their graphics driver patches since the CI keeps tripping over
the already existing bugs. But I guess you can call this one a
somewhat self-inflicted wound and we should just try harder to
keep new code out of our tree until it's known to be healthy.

--
Ville Syrjälä
Intel