Re: [PATCH] PCI: Add a mutex to protect the global list pci_domain_busn_res_list

From: Bjorn Helgaas
Date: Thu Apr 25 2024 - 18:51:46 EST


On Thu, Apr 18, 2024 at 06:53:02PM -0700, Dexuan Cui wrote:
> There has been an effort to make the pci-hyperv driver support
> async-probing to reduce the boot time. With async-probing, multiple
> kernel threads can be running hv_pci_probe() -> create_root_hv_pci_bus() ->
> pci_scan_root_bus_bridge() -> pci_bus_insert_busn_res() at the same time to
> update the global list, causing list corruption.
>
> Add a mutex to protect the list.

I think it's a good idea to support probing multiple PCI root buses in
parallel.

The problem in get_pci_domain_busn_res() is the global
pci_domain_busn_res_list. I'm not even sure what that list contains,
since it's a lookup by "domain_nr". In the hv case, you probably have
one host bridge per domain, but in general there may be multiple root
buses in the same domain, e.g.,

ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00-16])
ACPI: PCI Root Bridge [PC01] (domain 0000 [bus 17-39])
ACPI: PCI Root Bridge [PC02] (domain 0000 [bus 3a-5c])
...
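
To make the concern concrete, here is a rough user-space model of a
lookup keyed only by domain number. All names are made up for
illustration and are not the kernel's; the 0x00-0xff range is an
assumption, since the quoted hunk elides that line, and the model
prepends to the list where the patch appends. With domain_nr as the
only key, PC00 and PC01 above would both get the same entry back:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one per-domain list entry. */
struct domain_busn_model {
	struct domain_busn_model *next;
	int domain_nr;
	unsigned int start;
	unsigned int end;
};

/* Global list head, shared by every caller (the part that needs locking). */
static struct domain_busn_model *domain_list;

/* Return the entry for a domain, creating it on first use. */
static struct domain_busn_model *get_domain_busn_model(int domain_nr)
{
	struct domain_busn_model *r;

	for (r = domain_list; r; r = r->next)
		if (r->domain_nr == domain_nr)
			return r;

	r = calloc(1, sizeof(*r));
	if (!r)
		return NULL;

	r->domain_nr = domain_nr;
	r->start = 0x00;
	r->end = 0xff;	/* assumed: the whole bus number space */

	/* Prepend for brevity; ordering does not matter to the lookup. */
	r->next = domain_list;
	domain_list = r;
	return r;
}
```

Two calls with domain 0 return the same pointer, so the entry describes
the domain as a whole rather than any single root bus.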

We only use get_pci_domain_busn_res() for root buses, and we should
know the bus number range for root buses when we set up the struct
pci_host_bridge, so it seems like we should keep the bus number
resource there instead of allocating it in this sort of random place.

Then we shouldn't need this weird pci_domain_busn_res_list at all.
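
A minimal sketch of that direction, as a user-space model rather than
kernel code: the structs, the embedded busn_res member, and both
helpers are hypothetical names, and the flag value is only illustrative.
Each bridge carries its own range, so there is no shared list to lock:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct resource. */
struct resource_model {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/* Illustrative flag value for this model only. */
#define MODEL_RES_BUS 0x1000UL

/* Hypothetical host bridge that embeds its own bus number range. */
struct host_bridge_model {
	int domain_nr;
	struct resource_model busn_res;
};

/* Record the range at bridge setup time, when it is already known. */
static void host_bridge_set_busn(struct host_bridge_model *bridge,
				 int domain_nr,
				 unsigned long start, unsigned long end)
{
	assert(bridge != NULL && start <= end);
	bridge->domain_nr = domain_nr;
	bridge->busn_res.start = start;
	bridge->busn_res.end = end;
	bridge->busn_res.flags = MODEL_RES_BUS;
}

/* No global list and no allocation: the resource lives in the bridge. */
static struct resource_model *
host_bridge_busn_res(struct host_bridge_model *bridge)
{
	return &bridge->busn_res;
}
```

With this shape, two bridges in the same domain (say bus 00-16 and bus
17-39 in domain 0000) each hold a distinct range, and concurrent probes
only touch their own bridge.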

> Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
> ---
> drivers/pci/probe.c | 25 ++++++++++++++++++-------
> 1 file changed, 18 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index e19b79821dd6..1327fd820b24 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -37,6 +37,7 @@ LIST_HEAD(pci_root_buses);
>  EXPORT_SYMBOL(pci_root_buses);
>
>  static LIST_HEAD(pci_domain_busn_res_list);
> +static DEFINE_MUTEX(pci_domain_busn_res_list_lock);
>
>  struct pci_domain_busn_res {
>  	struct list_head list;
> @@ -47,14 +48,22 @@ struct pci_domain_busn_res {
>  static struct resource *get_pci_domain_busn_res(int domain_nr)
>  {
>  	struct pci_domain_busn_res *r;
> +	struct resource *ret;
>
> -	list_for_each_entry(r, &pci_domain_busn_res_list, list)
> -		if (r->domain_nr == domain_nr)
> -			return &r->res;
> +	mutex_lock(&pci_domain_busn_res_list_lock);
> +
> +	list_for_each_entry(r, &pci_domain_busn_res_list, list) {
> +		if (r->domain_nr == domain_nr) {
> +			ret = &r->res;
> +			goto out;
> +		}
> +	}
>
>  	r = kzalloc(sizeof(*r), GFP_KERNEL);
> -	if (!r)
> -		return NULL;
> +	if (!r) {
> +		ret = NULL;
> +		goto out;
> +	}
>
>  	r->domain_nr = domain_nr;
>  	r->res.start = 0;
> @@ -62,8 +71,10 @@ static struct resource *get_pci_domain_busn_res(int domain_nr)
>  	r->res.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED;
>
>  	list_add_tail(&r->list, &pci_domain_busn_res_list);
> -
> -	return &r->res;
> +	ret = &r->res;
> +out:
> +	mutex_unlock(&pci_domain_busn_res_list_lock);
> +	return ret;
>  }
>
> /*
> --
> 2.25.1
>