Re: [PATCH -v2] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free

From: Thomas Gleixner
Date: Tue Sep 05 2017 - 04:13:54 EST


On Tue, 5 Sep 2017, Huang, Ying wrote:

> From: Huang Ying <ying.huang@xxxxxxxxx>
>
> When developing code to bootup some APs (Application CPUs)
> asynchronously, the following kernel panic is encountered. After
> checking the code, it is found that the irq_to_desc() may return NULL
> during CPU hotplug. So the NULL pointer checking is added to fix
> this.

You forgot to describe why this can happen. "After checking the code" is
not really helpful for someone who looks at that commit.

for_each_active_irq() is iterated with the sparse lock held in both cases
(cpu up and down). So if there is an active bit in the sparse map and the
the radix tree entry is empty then there is an inconsistency. The
inconsistency originates from the way the irq descriptor allocation/free is
implemented. The bitmap is set/cleared seperately from the actual pointer
store/remove in the radix tree:

The allocation side:

irq_sparse_lock();
bitmap_set();
irq_sparse_unlock();

desc = alloc();
irq_sparse_lock();
store_in_radix_tree(irq, desc);
irq_sparse_unlock();

The deallocation side:

irq_sparse_lock();
store_in_radix_tree(irq, NULL);
irq_sparse_unlock();

irq_sparse_lock();
bitmap_clear();
irq_sparse_unlock();

So the real question is, whether we keep it that way and have the extra
checks all over the place or simply extend the protected sections in the
alloc/free path.

Untested patch below.

Thanks,

tglx
8<----------------

kernel/irq/irqdesc.c | 17 ++++-------------
1 file changed, 4 insertions(+), 13 deletions(-)

Index: b/kernel/irq/irqdesc.c
===================================================================
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -421,10 +421,8 @@ static void free_desc(unsigned int irq)
* The sysfs entry must be serialized against a concurrent
* irq_sysfs_init() as well.
*/
- mutex_lock(&sparse_irq_lock);
kobject_del(&desc->kobj);
delete_irq_desc(irq);
- mutex_unlock(&sparse_irq_lock);

/*
* We free the descriptor, masks and stat fields via RCU. That
@@ -462,20 +460,14 @@ static int alloc_descs(unsigned int star
desc = alloc_desc(start + i, node, flags, mask, owner);
if (!desc)
goto err;
- mutex_lock(&sparse_irq_lock);
irq_insert_desc(start + i, desc);
irq_sysfs_add(start + i, desc);
- mutex_unlock(&sparse_irq_lock);
}
return start;

err:
for (i--; i >= 0; i--)
free_desc(start + i);
-
- mutex_lock(&sparse_irq_lock);
- bitmap_clear(allocated_irqs, start, cnt);
- mutex_unlock(&sparse_irq_lock);
return -ENOMEM;
}

@@ -670,10 +662,10 @@ void irq_free_descs(unsigned int from, u
if (from >= nr_irqs || (from + cnt) > nr_irqs)
return;

+ mutex_lock(&sparse_irq_lock);
for (i = 0; i < cnt; i++)
free_desc(from + i);

- mutex_lock(&sparse_irq_lock);
bitmap_clear(allocated_irqs, from, cnt);
mutex_unlock(&sparse_irq_lock);
}
@@ -727,10 +719,9 @@ int __ref
if (ret)
goto err;
}
-
- bitmap_set(allocated_irqs, start, cnt);
- mutex_unlock(&sparse_irq_lock);
- return alloc_descs(start, cnt, node, affinity, owner);
+ ret = alloc_descs(start, cnt, node, affinity, owner);
+ if (ret >= 0)
+ bitmap_set(allocated_irqs, start, cnt);

err:
mutex_unlock(&sparse_irq_lock);