Re: [PATCH percpu/for-4.7-fixes 1/2] percpu: fix synchronization between chunk->map_extend_work and chunk destruction

From: Vlastimil Babka
Date: Thu May 26 2016 - 05:19:21 EST


On 05/25/2016 05:44 PM, Tejun Heo wrote:
Atomic allocations can trigger async map extensions which is serviced
by chunk->map_extend_work. pcpu_balance_work which is responsible for
destroying idle chunks wasn't synchronizing properly against
chunk->map_extend_work and may end up freeing the chunk while the work
item is still in flight.

This patch fixes the bug by rolling async map extension operations
into pcpu_balance_work.

Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
Reported-and-tested-by: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx>
Reported-by: Vlastimil Babka <vbabka@xxxxxxx>
Reported-by: Sasha Levin <sasha.levin@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v3.18+
Fixes: 9c824b6a172c ("percpu: make sure chunk->map array has available space")

I didn't spot issues, but I'm not that familiar with the code, so it doesn't mean much. Just one question below:

---
mm/percpu.c | 57 ++++++++++++++++++++++++++++++++++++---------------------
1 file changed, 36 insertions(+), 21 deletions(-)

--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -112,7 +112,7 @@ struct pcpu_chunk {
int map_used; /* # of map entries used before the sentry */
int map_alloc; /* # of map entries allocated */
int *map; /* allocation map */
- struct work_struct map_extend_work;/* async ->map[] extension */
+ struct list_head map_extend_list;/* on pcpu_map_extend_chunks */

void *data; /* chunk data */
int first_free; /* no free below this */
@@ -166,6 +166,9 @@ static DEFINE_MUTEX(pcpu_alloc_mutex); /

static struct list_head *pcpu_slot __read_mostly; /* chunk list slots */

+/* chunks which need their map areas extended, protected by pcpu_lock */
+static LIST_HEAD(pcpu_map_extend_chunks);
+
/*
* The number of empty populated pages, protected by pcpu_lock. The
* reserved chunk doesn't contribute to the count.
@@ -395,13 +398,19 @@ static int pcpu_need_to_extend(struct pc
{
int margin, new_alloc;

+ lockdep_assert_held(&pcpu_lock);
+
if (is_atomic) {
margin = 3;

if (chunk->map_alloc <
- chunk->map_used + PCPU_ATOMIC_MAP_MARGIN_LOW &&
- pcpu_async_enabled)
- schedule_work(&chunk->map_extend_work);
+ chunk->map_used + PCPU_ATOMIC_MAP_MARGIN_LOW) {
+ if (list_empty(&chunk->map_extend_list)) {

So why this list_empty condition? Doesn't it deserve a comment then? And isn't using a list an overkill in that case?

Thanks.

+ list_add_tail(&chunk->map_extend_list,
+ &pcpu_map_extend_chunks);
+ pcpu_schedule_balance_work();
+ }
+ }
} else {
margin = PCPU_ATOMIC_MAP_MARGIN_HIGH;
}

[...]