For non-atomic allocations, pcpu_alloc() can try to extend the area
map synchronously after dropping pcpu_lock.  However, the extension
wasn't synchronized against chunk destruction, so the chunk could be
freed while the extension was still in progress.

This patch fixes the bug by putting most of the non-atomic allocation
path under pcpu_alloc_mutex, which synchronizes it against
pcpu_balance_work, which is responsible for async chunk management
including destruction.

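The locking rule can be illustrated with a minimal userspace sketch.
It is not the kernel code: struct chunk, alloc_extend_map() and
balance_destroy_chunk() are hypothetical names, and a pthread mutex
stands in for pcpu_alloc_mutex.  It only shows that once extension and
destruction take the same mutex, one of them finishes before the other
starts.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a percpu chunk; not struct pcpu_chunk. */
struct chunk {
	int *map;		/* area map */
	size_t map_alloc;	/* entries allocated in map */
};

/* Plays the role of pcpu_alloc_mutex in this sketch. */
static pthread_mutex_t alloc_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct chunk *live_chunk;	/* NULL once the chunk is destroyed */

/* Create the single chunk used by the sketch.  Returns 0 on success. */
static int chunk_create(size_t map_alloc)
{
	struct chunk *c;

	assert(map_alloc > 0);
	c = malloc(sizeof(*c));
	if (!c)
		return -1;
	c->map = calloc(map_alloc, sizeof(*c->map));
	if (!c->map) {
		free(c);
		return -1;
	}
	c->map_alloc = map_alloc;

	pthread_mutex_lock(&alloc_mutex);
	live_chunk = c;
	pthread_mutex_unlock(&alloc_mutex);
	return 0;
}

/*
 * Non-atomic allocation path: the whole map extension runs under
 * alloc_mutex, so the chunk cannot be freed while it is being extended.
 * Returns 0 on success, -1 if the chunk is gone, the map is already
 * large enough, or memory allocation fails.
 */
static int alloc_extend_map(size_t new_alloc)
{
	int *new_map;
	int ret = -1;

	pthread_mutex_lock(&alloc_mutex);
	if (live_chunk && new_alloc > live_chunk->map_alloc) {
		new_map = calloc(new_alloc, sizeof(*new_map));
		if (new_map) {
			memcpy(new_map, live_chunk->map,
			       live_chunk->map_alloc * sizeof(*new_map));
			free(live_chunk->map);
			live_chunk->map = new_map;
			live_chunk->map_alloc = new_alloc;
			ret = 0;
		}
	}
	pthread_mutex_unlock(&alloc_mutex);
	return ret;
}

/* Stand-in for the async balance work: destroys the chunk under the mutex. */
static void balance_destroy_chunk(void)
{
	pthread_mutex_lock(&alloc_mutex);
	if (live_chunk) {
		free(live_chunk->map);
		free(live_chunk);
		live_chunk = NULL;
	}
	pthread_mutex_unlock(&alloc_mutex);
}
```

If alloc_extend_map() and balance_destroy_chunk() are called from
different threads, whichever takes the mutex first runs to completion;
an extension that arrives after destruction sees no chunk and fails
instead of touching freed memory.
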
Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
Reported-and-tested-by: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx>
Reported-by: Vlastimil Babka <vbabka@xxxxxxx>
Reported-by: Sasha Levin <sasha.levin@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v3.18+
Fixes: 1a4d76076cda ("percpu: implement asynchronous chunk population")