Re: [PATCH v2 3/3] percpu: improve allocation success rate for non-GFP_KERNEL callers

From: Tahsin Erdogan
Date: Mon Feb 27 2017 - 15:40:00 EST


On Mon, Feb 27, 2017 at 12:29 PM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> Hello,
>
> On Mon, Feb 27, 2017 at 12:27:08PM -0800, Tahsin Erdogan wrote:
>> A better example is the call path below:
>>
>> pcpu_alloc+0x68f/0x710
>> __alloc_percpu_gfp+0xd/0x10
>> __percpu_counter_init+0x55/0xc0
>> cfq_pd_alloc+0x3b2/0x4e0
>> blkg_alloc+0x187/0x230
>> blkg_create+0x489/0x670
>> blkg_lookup_create+0x9a/0x230
>> blkg_conf_prep+0x1fb/0x240
>> __cfqg_set_weight_device.isra.105+0x5c/0x180
>> cfq_set_weight_on_dfl+0x69/0xc0
>> cgroup_file_write+0x39/0x1c0
>> kernfs_fop_write+0x13f/0x1d0
>> __vfs_write+0x23/0x120
>> vfs_write+0xc2/0x1f0
>> SyS_write+0x44/0xb0
>> entry_SYSCALL_64_fastpath+0x18/0xad
>>
>> A failure in this call path gives grief to tools which are trying to
>> configure io weights. We see occasional failures happen here shortly
>> after reboots even when system is not under any memory pressure.
>> Machines with a lot of cpus are obviously more vulnerable.
>
> Ah, absolutely, that's a stupid failure but we should be able to fix
> that by making the blkg functions take gfp mask and allocate
> accordingly, right? It'll probably take preallocation tricks because
> of locking but should be doable.

My initial goal was to allow calls to vmalloc(), but I now see the
challenges in that approach.

Preallocation would probably work, but I am not sure it can be done
without complicating the code too much. Could you describe what you
have in mind?
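
For reference, the general shape of a "preallocate outside the lock"
pattern can be sketched in userspace C. This is only an illustration
under stated assumptions, not the blkg code and not necessarily what is
being proposed: struct group, table_lock, lookup_locked() and
lookup_create() are all hypothetical names, a pthread mutex stands in
for the kernel lock, and malloc() stands in for a blocking allocation.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-group object (not struct blkcg_gq). */
struct group {
	int id;
	struct group *next;
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct group *table_head;

/* Walk the list looking for id. Caller must hold table_lock. */
static struct group *lookup_locked(int id)
{
	struct group *g;

	for (g = table_head; g; g = g->next)
		if (g->id == id)
			return g;
	return NULL;
}

/*
 * Allocate before taking the lock, where blocking is allowed, then
 * insert under the lock. If another caller created the entry in the
 * meantime, the unused preallocation is freed after unlocking.
 */
struct group *lookup_create(int id)
{
	struct group *prealloc = malloc(sizeof(*prealloc));
	struct group *g;

	if (!prealloc)
		return NULL;

	pthread_mutex_lock(&table_lock);
	g = lookup_locked(id);
	if (!g) {
		prealloc->id = id;
		prealloc->next = table_head;
		table_head = prealloc;
		g = prealloc;
		prealloc = NULL;	/* ownership moved to the table */
	}
	pthread_mutex_unlock(&table_lock);

	free(prealloc);		/* free(NULL) is a no-op */
	return g;
}
```

Calling lookup_create() twice with the same id returns the same
pointer. The cost of the pattern is one wasted allocation whenever the
entry already exists or a concurrent caller wins the race.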