Re: Re: [PATCH] zram: Prevent page allocation failure during zcomp_strm_alloc

From: kyeongdon.kim
Date: Thu Nov 19 2015 - 08:49:17 EST


Hello,
On 2015-11-19 ìí 6:45, Sergey Senozhatsky wrote:
> Hello,
>
> On (11/19/15 15:54), kyeongdon.kim wrote:
>> When we use lzo/lz4 multi compression streams for zram,
>> we might face page allocation failure. In order to make parallel
>> compression private data, we should call kzalloc() with order 2/3
>> in runtime. But if there is no order 2/3 size memory to allocate
>> in that time, page allocation fails.
>>
>> This patch makes new zstrm stream list in zcomp_create() or
>> max_comp_streams set, which means calling (kzalloc()) in runtime
>> is not needed. This prevents page alloc failure.
>>
>
> The decision here is to not create unneded streams (streams are not
> free) -- we create one we need it. And if allocation fails we just
> wait for already existing stream to become available (and we are
> guaranteed to have at least one stream)
>
> zcomp_strm_multi_find
> {
> ...
> while (1) {
> spin_lock(&zs->strm_lock);
> if (!list_empty(&zs->idle_strm)) {
> zstrm = list_entry(zs->idle_strm.next,
> struct zcomp_strm, list);
> list_del(&zstrm->list);
> spin_unlock(&zs->strm_lock);
> return zstrm;
> }
> ...
> /* allocate new zstrm stream */
> zs->avail_strm++;
> spin_unlock(&zs->strm_lock);
>
> zstrm = zcomp_strm_alloc(comp);
> if (!zstrm) {
> spin_lock(&zs->strm_lock);
> zs->avail_strm--;
> spin_unlock(&zs->strm_lock);
> wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
> continue;
> }
> break;
> }
> return zstrm;
> }
>
>
> So, are you trying to fix the warning from zcomp_strm_multi_find()? What
> prevents this warning from happening in zcomp_strm_multi_set_max_streams()
> or zcomp_strm_multi_create()? e.g. I requested 64 streams and ended up not
> having enought memory? The warning will still be there and the streams will
> not be allocated. What's the gain?
>
>
> (besides, your implementation is racy. ->list and ->avail_strm should
> not be modified outside of ->strm_lock).
>
> -ss
>

I know what you mean (streams are not free).
First of all, I'm sorry I would have to tell you why I try this patch.
When we're using LZ4 multi stream for zram swap, I found out this
allocation failure message in system running test,
That was not only once, but a few. Also, some failure cases were
continually occurring to try allocation order 3.
I believed that status makes time delay issue to process launch.
So I tried to modify this by pre-allocating and solved it, even if there
was small memory using in advance.

But because of this patch, if there is an unneeded memory using.
I want to try new patch from another angle.
Firstly, we can change 'vmalloc()' instead of 'kzalloc()' for the
'zstrm->private".
Secondly, we can use GFP_NOWAIT flag to avoid this warning after 2nd
stream allocation.
How do you think about it? (btw, sorry for spinlock missing)
Thanks,

>> For reference a call trace :
>>
>> Binder_1: page allocation failure: order:3, mode:0x10c0d0
>> CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty
> #20
>> Call trace:
>> [<ffffffc0002069c8>] dump_backtrace+0x0/0x270
>> [<ffffffc000206c48>] show_stack+0x10/0x1c
>> [<ffffffc000cb51c8>] dump_stack+0x1c/0x28
>> [<ffffffc0002bbfc8>] warn_alloc_failed+0xfc/0x11c
>> [<ffffffc0002bf518>] __alloc_pages_nodemask+0x724/0x7f0
>> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
>> [<ffffffc0002ed6a4>] kmalloc_order_trace+0x38/0xd8
>> [<ffffffc0005d9738>] zcomp_lz4_create+0x2c/0x38
>> [<ffffffc0005d78f0>] zcomp_strm_alloc+0x34/0x78
>> [<ffffffc0005d7a58>] zcomp_strm_multi_find+0x124/0x1ec
>> [<ffffffc0005d7c14>] zcomp_strm_find+0xc/0x18
>> [<ffffffc0005d8fa0>] zram_bvec_rw+0x2fc/0x780
>> [<ffffffc0005d9680>] zram_make_request+0x25c/0x2d4
>> [<ffffffc00040f8ac>] generic_make_request+0x80/0xbc
>> [<ffffffc00040f98c>] submit_bio+0xa4/0x15c
>> [<ffffffc0002e8bb0>] __swap_writepage+0x218/0x230
>> [<ffffffc0002e8c04>] swap_writepage+0x3c/0x4c
>> [<ffffffc0002c7384>] shrink_page_list+0x51c/0x8d0
>> [<ffffffc0002c7e88>] shrink_inactive_list+0x3f8/0x60c
>> [<ffffffc0002c86c8>] shrink_lruvec+0x33c/0x4cc
>> [<ffffffc0002c8894>] shrink_zone+0x3c/0x100
>> [<ffffffc0002c8c10>] try_to_free_pages+0x2b8/0x54c
>> [<ffffffc0002bf308>] __alloc_pages_nodemask+0x514/0x7f0
>> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
>> [<ffffffc0003446cc>] proc_info_read+0x50/0xe4
>> [<ffffffc0002f5204>] vfs_read+0xa0/0x12c
>> [<ffffffc0002f59c8>] SyS_read+0x44/0x74
>> DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
>> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB
>>
>> Signed-off-by: kyeongdon.kim <kyeongdon.kim@xxxxxxx>
>> ---
>> drivers/block/zram/zcomp.c | 33 ++++++++++++++++++++++++++++-----
>> 1 file changed, 28 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
>> index f1ff39a..45f39b9 100644
>> --- a/drivers/block/zram/zcomp.c
>> +++ b/drivers/block/zram/zcomp.c
>> @@ -158,6 +158,21 @@ static bool
> zcomp_strm_multi_set_max_streams(struct zcomp *comp, int num_strm)
>> struct zcomp_strm_multi *zs = comp->stream;
>> struct zcomp_strm *zstrm;
>>
>> + if (zs->avail_strm < num_strm) {
>> + while (1) {
>> + /* allocate new zstrm stream */
>> + zs->avail_strm++;
>> + zstrm = zcomp_strm_alloc(comp);
>> + if (!zstrm) {
>> + zs->avail_strm--;
>> + kfree(zs);
>> + return false;
>> + }
>> + list_add(&zstrm->list, &zs->idle_strm);
>> + if (zs->avail_strm == num_strm)
>> + break;
>> + }
>> + }
>> spin_lock(&zs->strm_lock);
>> zs->max_strm = num_strm;
>> /*
>> @@ -209,12 +224,20 @@ static int zcomp_strm_multi_create(struct zcomp
> *comp, int max_strm)
>> zs->max_strm = max_strm;
>> zs->avail_strm = 1;
>>
>> - zstrm = zcomp_strm_alloc(comp);
>> - if (!zstrm) {
>> - kfree(zs);
>> - return -ENOMEM;
>> + while (1) {
>> + /* allocate new zstrm stream */
>> + zs->avail_strm++;
>> + zstrm = zcomp_strm_alloc(comp);
>> + if (!zstrm) {
>> + zs->avail_strm--;
>> + kfree(zs);
>> + return -ENOMEM;
>> + }
>> + list_add(&zstrm->list, &zs->idle_strm);
>> + if (zs->avail_strm == max_strm)
>> + break;
>> }
>> - list_add(&zstrm->list, &zs->idle_strm);
>> +
>> return 0;
>> }
>>
>> --
>> 1.8.2
>>
With Best Regards,
Kyeongdon Kim

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/