Re: Re: Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()

From: kyeongdon.kim
Date: Mon Nov 30 2015 - 21:32:30 EST


On 2015-12-01 ìì 8:18, Minchan Kim wrote:
> Hi Kyeongdon,
>
> On Mon, Nov 30, 2015 at 07:42:02PM +0900, kyeongdon.kim wrote:
>
>> > On Fri, Nov 27, 2015 at 01:10:49PM +0900, Sergey Senozhatsky wrote:
>> >> From: Kyeongdon Kim <kyeongdon.kim@xxxxxxx>
>> >>
>> >> When we're using LZ4 multi compression streams for zram swap,
>> >> we found out page allocation failure message in system running test.
>> >> That was not only once, but a few(2 - 5 times per test).
>> >> Also, some failure cases were continually occurring to try allocation
>> >> order 3.
>> >>
>> >> In order to make parallel compression private data, we should call
>> >> kzalloc() with order 2/3 in runtime(lzo/lz4). But if there is no order
>> >> 2/3 size memory to allocate in that time, page allocation fails.
>> >> This patch makes to use vmalloc() as fallback of kmalloc(), this
>> >> prevents page alloc failure warning.
>> >>
>> >> After using this, we never found warning message in running test, also
>> >> It could reduce process startup latency about 60-120ms in each case.
>> >>
>> >> For reference a call trace :
>> >>
>> >> Binder_1: page allocation failure: order:3, mode:0x10c0d0
>> >> CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty
>> > #20
>> >> Call trace:
>> >> [<ffffffc0002069c8>] dump_backtrace+0x0/0x270
>> >> [<ffffffc000206c48>] show_stack+0x10/0x1c
>> >> [<ffffffc000cb51c8>] dump_stack+0x1c/0x28
>> >> [<ffffffc0002bbfc8>] warn_alloc_failed+0xfc/0x11c
>> >> [<ffffffc0002bf518>] __alloc_pages_nodemask+0x724/0x7f0
>> >> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
>> >> [<ffffffc0002ed6a4>] kmalloc_order_trace+0x38/0xd8
>> >> [<ffffffc0005d9738>] zcomp_lz4_create+0x2c/0x38
>> >> [<ffffffc0005d78f0>] zcomp_strm_alloc+0x34/0x78
>> >> [<ffffffc0005d7a58>] zcomp_strm_multi_find+0x124/0x1ec
>> >> [<ffffffc0005d7c14>] zcomp_strm_find+0xc/0x18
>> >> [<ffffffc0005d8fa0>] zram_bvec_rw+0x2fc/0x780
>> >> [<ffffffc0005d9680>] zram_make_request+0x25c/0x2d4
>> >> [<ffffffc00040f8ac>] generic_make_request+0x80/0xbc
>> >> [<ffffffc00040f98c>] submit_bio+0xa4/0x15c
>> >> [<ffffffc0002e8bb0>] __swap_writepage+0x218/0x230
>> >> [<ffffffc0002e8c04>] swap_writepage+0x3c/0x4c
>> >> [<ffffffc0002c7384>] shrink_page_list+0x51c/0x8d0
>> >> [<ffffffc0002c7e88>] shrink_inactive_list+0x3f8/0x60c
>> >> [<ffffffc0002c86c8>] shrink_lruvec+0x33c/0x4cc
>> >> [<ffffffc0002c8894>] shrink_zone+0x3c/0x100
>> >> [<ffffffc0002c8c10>] try_to_free_pages+0x2b8/0x54c
>> >> [<ffffffc0002bf308>] __alloc_pages_nodemask+0x514/0x7f0
>> >> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
>> >> [<ffffffc0003446cc>] proc_info_read+0x50/0xe4
>> >> [<ffffffc0002f5204>] vfs_read+0xa0/0x12c
>> >> [<ffffffc0002f59c8>] SyS_read+0x44/0x74
>> >> DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
>> >> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB
>> >>
>> >> [minchan: change vmalloc gfp and adding comment about gfp]
>> >> [sergey: tweak comments and styles]
>> >> Signed-off-by: Kyeongdon Kim <kyeongdon.kim@xxxxxxx>
>> >> Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
>> >
>> > Kyeongdon, Could you test this patch on your device?
>> >
>> > Thanks.
>>
>> Sorry to have kept you waiting,
>> Obviously, I couldn't see allocation fail message with this patch.
>> But, there is something to make some delay(not sure yet this is normal).
>
> You mean new changes makes start-up delay of your application sometime
> still,
> but not frequent like old?
>
I couldn't see start-up delay during my test after this patch.
But, I checked the return value from alloc function like the below :

static void *zcomp_lz4_create(void)
<snip>
ret = kzalloc(LZ4_MEM_COMPRESS, GFP_NOIO | __GFP_NORETRY |
__GFP_NOWARN | __GFP_NOMEMALLOC);
printk("%s: %d: ret = %p\n",__func__,__LINE__,ret); //line 32
if (!ret) {
ret = __vmalloc(LZ4_MEM_COMPRESS,
GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN |
__GFP_NOMEMALLOC | __GFP_ZERO | __GFP_HIGHMEM,
PAGE_KERNEL);
printk("%s: %d: ret = %p\n",__func__,__LINE__,ret); //line 38
}
return ret;

log message :
[ 352.226014][0] zcomp_lz4_create: 32: ret = (null)
[ 352.226035][0] zcomp_lz4_create: 38: ret = (null)
[ 352.226791][0] zcomp_lz4_create: 32: ret = (null)
[ 352.226809][0] zcomp_lz4_create: 38: ret = (null)
[ 352.230348][0] zcomp_lz4_create: 32: ret = (null)
[ 352.230369][0] zcomp_lz4_create: 38: ret = (null)
[ 352.230460][0] zcomp_lz4_create: 32: ret = (null)
[ 352.230485][0] zcomp_lz4_create: 38: ret = (null)
[ 352.230507][0] zcomp_lz4_create: 32: ret = (null)
[ 352.230520][0] zcomp_lz4_create: 38: ret = (null)
[ 352.230608][0] zcomp_lz4_create: 32: ret = (null)
[ 352.230619][0] zcomp_lz4_create: 38: ret = (null)
[ 352.230888][0] zcomp_lz4_create: 32: ret = (null)
[ 352.230902][0] zcomp_lz4_create: 38: ret = (null)
[ 352.231406][0] zcomp_lz4_create: 32: ret = ffffffc002088000
[ 352.234024][0] zcomp_lz4_create: 32: ret = (null)
[ 352.234060][0] zcomp_lz4_create: 38: ret = (null)
[ 352.234359][0] zcomp_lz4_create: 32: ret = (null)
[ 352.234384][0] zcomp_lz4_create: 38: ret = (null)
[ 352.234618][0] zcomp_lz4_create: 32: ret = (null)
[ 352.234639][0] zcomp_lz4_create: 38: ret = (null)
[ 352.234667][0] zcomp_lz4_create: 32: ret = (null)
[ 352.234685][0] zcomp_lz4_create: 38: ret = (null)
[ 352.234738][0] zcomp_lz4_create: 32: ret = (null)
[ 352.234748][0] zcomp_lz4_create: 38: ret = (null)
[ 352.234800][0] zcomp_lz4_create: 32: ret = (null)
[ 352.234816][0] zcomp_lz4_create: 38: ret = (null)
[ 352.234852][0] zcomp_lz4_create: 32: ret = (null)
[ 352.234865][0] zcomp_lz4_create: 38: ret = (null)
[ 352.235136][0] zcomp_lz4_create: 32: ret = (null)
[ 352.235179][0] zcomp_lz4_create: 38: ret = ffffff80016a4000

I thought this pattern from vmalloc is not normal.
>>
>> static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
>> {
>> <snip>
>>
>> zstrm->private = comp->backend->create();
>> ^ // sometimes, return 'null' continually(2-5times)
>
> Hmm, I think it is caused by __GFP_NOMEMALLOC.
> Could you test it without the flag?
>
>>
>> As you know, if there is 'null' return, this function is called again to
>> get a memory in while() loop. I just checked this one with printk().
>>
>> If you guys don't mind, I'll test more with trace log to check time
> delay.
>
> No problem.
>
>>
>> However, If this is fully expectable status to you.
>> I think I don't need to do it.
>
> It's not what I expected. Actually, I thought failure of vmalloc
> in that place should be *really really* rare. I think it's caused by
> __GFP_NOMEMALLOC so I want to see test result without the flag.
>
> Thanks for the careful test!
>
You're welcome.

After I removed flag '__GFP_NOMEMALLOC', I couldn't find return 'null'
from vmalloc until now.

log message :
<4>[ 2288.954934][0] KDKIM: zcomp_lz4_create: 24: ret = (null)
<4>[ 2288.954972][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff800287e000
..<snip>..
<4>[ 2289.092411][0] KDKIM: zcomp_lz4_create: 24: ret = (null)
<4>[ 2289.092546][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028b5000
..<snip>..
<4>[ 2289.135628][0] KDKIM: zcomp_lz4_create: 24: ret = (null)
<4>[ 2289.135642][0] KDKIM: zcomp_lz4_create: 24: ret = (null)
<4>[ 2289.135729][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028be000
<4>[ 2289.135732][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028c7000

Thanks,
Kyeongdon Kim
>
>>
>> Thanks,
>> Kyeongdon Kim

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/