Re: [PATCH 0/3] allow zram to use zbud as underlying allocator

From: Vlastimil Babka
Date: Wed Oct 14 2015 - 09:28:53 EST


On 10/10/2015 11:33 AM, Vitaly Wool wrote:
> On Thu, Oct 1, 2015 at 9:52 AM, Vlastimil Babka <vbabka@xxxxxxx> wrote:
>> On 09/30/2015 05:46 PM, Vitaly Wool wrote:
>>>
>>> On Wed, Sep 30, 2015 at 5:37 PM, Vlastimil Babka <vbabka@xxxxxxx> wrote:
>>>>
>>>> On 09/25/2015 11:54 AM, Vitaly Wool wrote:
>>>>>
>>>>>
>>>>> Hello Minchan,
>>>>>
>>>>> the main use case where I see unacceptably long stalls in UI with
>>>>> zsmalloc is switching between users in Android.
>>>>> There is a way to automate user creation and switching between them so
>>>>> the test I run both to get vmstat statistics and to profile stalls is
>>>>> to create a user, switch to it and switch back. Each test cycle does
>>>>> that 10 times, and all the results presented below are averages for 20
>>>>> runs.
>>>>>
>>>>> Kernel configurations used for testing:
>>>>>
>>>>> (1): vanilla
>>>>> (2): (1) plus "make SLUB atomic" patch [1]
>>>>> (3): (1) with zbud instead of zsmalloc
>>>>> (4): (2) with compaction defer logic mostly disabled
>>>>
>>>>
>>>>
>>>> Disabling compaction deferring leads to fewer compaction stalls? That
>>>> indeed looks very weird and counter-intuitive. Also, what does "mostly"
>>>> disabled mean?
>>>
>>>
>>> Not that I'm not surprised myself. However, this is how it goes.
>>> Namely, I reverted the following patches:
>>> - mm, compaction: defer each zone individually instead of preferred zone
>>
>>
>> Oh, I see. Then you didn't disable compaction defer logic, but made it
>> coarse again instead of per-zone. Which means that an allocation that can be
>> satisfied from Normal zone will use the Normal zone's deferred state to
>> decide whether to compact also DMA and DMA32 zones *within the same
>> allocation attempt*. So by reverting the patch you might indeed get lower
>> compact_stall (and success+failure) counts, but each stall will try to
>> compact all three zones. With individual defer, some stalls might be just for
>> DMA32, some just for Normal, and the total number might be higher, but the
>> compaction overhead should be better distributed among all the attempts.
>
> The thing is, this happens on an ARM64 and I only have one zone there.

Hmm, then it shouldn't make a difference... unless there's a bug.

>> Looking at your latencies, looks like that's working fine:
>>
>>>
>>> The UI is blocked after user switching for, on average:
>>> (1) 1.84 seconds
>>> (2) 0.89 seconds
>>> (3) 1.32 seconds
>>> (4) 0.87 seconds
>>
>>
>> Average for (2) vs (4) is roughly the same, I would guess within noise.
>
> That I surely won't argue with :)
>
>>> The UI is blocked after user switching for, worst-case:
>>> (1) 2.91 seconds
>>> (2) 1.12 seconds
>>> (3) 1.79 seconds
>>> (4) 1.34 seconds
>>
>>
>> The worst case is actually worse without individual defer, because you end
>> up compacting all zones in each single stall. With individual defer, there's
>> a low probability of that happening.
>
> Okay, but in case of a single zone, isn't this more fine-grained logic
> resulting in more defers and fewer async compactions?

In the case of a single zone, there is only that one zone to consider with or
without the patch, so the result should be the same.
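
To make the argument concrete, here is a rough toy model of the two defer
schemes. It is only a sketch and not the kernel code: the Zone class, the
simulate() helper and the fails() outcome oracle are all made up for
illustration, and the back-off rule (skip up to 2**shift attempts after a
failure, capped at shift 6) is a simplified assumption about how deferring
behaves. With one zone, the "preferred zone" and "each zone individually"
paths touch exactly the same state, so they produce the same stalls.

```python
from dataclasses import dataclass

MAX_DEFER_SHIFT = 6  # assumed cap on the back-off exponent


@dataclass
class Zone:
    name: str
    considered: int = 0   # attempts counted since the last failure
    defer_shift: int = 0  # back-off exponent: skip up to 2**shift attempts

    def deferred(self) -> bool:
        """Count one attempt and report whether it should be skipped."""
        limit = 1 << self.defer_shift
        self.considered = min(self.considered + 1, limit)
        return self.considered < limit

    def defer(self) -> None:
        """Record a failed compaction: restart the count, back off further."""
        self.considered = 0
        self.defer_shift = min(self.defer_shift + 1, MAX_DEFER_SHIFT)


def simulate(zone_names, attempts, per_zone, fails):
    """Return, for each stall, the list of zone names that were compacted.

    fails(name, attempt) -> bool is a hypothetical oracle saying whether
    compacting that zone on that attempt fails.
    """
    zones = [Zone(n) for n in zone_names]
    preferred = zones[0]
    stalls = []
    for attempt in range(attempts):
        if per_zone:
            # Individual defer: each zone decides for itself.
            targets = [z for z in zones if not z.deferred()]
        else:
            # Coarse defer: the preferred zone decides for all zones.
            targets = [] if preferred.deferred() else list(zones)
        if not targets:
            continue  # whole attempt skipped, so no stall
        stalls.append([z.name for z in targets])
        failed = [z for z in targets if fails(z.name, attempt)]
        if per_zone:
            for z in failed:
                z.defer()
        elif len(failed) == len(targets):
            preferred.defer()
    return stalls


def always_fails(name, attempt):
    return True


single_per_zone = simulate(["Normal"], 50, per_zone=True, fails=always_fails)
single_coarse = simulate(["Normal"], 50, per_zone=False, fails=always_fails)
print(single_per_zone == single_coarse)
```

Passing three zone names instead of one shows the other half of the argument:
in the coarse mode every stall compacts all three zones, while in the per-zone
mode a stall may cover only a subset of them.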

>>> - mm, compaction: embed migration mode in compact_control
>>
>>
>> This probably affects just THPs.
>>
>>> - mm, compaction: add per-zone migration pfn cache for async compaction
>>
>>
>> Hard to say what's the effect of this.
>>
>>> - mm: compaction: encapsulate defer reset logic
>>
>>
>> This is just code consolidation.
>>
>>> ~vitaly
>>>
>>
>
