Re: [PATCH 0/3] allow zram to use zbud as underlying allocator

From: Vitaly Wool
Date: Sat Oct 10 2015 - 05:34:20 EST


On Thu, Oct 1, 2015 at 9:52 AM, Vlastimil Babka <vbabka@xxxxxxx> wrote:
> On 09/30/2015 05:46 PM, Vitaly Wool wrote:
>>
>> On Wed, Sep 30, 2015 at 5:37 PM, Vlastimil Babka <vbabka@xxxxxxx> wrote:
>>>
>>> On 09/25/2015 11:54 AM, Vitaly Wool wrote:
>>>>
>>>>
>>>> Hello Minchan,
>>>>
>>>> the main use case where I see unacceptably long stalls in UI with
>>>> zsmalloc is switching between users in Android.
>>>> There is a way to automate user creation and switching between them so
>>>> the test I run both to get vmstat statistics and to profile stalls is
>>>> to create a user, switch to it and switch back. Each test cycle does
>>>> that 10 times, and all the results presented below are averages for 20
>>>> runs.
>>>>
>>>> Kernel configurations used for testing:
>>>>
>>>> (1): vanilla
>>>> (2): (1) plus "make SLUB atomic" patch [1]
>>>> (3): (1) with zbud instead of zsmalloc
>>>> (4): (2) with compaction defer logic mostly disabled
>>>
>>>
>>>
>>> Disabling compaction deferring leads to less compaction stalls? That
>>> indeed
>>> looks very weird and counter-intuitive. Also what's "mostly" disabled
>>> mean?
>>
>>
>> Not that I'm not surprised myself. However, this is how it goes.
>> Namely, I reverted the following patches:
>> - mm, compaction: defer each zone individually instead of preferred zone
>
>
> Oh, I see. Then you didn't disable compaction defer logic, but made it
> coarse again instead of per-zone. Which means that an allocation that can be
> satisfied from Normal zone will use the Normal zone's deferred state to
> decide whether to compact also DMA and DMA32 zones *within the same
> allocation attempt*. So by reverting the patch you might indeed get less
> compact_stall (and success+failure) counts, but each stall will try to
> compact all three zones. With individual defer, some stall might be just for
> DMA32, some just for Normal, and the total number might be higher, but the
> compaction overhead should be better distributed among all the attempts.

The thing is, this happens on an ARM64 and I only have one zone there.

> Looking at your latencies, looks like that's working fine:
>
>>
>> The UI is blocked after user switching for, average:
>> (1) 1.84 seconds
>> (2) 0.89 seconds
>> (3) 1.32 seconds
>> (4) 0.87 seconds
>
>
> Average for (2) vs (4) is roughly the same, I would guess within noise.

That I surely won't argue with :)

>> The UI us blocked after user switching for, worst-case:
>> (1) 2.91
>> (2) 1.12
>> (3) 1.79
>> (4) 1.34
>
>
> The worst case is actually worse without individual defer, because you end
> up compacting all zones in each single stall. With individual defer, there's
> a low probability of that happening.

Okay, but in case of a single zone, isn't this more fine-grained logic
resulting in more defers and less async compactions?

>> - mm, compaction: embed migration mode in compact_control
>
>
> This probably affects just THPs.
>
>> - mm, compaction: add per-zone migration pfn cache for async compaction
>
>
> Hard to say what's the effect of this.
>
>> - ïmm: compaction: encapsulate defer reset logic
>
>
> This is just code consolidation.
>
>> ~vitaly
>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/