Re: [memcg] 0f12156dff: will-it-scale.per_process_ops -33.6% regression

From: Jens Axboe
Date: Tue Sep 07 2021 - 12:14:28 EST


On 9/7/21 9:57 AM, Shakeel Butt wrote:
> On Tue, Sep 7, 2021 at 8:46 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> On 9/7/21 9:07 AM, kernel test robot wrote:
>>>
>>>
>>> Greeting,
>>>
>>> FYI, we noticed a -33.6% regression of will-it-scale.per_process_ops due to commit:
>>>
>>>
>>> commit: 0f12156dff2862ac54235fc72703f18770769042 ("memcg: enable accounting for file lock caches")
>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> Are we at all worried about these? There's been a number of them
>> reported, basically for all the accounting enablements that have been
>> done in this merge window.
>>
>> When io_uring was switched to use accounted memory, we did a bunch of
>> work to ameliorate the inevitable slowdowns that happen if you do
>> repeated allocs and/or frees and have memcg accounting enabled.
>>
>
> I think these are important and we should aim to continuously improve
> performance with memcg accounting. I would like to know more about the
> io_uring work done to improve memcg accounting. Maybe we can
> generalize it to others as well.

It's pretty basic and may not be applicable to all cases: we simply hang
on to our allocations for longer periods and reuse them. Hence, instead
of always going through alloc+free for each "unit", they are recycled
and reused until no longer needed.
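Roughly, the pattern is something like the below. This is just a minimal
sketch of the idea, not the actual io_uring code, and the my_* names are
made up for illustration:

/*
 * A per-context free list fronting the (memcg-accounted) slab, so that
 * steady-state alloc/free pairs never hit kmalloc/kfree at all.
 */
#include <linux/slab.h>
#include <linux/list.h>

struct my_req {
        struct list_head cache_list;
        /* ... request state ... */
};

struct my_ctx {
        struct list_head req_cache;     /* recycled requests, INIT_LIST_HEAD'd at setup */
};

static struct my_req *my_req_get(struct my_ctx *ctx)
{
        struct my_req *req;

        if (!list_empty(&ctx->req_cache)) {
                req = list_first_entry(&ctx->req_cache, struct my_req,
                                       cache_list);
                list_del(&req->cache_list);
                return req;
        }
        /* Slow path: pay the accounted allocation once */
        return kmalloc(sizeof(*req), GFP_KERNEL_ACCOUNT);
}

static void my_req_put(struct my_ctx *ctx, struct my_req *req)
{
        /* Recycle instead of kfree(); freed for real at ctx teardown */
        list_add(&req->cache_list, &ctx->req_cache);
}

A real version of course has to worry about locking and about bounding
how much memory stays parked (and still charged) in the cache.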

Now this is more efficient in general for us, as we can have a very high
rate of requests (and hence allocs+frees). I suspect most use cases
would benefit from simply having a cache in front of the memcg slabs, but
that seems like solving the issue at the wrong layer. IMHO it'd be
better to have the memcg accounting done in batches, e.g. have some
notion of deferred frees. If someone allocates before the deferred frees
are accounted, that saves two accounting operations (the uncharge and
the subsequent charge).
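Very roughly, something along these lines. Purely illustrative; the
memcg_defer_* helpers are hypothetical and not existing memcg API:

/*
 * A free parks its charge instead of uncharging right away; if an
 * allocation of the same kind comes in before the deferred charges are
 * flushed, it consumes a parked charge and both accounting operations
 * are skipped.
 */
#include <linux/types.h>

struct memcg_defer_cache {
        unsigned int nr_deferred;       /* objects freed but still charged */
};

/* free path: defer the uncharge instead of doing it immediately */
static void memcg_defer_uncharge(struct memcg_defer_cache *dc)
{
        dc->nr_deferred++;
        /* a threshold or timer would eventually flush these for real */
}

/* alloc path: steal a deferred charge if one is parked */
static bool memcg_consume_deferred(struct memcg_defer_cache *dc)
{
        if (dc->nr_deferred) {
                dc->nr_deferred--;
                return true;            /* skip the charge entirely */
        }
        return false;                   /* do the normal accounted charge */
}

The acceptable slack is then simply how many parked charges you're
willing to let sit before flushing them for real.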

It is of course possible that a lot of these regressions are simply due
to accounting the alloc itself, in which case accounting in batches might
help there too. It all depends on how much slack is acceptable for memcg.

--
Jens Axboe