Re: [PATCH] memcg: sync flush only if periodic flush is delayed

From: Michal Koutný
Date: Mon Mar 14 2022 - 08:57:12 EST


Hi.

On Sat, Mar 12, 2022 at 07:07:15PM +0000, Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
> So, I will focus on the error rate in this email.

(OK, I'll stick to the long-term error estimate in this message and
will send another message about the current patch.)

> [...]
>
> > The benefit this was traded for was the greater accuracy, the possible
> > error is:
> > - before
> > - O(nr_cpus * nr_cgroups(subtree) * MEMCG_CHARGE_BATCH) (1)
>
> Please note that (1) is the possible error for each stat item and
> without any time bound.

I agree (I forgot to highlight that this error can persist indefinitely).
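
(To get a feel for the magnitude, with purely illustrative numbers:
128 CPUs, a subtree of 1000 cgroups and MEMCG_CHARGE_BATCH = 32 give a
pre-patch worst case of 128 * 1000 * 32 ≈ 4.1M unflushed page-sized
updates per stat item, with no bound on how long they stay unflushed.)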

>
> > - after
> > O(nr_cpus * MEMCG_CHARGE_BATCH) // sync. flush
>
> The above is across all the stat items.

Can that be used to argue about the per-item error?
E.g.
nr_cpus * MEMCG_CHARGE_BATCH / nr_counters
looks appealing, but IMO that's too optimistic.

Updates to the individual items are correlated, so in practice a single
item would see a lower error than my first relation gives, but without
delving too much into the correlations, the safe upper bound is
independent of nr_counters.
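
For reference, the reason the post-patch bound doesn't scale with the
number of stat items is that every item's updates funnel into one
shared per-cpu counter. A simplified sketch, loosely modeled on
memcg_rstat_updated() in mm/memcontrol.c (names and details may not
match the exact tree this patch is against):

	/*
	 * Sketch only: every stat update, whichever item it touches,
	 * is added to the same per-cpu counter, so the sync-flush
	 * trigger (and hence the error budget) is item-independent.
	 */
	static DEFINE_PER_CPU(unsigned int, stats_updates);
	static atomic_t stats_flush_threshold = ATOMIC_INIT(0);

	static inline void memcg_rstat_updated(struct mem_cgroup *memcg,
					       int val)
	{
		unsigned int x;

		cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());

		x = __this_cpu_add_return(stats_updates, abs(val));
		if (x > MEMCG_CHARGE_BATCH) {
			atomic_add(x / MEMCG_CHARGE_BATCH,
				   &stats_flush_threshold);
			__this_cpu_write(stats_updates, 0);
		}
	}

The sync flush then only does real work once stats_flush_threshold
exceeds num_online_cpus(), which is where the
nr_cpus * MEMCG_CHARGE_BATCH budget comes from.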


> I don't get the reason for breaking 'cr' into individual stat items
> or counters. What is the benefit? We want to keep the error rate
> decoupled from the number of counters (or stat items).

It's just a model; it should capture that a change to any stat item
contributes to the common error budget. (So the per-item error moves
more towards
nr_cpus * MEMCG_CHARGE_BATCH / nr_counters
but here we're asking about processing time rather than error.)
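
(With the illustrative numbers from above, 128 CPUs * 32 is a shared
budget of 4096 updates; an even split over, say, 40 items would be
~100 updates of error per item, but in the worst case one hot item can
consume the whole 4096, which is why the safe per-item bound stays
independent of nr_counters.)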

[...]

> My main reason behind trying NR_MEMCG_EVENTS was to reduce flush_work by
> reducing nr_counters and I don't think nr_counters should have an impact
> on Δt.

The higher the number of items that are changing, the sooner they
accumulate the target error, no?

(Δt is not the periodic flush period; it's the variable time between
two sync flushes.)
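
(As a back-of-the-envelope relation, writing R for the aggregate update
rate summed over all changing items, the budget is exhausted after
roughly

	Δt ≈ nr_cpus * MEMCG_CHARGE_BATCH / R

so the more items are changing, the larger R and the shorter the Δt
between sync flushes.)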

Michal