Re: [RFC] psi: Add additional PSI counters for each type of memory pressure

From: Johannes Weiner
Date: Wed Nov 10 2021 - 11:44:34 EST


On Wed, Nov 10, 2021 at 07:36:37AM -0800, Georgi Djakov wrote:
> From: Carlos Ramirez <carlrami@xxxxxxxxxxxxxx>
>
> Calculates psi totals for memory pressure subevents:
> compaction, thrashing, direct compaction, direct reclaim, and kswapd0.
> Uses upper 16 bits of psi_flags to track memory subevents.

Oof, that's quite heavy, both in terms of branches and in terms of
cache, which, depending on wakeup pattern and CPU topology, can
really hurt those paths.

What's the use case? Do you have automation that needs to act on one
type of stall but not the others, for example?

I find that looking at vmstat events on hosts with elevated pressure
tends to give a pretty good idea of the source. It should also be
possible to whip up a short bpftrace script to track down culprit
callstacks of psi_memstall_*.
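
To illustrate the vmstat approach, here is a minimal userspace sketch
that diffs selected counters between two snapshots of /proc/vmstat-style
text. The helper names vmstat_lookup() and vmstat_delta() are made up
for this example, and the counters worth watching (for instance
pgscan_direct, pgscan_kswapd, compact_stall) depend on the kernel
version, so treat them as assumptions to check against your own hosts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up one counter in a buffer holding /proc/vmstat-style text
 * ("name value\n" per line). Returns 0 and stores the value on
 * success, -1 if the counter is not present.
 * Hypothetical helper, not an existing kernel or libc interface.
 */
int vmstat_lookup(const char *text, const char *name,
		  unsigned long long *val)
{
	const char *p = text;
	size_t len;

	assert(text && name && val);
	len = strlen(name);

	while (*p) {
		/* Match the full name at the start of a line only. */
		if (!strncmp(p, name, len) && p[len] == ' ') {
			*val = strtoull(p + len + 1, NULL, 10);
			return 0;
		}
		/* Skip to the start of the next line, if any. */
		p = strchr(p, '\n');
		if (!p)
			break;
		p++;
	}
	return -1;
}

/*
 * Counter delta between two snapshots.
 * Returns -1 if the counter is missing from either snapshot.
 */
long long vmstat_delta(const char *before, const char *after,
		       const char *name)
{
	unsigned long long a, b;

	if (vmstat_lookup(before, name, &a) ||
	    vmstat_lookup(after, name, &b))
		return -1;
	return (long long)(b - a);
}
```

Reading /proc/vmstat into a buffer before and after a window of
elevated pressure and comparing the deltas shows whether, say, direct
reclaim or compaction activity rose during that window.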

> @@ -1053,19 +1128,56 @@ int psi_show(struct seq_file *m, struct psi_group *group, enum psi_res res)
> + seq_printf(m, "%s avg10=%lu.%02lu avg60=%lu.%02lu avg300=%lu.%02lu total=%llu %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
> full ? "full" : "some",
> LOAD_INT(avg[0]), LOAD_FRAC(avg[0]),
> LOAD_INT(avg[1]), LOAD_FRAC(avg[1]),
> LOAD_INT(avg[2]), LOAD_FRAC(avg[2]),
> - total);
> + total, total_blk_cgroup_throttle, total_bio, total_compaction,
> + total_thrashing, total_cgroup_reclaim_high,
> + total_cgroup_reclaim_high_sleep, total_cgroup_try_charge,
> + total_direct_compaction, total_direct_reclaim, total_read_swappage,
> + total_kswapd);

The file format is a can of worms. I doubt we can change this at this
point without breaking parsers, so those numbers would have to live
somewhere else. But let's figure out the above questions before
worrying about this.
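
To make the parser concern concrete, here is a sketch of a strict
consumer of one pressure file line. It is purely hypothetical
(psi_count_fields() is a name invented for this example, not taken from
any real tool), but it assumes the kind of logic a userspace parser
might reasonably have: every token after "some" or "full" must be
key=value. The bare numbers appended after total= in the hunk above
would make it reject the line.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical strict parser for one pressure file line, e.g.
 *   "some avg10=0.00 avg60=0.00 avg300=0.00 total=0"
 * Returns the number of key=value tokens after the stall type,
 * or -1 on any token it does not understand.
 */
int psi_count_fields(const char *line)
{
	const char *p = line;
	size_t len;
	int n = 0;

	assert(line);

	/* First token must be the stall type. */
	len = strcspn(p, " \n");
	if (len != 4 || (strncmp(p, "some", 4) && strncmp(p, "full", 4)))
		return -1;
	p += len;

	for (;;) {
		/* Skip separators; stop at end of string. */
		p += strspn(p, " \n");
		if (!*p)
			return n;
		len = strcspn(p, " \n");
		/* A bare number without a key is an unknown format. */
		if (!memchr(p, '=', len))
			return -1;
		n++;
		p += len;
	}
}
```

With the current format this returns 4 for a line; with the extra
unlabeled totals it returns -1, which is the sort of breakage that
putting the new numbers somewhere else would avoid.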