Re: Bad psi_group_cpu.tasks[NR_MEMSTALL] counter

From: Max Kellermann
Date: Wed Jun 12 2024 - 06:21:02 EST


On Wed, Jun 12, 2024 at 11:49 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> The erofs one is also not entirely obvious, but irrelevant if you're not using
> it... the below should make it a little more obvious, but what do I know.

We do use erofs a lot, and I read that very function the other day -
it is weird code with two loop levels plus continue and even goto; but
I thought it was okay. psi_memstall_enter() is only ever called if
bio!=NULL, and the function takes care to call psi_memstall_leave()
when NULLing bio. Therefore I think your patch is not necessary (and
it adds a tiny bit of overhead). What am I missing?
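
To spell out the invariant I mean, here is a minimal userspace sketch.
It is not the erofs function, only a model of the pairing described
above: the stall is entered only once bio is non-NULL, and left
whenever bio is set back to NULL. All names (fake_bio, memstall_enter,
memstall_leave, simulate, needs_io, flush) are hypothetical stand-ins,
not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_bio { int pages; };   /* hypothetical stand-in for struct bio */

/* Hypothetical stand-ins for psi_memstall_enter()/psi_memstall_leave(). */
static void memstall_enter(int *depth) { (*depth)++; }
static void memstall_leave(int *depth) { (*depth)--; }

/*
 * needs_io[i]: request i adds a page to the current bio.
 * flush[i]:    a pending bio (if any) is submitted before request i.
 * Returns the stall depth at the end; 0 means enter/leave were balanced.
 */
int simulate(const bool *needs_io, const bool *flush, size_t n)
{
	struct fake_bio storage, *bio = NULL;
	int depth = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Invariant: we are in a stall exactly when bio is non-NULL. */
		assert(depth == (bio != NULL));

		if (bio && flush[i]) {
			/* "Submit": leave the stall before NULLing bio. */
			memstall_leave(&depth);
			bio = NULL;
		}
		if (!needs_io[i])
			continue;
		if (!bio) {
			/* "Allocate": enter the stall only once bio exists. */
			bio = &storage;
			bio->pages = 0;
			memstall_enter(&depth);
		}
		bio->pages++;
	}
	if (bio) {
		/* Final submit: same leave-then-NULL pairing. */
		memstall_leave(&depth);
		bio = NULL;
	}
	return depth;
}
```

If every path that NULLs bio goes through the leave call, as in this
model, simulate() returns 0 for any input, so the counter cannot leak
from this pattern alone.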

> Best case would be if you could somehow find a reproducer, but
> I realize this might be tricky.

Oh, I wish. I tried for several days, adding artificial delays
everywhere, in order to make some race more likely; I created and
deleted millions of cgroups and killed just as many processes under
(artificial) memory pressure, but nothing.

Max