Re: Bad psi_group_cpu.tasks[NR_MEMSTALL] counter

From: Max Kellermann
Date: Mon Aug 12 2024 - 04:07:02 EST


On Tue, Aug 6, 2024 at 5:56 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> Hmm. The original scenario I was thinking about when I proposed this
> WARN_ON() was deemed impossible, so I think the only other possibility
> is that the task being killed somehow skipped psi_memstall_leave()
> before its death... Did you have the instrumentation I suggested to
> track imbalance between psi_memstall_enter()/psi_memstall_leave() and
> to record the _RET_IP_? If so, did it trigger at all?

No, unfortunately I did not have the instrumentation, because I don't
know how that instrumentation works (and didn't have the time to find
out). If you have a patch for me, I can merge it into our kernel fork
so that we have the data the next time it occurs.

Max