Re: [patch V3 8/8] mm: vmstat_refresh: avoid queueing work item if cpu stats are clean

From: Marcelo Tosatti
Date: Wed Sep 01 2021 - 13:38:21 EST


On Wed, Sep 01, 2021 at 09:05:55AM -0400, Nitesh Lal wrote:
> Hi Marcelo,
>
> On Tue, Aug 24, 2021 at 11:42 AM Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> >
> > It is not necessary to queue a work item to run refresh_vm_stats
> > on a remote CPU if that CPU has no dirty stats and no per-CPU
> > allocations for remote nodes.
> >
> > This fixes a sosreport hang (sosreport uses vmstat_refresh) with
> > a spinning SCHED_FIFO process.
> >
>
> I was still able to reproduce the sosreport hang with this patchset and I
> am wondering if that is because right now we do vmstat_sync and then cancel
> any pending jobs on a CPU in the context of one task.

Hi Nitesh,

Did you use chisol (with proper flags) and the modified oslat?

Tested with "echo 1 > /proc/sys/vm/stat_refresh" and it was successful
(no hangs).

> However, while this task is running another process can come in and can
> dirty the stats resulting in vmstat job getting placed on CPUs running
> SCHED_FIFO tasks.
> Am I missing something?
> What we can probably do is to communicate that a CPU is running in task
> isolation mode to any other process that is trying to run and schedule
> jobs there.

No, you are not missing anything; that can happen. We can use sched
notifiers to handle this problem.
Good point.