Re: [PATCH] Fix /proc/stat freezes (was [PATCH v15] "task_isolation" mode)

From: Chris Metcalf
Date: Sat Aug 20 2016 - 04:18:40 EST


On 8/17/2016 3:37 PM, Christoph Lameter wrote:
> On Tue, 16 Aug 2016, Chris Metcalf wrote:
>
> > - Dropped Christoph Lameter's patch to avoid scheduling the
> >   clocksource watchdog on nohz cores; the recommendation is to just
> >   boot with tsc=reliable for NOHZ in any case, if necessary.
>
> We also said that there should be a WARN_ON if tsc=reliable is not
> specified and processors are put into NOHZ mode. This is something not
> obvious causing scheduling events on NOHZ processors.

Yes, I agree. Frederic said he would queue a patch to do that, so I
didn't want to propose another patch that would conflict.

Frederic, do you have a sense of what is left to be done there?
I can certainly try to contribute to that effort as well.

> Here is a potential fix to the problem that /proc/stat values freeze when
> processors go into NOHZ busy mode. I'd like to hear what people think
> about the approach here. In particular one issue may be that I am
> accessing remote tick-sched structures without serialization. But for
> top/ps this may be ok. I noticed that other values shown by top/ps also
> are sometimes a bit fuzzy.

This seems pretty plausible to me, but I'm not an expert on what kind
of locking might be required for these data structures.

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com