RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF

From: Doug Smythies
Date: Wed Jan 15 2025 - 11:47:18 EST

In reply to: Len Brown: "Re: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF"
Next in thread: Peter Zijlstra: "Re: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF"

Hi Len,

Thank you for chiming in on this thread.

On 2025.01.14 18:09 Len Brown wrote:
> Doug,
> Your attention to detail and persistence has once again found a tricky
> underlying bug -- kudos!
>
> Re: turbostat behaviour
>
> Yes, TSC_MHz -- "the measured rate of the TSC during an interval", is
> printed as a sanity check. If there are any irregularities in it, as
> you noticed, then something very strange in the hardware or software
> is going wrong (and the actual turbostat results will likely not be
> reliable).

While I use turbostat almost every day, I am embarrassed to admit
that until this investigation I did not know about the ability to
add the "Time_Of_Day_Seconds" and "usec" columns.
They have been incredibly useful.
Early on, and until I discovered those two "show" options, I was
using the sanity check calculation of TSC_MHz to reveal the anomaly.
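
As a rough illustration of that sanity check (this is not turbostat's
actual code; the function names and the 1% tolerance are assumptions made
up for this sketch), the measured TSC rate is just the TSC ticks elapsed
divided by the microseconds elapsed over the same interval:

```python
def tsc_mhz(tsc_start, tsc_end, usec_start, usec_end):
    """Measured TSC rate in MHz: TSC ticks elapsed per microsecond elapsed."""
    elapsed_usec = usec_end - usec_start
    if elapsed_usec <= 0:
        raise ValueError("interval must be positive")
    return (tsc_end - tsc_start) / elapsed_usec


def looks_anomalous(measured_mhz, nominal_mhz, tolerance=0.01):
    """Flag a sample whose measured rate strays from the nominal TSC rate.

    The 1% default tolerance is an arbitrary assumption, not a turbostat value.
    """
    return abs(measured_mhz - nominal_mhz) > tolerance * nominal_mhz
```

If the TSC read and the timestamp read for a sample end up separated by
an unexpected delay, the ratio moves away from the nominal rate, which is
the kind of irregularity the check would expose.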

> Yes, the "usec" column measures how long it takes to migrate to a CPU
> and collect stats there. So if you are hunting down a glitch in
> migration all you need is this column to see it. "usec" on the
> summary row is the difference between the 1st migration and after the
> last -- excluding the sysfs/procfs time that is consumed on the last
> CPU. So migration delays will also be reflected there.

On a per CPU basis, it excludes the actual CPU migration step.
Peter and I modified turbostat so that the per-CPU "usec" column
covers only the CPU migration time. [1]
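
To show the idea of timing only the migration step, here is a minimal
Linux-only sketch. It is not the turbostat change referenced in [1]; the
helper name time_migrations is hypothetical, and it assumes that the
affinity call normally returns with the task already running on the
target CPU:

```python
import os
import time


def time_migrations(cpus=None):
    """Return {cpu: usec spent in the affinity change that moves us there}."""
    original = os.sched_getaffinity(0)  # remember the mask so it can be restored
    cpus = sorted(original) if cpus is None else list(cpus)
    timings = {}
    try:
        for cpu in cpus:
            start = time.monotonic_ns()
            # Restrict the calling thread to a single CPU; only this call is
            # timed, not any stats collection done afterwards.
            os.sched_setaffinity(0, {cpu})
            timings[cpu] = (time.monotonic_ns() - start) // 1000
    finally:
        os.sched_setaffinity(0, original)
    return timings
```

Any counter reads would go after the timed call, so a long value here
points at the migration itself and not at the collection work.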

> Note: we have a patch queued which changes the "usec" on the Summary
> row to *include* the sysfs/procfs time on the last CPU.

I did not realise that it was just for the last "sysfs/procfs" time.
I'll take a closer look; I wonder if that can explain why I have been
unable to catch the lingering >= 10 ms delays.

> (The per-cpu
> "usec" values are unchanged.) This is because we've noticed some
> really weird delays in doing things like reading /proc/interrupts and
> we want to be able to easily do A/B comparisons by simply including or
> excluding counters.

Yes, I saw the patch email on the linux-pm email list and have included it
in my local turbostat for about a week now.

> Also FYI, The scheme of migrating to each CPU so that collecting stats
> there will be "local" isn't scaling so well on very large systems, and
> I'm about to take a close look at it. In yogini we used a different
> scheme, where a thread is bound to each CPU, so they can collect in
> parallel; and we may be moving to something like that.
>
> cheers,
> Len Brown, Intel Open Source Technology Center

[1] https://lore.kernel.org/lkml/001b01db608a$56d3dc40$047b94c0$@telus.net/

... Doug
