[REGRESSION] Re: [PATCH 00/24] Complete EEVDF

From: Doug Smythies
Date: Sun Dec 29 2024 - 17:51:51 EST


Hi Peter,

I have been having trouble with turbostat reporting processor package power levels that cannot possibly be true.
After eliminating the turbostat program itself as the source of the issue I bisected the kernel.
An edited summary (actual log attached):

82e9d0456e06 sched/fair: Avoid re-setting virtual deadline on 'migrations'
b10 bad fc1892becd56 sched/eevdf: Fixup PELT vs DELAYED_DEQUEUE
b13 bad 54a58a787791 sched/fair: Implement DELAY_ZERO
skip 152e11f6df29 sched/fair: Implement delayed dequeue
skip e1459a50ba31 sched: Teach dequeue_task() about special task states
skip a1c446611e31 sched,freezer: Mark TASK_FROZEN special
skip 781773e3b680 sched/fair: Implement ENQUEUE_DELAYED
skip f12e148892ed sched/fair: Prepare pick_next_task() for delayed dequeue
skip 2e0199df252a sched/fair: Prepare exit/cleanup paths for delayed_dequeue
b12 good e28b5f8bda01 sched/fair: Assert {set_next,put_prev}_entity() are properly balanced
dfa0a574cbc4 sched/uclamg: Handle delayed dequeue
b11 good abc158c82ae5 sched: Prepare generic code for delayed dequeue
e8901061ca0c sched: Split DEQUEUE_SLEEP from deactivate_task()

Where "bN" is just my assigned kernel name for each bisection step.

In the linux-kernel email archives I found a thread that isolated these same commits.
It was from late November / early December:

https://lore.kernel.org/all/20240727105030.226163742@xxxxxxxxxxxxx/T/#m9aeb4d897e029cf7546513bb09499c320457c174

An example of the turbostat manifestation of the issue:

doug@s19:~$ sudo ~/kernel/linux/tools/power/x86/turbostat/turbostat --quiet --Summary --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,TSC_MHz --interval 1
[sudo] password for doug:
Busy%   Bzy_MHz  TSC_MHz  IRQ     PkgTmp  PkgWatt
99.76   4800     4104     12304   73      80.08
99.76   4800     4104     12047   73      80.23
99.76   4800     879      12157   73      11.40
99.76   4800     26667    84214   72      557.23
99.76   4800     4104     12036   72      79.39

Where TSC_MHz was reported as 879, there was a big gap in time:
about 4.7 seconds instead of 1.
Where TSC_MHz was reported as 26667, there was not a big gap in time.
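
The 4.7 second figure is consistent with simple arithmetic on the TSC_MHz column. The following is only a sketch under two assumptions that are not confirmed here: that the TSC really ticks at a constant 4104 MHz (the value in the normal samples), and that the reported TSC_MHz is the TSC delta divided by the measured interval. The function name is made up for illustration.

```python
# Sketch only: assumes a constant-rate TSC at the nominal value seen in the
# normal samples, and TSC_MHz = (TSC delta) / (measured interval).
NOMINAL_TSC_MHZ = 4104.0  # taken from the "good" rows of the output above

def interval_ratio(reported_tsc_mhz, nominal_tsc_mhz=NOMINAL_TSC_MHZ):
    """Ratio of the measured interval to the time the TSC delta covers.

    With a constant-rate TSC, a reported value below nominal means the
    measured interval was longer than the span of the TSC delta, and a
    value above nominal means it was shorter.
    """
    return nominal_tsc_mhz / reported_tsc_mhz

for reported in (4104, 879, 26667):
    print(f"TSC_MHz {reported:>5}: ratio {interval_ratio(reported):.2f}")
```

Under those assumptions, the 879 sample gives a ratio of about 4.67, and the 26667 sample gives about 0.15.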

It happens for about 5% of the samples, give or take a lot.
It only happens when the workload is almost exactly 100%.
More load, it doesn't occur.
Less load, it doesn't occur. Although, I did get this once:

Busy%   Bzy_MHz  TSC_MHz  IRQ     PkgTmp  PkgWatt
91.46   4800     4104     11348   73      103.98
91.46   4800     4104     11353   73      103.89
91.50   4800     3903     11339   73      98.16
91.43   4800     4271     12001   73      108.52
91.45   4800     4148     11481   73      105.13
91.46   4800     4104     11341   73      103.96
91.46   4800     4104     11348   73      103.99

So, it might just be much less probable and less severe.

It happens over many different types of workload that I have tried.

Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
6 cores, 2 threads per core, 12 CPUs.
OS: Ubuntu 24.04.1 LTS (server, no GUI)

... Doug

doug@s19:~/kernel/linux$ git bisect bad
There are only 'skip'ped commits left to test.
The first bad commit could be any of:
781773e3b68031bd001c0c18aa72e8470c225ebd
a1c446611e31ca5363d4db51e398271da1dce0af
e1459a50ba31831efdfc35278023d959e4ba775b
f12e148892ede8d9ee82bcd3e469e6d01fc077ac
152e11f6df293e816a6a37c69757033cdc72667d
2e0199df252a536a03f4cb0810324dff523d1e79
54a58a78779169f9c92a51facf6de7ce94962328
We cannot bisect more!

doug@s19:~/kernel/linux$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# good: [98f7e32f20d28ec452afb208f9cffc08448a2652] Linux 6.11
git bisect good 98f7e32f20d28ec452afb208f9cffc08448a2652
# status: waiting for bad commit, 1 good commit known
# bad: [9852d85ec9d492ebef56dc5f229416c925758edc] Linux 6.12-rc1
git bisect bad 9852d85ec9d492ebef56dc5f229416c925758edc
# good: [176000734ee2978121fde22a954eb1eabb204329] Merge tag 'ata-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
git bisect good 176000734ee2978121fde22a954eb1eabb204329
# bad: [d0359e4ca0f26aaf3118124dfb562e3b3dca1c06] Merge tag 'fs_for_v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
git bisect bad d0359e4ca0f26aaf3118124dfb562e3b3dca1c06
# bad: [171754c3808214d4fd8843eab584599a429deb52] Merge tag 'vfs-6.12.blocksize' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
git bisect bad 171754c3808214d4fd8843eab584599a429deb52
# good: [e55ef65510a401862b902dc979441ea10ae25c61] Merge tag 'amd-drm-next-6.12-2024-08-26' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
git bisect good e55ef65510a401862b902dc979441ea10ae25c61
# good: [32bd3eb5fbab954e68adba8c0b6a43cf03605c93] Merge tag 'drm-intel-gt-next-2024-09-06' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
git bisect good 32bd3eb5fbab954e68adba8c0b6a43cf03605c93
# good: [726e2d0cf2bbc14e3bf38491cddda1a56fe18663] Merge tag 'dma-mapping-6.12-2024-09-19' of git://git.infradead.org/users/hch/dma-mapping
git bisect good 726e2d0cf2bbc14e3bf38491cddda1a56fe18663
# good: [839c4f596f898edc424070dc8b517381572f8502] Merge tag 'mm-hotfixes-stable-2024-09-19-00-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect good 839c4f596f898edc424070dc8b517381572f8502
# bad: [bd9bbc96e8356886971317f57994247ca491dbf1] sched: Rework dl_server
git bisect bad bd9bbc96e8356886971317f57994247ca491dbf1
# good: [863ccdbb918a77e3f011571f943020bf7f0b114b] sched: Allow sched_class::dequeue_task() to fail
git bisect good 863ccdbb918a77e3f011571f943020bf7f0b114b
# bad: [fc1892becd5672f52329a75c73117b60ac7841b7] sched/eevdf: Fixup PELT vs DELAYED_DEQUEUE
git bisect bad fc1892becd5672f52329a75c73117b60ac7841b7
# skip: [2e0199df252a536a03f4cb0810324dff523d1e79] sched/fair: Prepare exit/cleanup paths for delayed_dequeue
git bisect skip 2e0199df252a536a03f4cb0810324dff523d1e79
# skip: [f12e148892ede8d9ee82bcd3e469e6d01fc077ac] sched/fair: Prepare pick_next_task() for delayed dequeue
git bisect skip f12e148892ede8d9ee82bcd3e469e6d01fc077ac
# skip: [e1459a50ba31831efdfc35278023d959e4ba775b] sched: Teach dequeue_task() about special task states
git bisect skip e1459a50ba31831efdfc35278023d959e4ba775b
# skip: [781773e3b68031bd001c0c18aa72e8470c225ebd] sched/fair: Implement ENQUEUE_DELAYED
git bisect skip 781773e3b68031bd001c0c18aa72e8470c225ebd
# good: [abc158c82ae555078aa5dd2d8407c3df0f868904] sched: Prepare generic code for delayed dequeue
git bisect good abc158c82ae555078aa5dd2d8407c3df0f868904
# skip: [a1c446611e31ca5363d4db51e398271da1dce0af] sched,freezer: Mark TASK_FROZEN special
git bisect skip a1c446611e31ca5363d4db51e398271da1dce0af
# good: [e28b5f8bda01720b5ce8456b48cf4b963f9a80a1] sched/fair: Assert {set_next,put_prev}_entity() are properly balanced
git bisect good e28b5f8bda01720b5ce8456b48cf4b963f9a80a1
# skip: [152e11f6df293e816a6a37c69757033cdc72667d] sched/fair: Implement delayed dequeue
git bisect skip 152e11f6df293e816a6a37c69757033cdc72667d
# bad: [54a58a78779169f9c92a51facf6de7ce94962328] sched/fair: Implement DELAY_ZERO
git bisect bad 54a58a78779169f9c92a51facf6de7ce94962328
# only skipped commits left to test
# possible first bad commit: [54a58a78779169f9c92a51facf6de7ce94962328] sched/fair: Implement DELAY_ZERO
# possible first bad commit: [152e11f6df293e816a6a37c69757033cdc72667d] sched/fair: Implement delayed dequeue
# possible first bad commit: [e1459a50ba31831efdfc35278023d959e4ba775b] sched: Teach dequeue_task() about special task states
# possible first bad commit: [a1c446611e31ca5363d4db51e398271da1dce0af] sched,freezer: Mark TASK_FROZEN special
# possible first bad commit: [781773e3b68031bd001c0c18aa72e8470c225ebd] sched/fair: Implement ENQUEUE_DELAYED
# possible first bad commit: [f12e148892ede8d9ee82bcd3e469e6d01fc077ac] sched/fair: Prepare pick_next_task() for delayed dequeue
# possible first bad commit: [2e0199df252a536a03f4cb0810324dff523d1e79] sched/fair: Prepare exit/cleanup paths for delayed_dequeue
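
The good/bad/skip verdicts in the edited summary near the top of this message can be pulled out of a log like the one above. This is a minimal sketch, assuming the "# verdict: [hash] subject" comment format shown in this log; the function name and the three-entry sample are illustrative only.

```python
import re

# Matches the comment lines seen in the bisect log above, e.g.
# "# good: [abc158c8...] sched: Prepare generic code for delayed dequeue"
LINE_RE = re.compile(r"^# (good|bad|skip): \[([0-9a-f]+)\] (.*)$")

def summarize_bisect_log(text):
    """Return (verdict, 12-char hash, subject) per tested commit, in log order."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            verdict, sha, subject = m.groups()
            rows.append((verdict, sha[:12], subject))
    return rows

# Three entries copied from the log above.
sample = """\
# good: [abc158c82ae555078aa5dd2d8407c3df0f868904] sched: Prepare generic code for delayed dequeue
git bisect good abc158c82ae555078aa5dd2d8407c3df0f868904
# skip: [152e11f6df293e816a6a37c69757033cdc72667d] sched/fair: Implement delayed dequeue
git bisect skip 152e11f6df293e816a6a37c69757033cdc72667d
# bad: [54a58a78779169f9c92a51facf6de7ce94962328] sched/fair: Implement DELAY_ZERO
git bisect bad 54a58a78779169f9c92a51facf6de7ce94962328
"""

for verdict, sha, subject in summarize_bisect_log(sample):
    print(f"{verdict:<4} {sha} {subject}")
```

The "git bisect ..." command lines and the "# status" and "# possible first bad commit" comments do not match the pattern, so only the per-commit verdicts are listed.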

doug@s19:~/kernel/linux$ git log --oneline | grep -B 2 -A 10 54a58a78779
82e9d0456e06 sched/fair: Avoid re-setting virtual deadline on 'migrations'
b10 bad fc1892becd56 sched/eevdf: Fixup PELT vs DELAYED_DEQUEUE
b13 bad 54a58a787791 sched/fair: Implement DELAY_ZERO
skip 152e11f6df29 sched/fair: Implement delayed dequeue
skip e1459a50ba31 sched: Teach dequeue_task() about special task states
skip a1c446611e31 sched,freezer: Mark TASK_FROZEN special
skip 781773e3b680 sched/fair: Implement ENQUEUE_DELAYED
skip f12e148892ed sched/fair: Prepare pick_next_task() for delayed dequeue
skip 2e0199df252a sched/fair: Prepare exit/cleanup paths for delayed_dequeue
b12 good e28b5f8bda01 sched/fair: Assert {set_next,put_prev}_entity() are properly balanced
dfa0a574cbc4 sched/uclamg: Handle delayed dequeue
b11 good abc158c82ae5 sched: Prepare generic code for delayed dequeue
e8901061ca0c sched: Split DEQUEUE_SLEEP from deactivate_task()