Re: [PATCH v2 0/2] sched/core: Fix proxy-exec/core-sched interactions
From: Vasily Gorbik
Date: Fri May 15 2026 - 13:03:57 EST
On Tue, May 12, 2026 at 05:48:19PM -0700, John Stultz wrote:
> On Tue, May 12, 2026 at 2:17 PM John Stultz <jstultz@xxxxxxxxxx> wrote:
> > On Thu, May 7, 2026 at 3:42 AM Vasily Gorbik <gor@xxxxxxxxxxxxx> wrote:
> > > The v1 reported the issue reproduced on s390 LPAR, but it seems to be
> > > easily reproducible with strace test suite "make -j$(nproc) check" on
> > > any system with SMT, CONFIG_SCHED_CORE=y and CONFIG_SCHED_PROXY_EXEC=y
> > > enabled, e.g. on x86 KVM with -smp cpus=16,sockets=1,cores=8,threads=2:
> > >
> > I really appreciate this reproducer detail, but I've so far not been
> > able to trip this issue up (SCHED_CORE=y, SCHED_PROXY_EXEC=y and using
> > the qemu arguments you included above). Could you mail me your .config
> > in case something else is needed?
>
> Ok, I think I was able to force it using my priority-inversion-demo by
> taking the spots in the run.sh script where we kick off the
> rename-test and prefixing it with `coresched new -t pid --`
> https://github.com/johnstultz-work/priority-inversion-demo/blob/main/run.sh#L89
>
> That way the foreground/background tasks run with separate cookies and
> that forces proxying across cookies, and with that I've tripped over
> the issues you highlight.
>
> That said, I'm still curious to learn more about your x86 environment
> and why it tripped so much more easily there, so let me know.
I retried the repro on commit 66182ca873a4 (yesterday's Linus master)
with the same "make -j$(nproc) check".
My earlier claim that this is "easily reproducible" on x86 KVM with
-smp cpus=16,sockets=1,cores=8,threads=2 and just CONFIG_SCHED_CORE=y and
CONFIG_SCHED_PROXY_EXEC=y was an overstatement for x86.
But it triggers with at least 50% probability in KVM on my machine with the
config attached. I don't have any large x86 machine available to me,
so my setup is a laptop with an i7-1360P, Fedora 43 on host and guest,
plus the latest strace git.
Compared with the x86 defconfig + CONFIG_SCHED_CORE=y and
CONFIG_SCHED_PROXY_EXEC=y, my best guess is that CONFIG_PREEMPT=y and
CONFIG_PROVE_LOCKING=y make the issue trigger more often.
With just
CONFIG_SCHED_CORE=y
CONFIG_SCHED_PROXY_EXEC=y
CONFIG_PREEMPT=y
the issue reproduced in only 2 of 10 runs.
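A rate like 2/10 can be counted with a small helper along these lines.
This is only a sketch: count_repros is a hypothetical name, and it assumes
the command passed in exits nonzero whenever the issue was hit (how that is
detected, e.g. by inspecting the guest log, is left to that command).

```sh
#!/bin/sh
# count_repros N CMD...: run CMD N times and print "hits/N".
# Assumption: CMD exits nonzero when the issue was hit.
count_repros() {
    runs=$1
    shift
    hits=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's output; only the exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            hits=$((hits + 1))
        fi
        i=$((i + 1))
    done
    echo "$hits/$runs"
}
```

It would be called as "count_repros 10 ./run-once.sh", where run-once.sh
is a hypothetical wrapper that boots the guest, runs the test suite once,
and reports the outcome through its exit status.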
On s390 with 64 SMT-2 cores I've just triggered the problem 3/3 even with
arch/s390/configs/defconfig, which has:
CONFIG_PREEMPT_LAZY=y
CONFIG_SCHED_CORE=y
CONFIG_SCHED_PROXY_EXEC=y
and no debug options. I wouldn't expect anything particularly special
about s390; it's just the number of cores.
Attachment: config.gz (application/gzip)