Re: [PATCH 5/6] sched/proxy: Remove PROXY_WAKING
From: K Prateek Nayak
Date: Tue Jun 02 2026 - 01:22:34 EST
Hello John,

On 6/2/2026 2:02 AM, John Stultz wrote:
> On Mon, Jun 1, 2026 at 3:54 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> On Tue, May 26, 2026 at 01:16:14PM +0200, Peter Zijlstra wrote:
>>> From: K Prateek Nayak <kprateek.nayak@xxxxxxx>
>>>
>>> Now that the proxy path uses ->is_blocked, use the '->is_blocked &&
>>> !->blocked_on' state instead of PROXY_WAKING. Notably, this is where a
>>> blocked_on relation is broken but the donor task might still need a return
>>> migration.
>>>
>>> (Not-yet-)Signed-off-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>>
>> Prateek, can I make that a normal SoB from you? I'm thinking I should
>> merge sched/proxy into sched/core so we can get on with other stuff.
>
> Just as a heads up, so in stress testing[1] over the weekend with your
> sched/proxy series, I hit the below null ptr traversal that seems to
> be another pick_eevdf() returning null issue.
>
> I'm not sure if this is proxy related or not yet, so I'll be working
> to reproduce (took ~31 hours to trip this one) and narrow it down.
> But I'm wondering, given this pick_eevdf() returning null symptom has
> been a regular issue for various bugs over time, do we need some
> better debug checks to try to better narrow these down?

I think the PARANOID_AVG sched feat gives some indication that things
have gone sideways without crashing, but there isn't an easy way to get
the cfs_rq state that led to the crash without a crash kernel.

>
> This was using your tree at 4d92e41a046d, plus one workaround for
> binutils on my system:
> https://lore.kernel.org/lkml/7b45d196-063e-4e76-b08b-ec2bcc111328@xxxxxxxxxxxxx/

Could you also try merging tip:sched/urgent into this branch and
rerunning?

Commit b6eee96843e8 ("sched/fair: Fix overflow in
vruntime_eligible()") in v7.1-rc3 moved the eligibility check to a
128-bit data type, which should catch cases where an overflow in the
multiplication causes all entities to appear ineligible.
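
To illustrate the failure mode, here is a minimal userspace sketch. It is
not the kernel code from that commit. It assumes the eligibility check has
the shape "avg >= key * load", and the names eligible_64(), eligible_128(),
avg, key and load are hypothetical. __int128 is a GCC/Clang extension.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative userspace sketch, NOT the kernel implementation.
 * key  : entity vruntime relative to a reference point (hypothetical name)
 * load : summed weight of the queued entities (hypothetical name)
 * avg  : weighted sum of the keys (hypothetical name)
 * An entity is treated as eligible when avg >= key * load.
 */

/* 64-bit product: wraps modulo 2^64 when key * load is too large. */
bool eligible_64(int64_t avg, int64_t key, int64_t load)
{
	/* Multiply as unsigned so the wrap itself is well defined in C. */
	int64_t prod = (int64_t)((uint64_t)key * (uint64_t)load);

	return avg >= prod;
}

/* 128-bit product: cannot overflow for any pair of 64-bit inputs. */
bool eligible_128(int64_t avg, int64_t key, int64_t load)
{
	return (__int128)avg >= (__int128)key * load;
}
```

With avg = 0, key = -(1 << 40) and load = 3 << 22, the true product is
negative, so the entity should be eligible. The 64-bit product wraps to a
large positive value and eligible_64() reports it as ineligible, while
eligible_128() does not.
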
--
Thanks and Regards,
Prateek