Re: [PATCH 0/5] sched: Lazy preemption muck

From: Ankur Arora
Date: Wed Oct 09 2024 - 03:24:28 EST



Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> writes:

> On 2024-10-08 21:40:05 [-0700], Ankur Arora wrote:
>> > While comparing this vs what I have:
>> > - need_resched()
>> > It checked both (tif_need_resched_lazy() || tif_need_resched()) while
>> > now it only looks at tif_need_resched().
>> > Also ensured that raw_irqentry_exit_cond_resched() does not trigger on
>> > lazy.
>> > I guess you can argue both ways what makes sense, just noting…
>>
>> I think we want need_resched() to be only tif_need_resched(). That way
>> preemption in lazy mode *only* happens at the user mode boundary.
>
> There are places such as __clear_extent_bit() or select_collect() where
> need_resched() is checked and if 0 they loop again. For these kinds of
> users it would probably make sense to allow them to preempt themselves.
> We could also add a new function which checks both and audit all users
> and check what would make sense based on $criteria.

Yeah, I remember having the same thought. But the problem is that the
need_resched() checks are all over the kernel. And figuring out good
criteria for each of them seems similar to the placement problem for
cond_resched() -- both being workload dependent.

And, given that the maximum time in the lazy state is limited, it seems
like it'll be simplest to just circumscribe the time spent in the lazy
state by upgrading to TIF_NEED_RESCHED based on some time limit.

That seems to do the job quite well, as Thomas' hog example showed:
https://lore.kernel.org/lkml/87jzshhexi.ffs@tglx/
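
To make the upgrade idea concrete, here is a rough user-space model. It
is only a sketch, not code from the series: the two flag names come from
this thread, while struct task, LAZY_LIMIT_TICKS, resched_lazy(), tick()
and hog_loop() are made up for illustration, and the one-tick limit is
an assumption.

```c
#include <stdbool.h>

/* Toy user-space model, not kernel code. */
#define TIF_NEED_RESCHED       (1u << 0)
#define TIF_NEED_RESCHED_LAZY  (1u << 1)
#define LAZY_LIMIT_TICKS       1u   /* assumed bound on time spent lazy */

struct task {                       /* hypothetical stand-in for thread flags */
	unsigned int flags;
	unsigned int lazy_ticks;    /* ticks seen since the lazy bit was set */
};

/* Only the non-lazy bit counts, so in-kernel loops ignore lazy requests. */
static bool need_resched(const struct task *t)
{
	return (t->flags & TIF_NEED_RESCHED) != 0;
}

/* Request a lazy reschedule, unless some request is already pending. */
static void resched_lazy(struct task *t)
{
	if (t->flags & (TIF_NEED_RESCHED | TIF_NEED_RESCHED_LAZY))
		return;
	t->flags |= TIF_NEED_RESCHED_LAZY;
	t->lazy_ticks = 0;
}

/* Tick: once the task has been lazy for long enough, upgrade the request. */
static void tick(struct task *t)
{
	if (!(t->flags & TIF_NEED_RESCHED_LAZY))
		return;
	if (++t->lazy_ticks >= LAZY_LIMIT_TICKS) {
		t->flags &= ~TIF_NEED_RESCHED_LAZY;
		t->flags |= TIF_NEED_RESCHED;
	}
}

/*
 * Models a loop that polls need_resched() and keeps going while it is 0.
 * A tick fires every iters_per_tick iterations. Returns the number of
 * iterations completed before the loop yielded (or max_iters).
 */
static int hog_loop(struct task *t, int max_iters, int iters_per_tick)
{
	int i;

	for (i = 0; i < max_iters; i++) {
		if (need_resched(t))
			break;
		if ((i + 1) % iters_per_tick == 0)
			tick(t);
	}
	return i;
}
```

In this model a loop that only polls need_resched() keeps running while
just the lazy bit is set, and still yields at most LAZY_LIMIT_TICKS
ticks later, without the call site needing to know about the lazy bit.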

--
ankur