Re: [RFC PATCH 0/4] Scheduler time slice extension
From: Prakash Sangappa
Date: Wed Nov 13 2024 - 15:11:16 EST
> On Nov 13, 2024, at 11:36 AM, Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>
> On 2024-11-13 13:50, Peter Zijlstra wrote:
>> On Wed, Nov 13, 2024 at 12:01:22AM +0000, Prakash Sangappa wrote:
>>> This patch set implements the above mentioned 50us extension time as posted
>>> by Peter. But instead of using restartable sequences as API to set the flag
>>> to request the extension, this patch proposes a new API with use of a per
>>> thread shared structure implementation described below. This shared structure
>>> is accessible in both userspace and kernel. The user thread will set the
>>> flag in this shared structure to request execution time extension.
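For illustration, a minimal userspace sketch of the request side described above. The structure layout, the field name sched_delay and the helper names are hypothetical and are not taken from the patch set; a plain static variable stands in for the per-thread area that the kernel would map.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout, NOT the structure from the patch set. */
struct task_shared {
    atomic_uint sched_delay;    /* non-zero: thread asks for a time slice extension */
};

/* Set the flag before entering a short critical section. */
static void request_extension(struct task_shared *ts)
{
    atomic_store_explicit(&ts->sched_delay, 1, memory_order_relaxed);
}

/* Clear the flag once the critical section is done. */
static void cancel_extension(struct task_shared *ts)
{
    atomic_store_explicit(&ts->sched_delay, 0, memory_order_relaxed);
}

/* What the kernel side would conceptually test before preempting. */
static bool extension_requested(struct task_shared *ts)
{
    return atomic_load_explicit(&ts->sched_delay, memory_order_relaxed) != 0;
}
```

A thread would call request_extension() right after taking a lock and cancel_extension() right after dropping it, so the flag is only set for the duration of the critical section.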
>> But why -- we already have rseq, glibc uses it by default. Why add yet
>> another thing?
>
> Indeed, what I'm not seeing in this RFC patch series cover letter is an
> explanation that justifies adding yet another per-thread memory area
> shared between kernel and userspace when we have extensible rseq
> already.
It mainly provides pinned memory, which can be useful for future use cases where updating user memory from kernel context needs to be fast or needs to avoid page faults.
>
> Peter, was there anything fundamentally wrong with your approach based
> on rseq ? https://lore.kernel.org/lkml/20231030132949.GA38123@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> The main thing I wonder is whether loading the rseq delay resched flag
> on return to userspace is too late in your patch. Also, I'm not sure it is
> realistic to require that no system calls should be done within time extension
> slice. If we have this scenario:
I am also not sure whether we need to prevent system calls in this scenario.
Was that restriction there mainly because the restartable sequences API imposes it?
-Prakash
>
> A) userspace grabs lock
> - set rseq delay resched flag
> B) syscall
> - reschedule
> [...]
> - return to userspace, load rseq delay-resched flag from userspace (too late)
>
> I would have thought loading the delay resched flag should be attempted much
> earlier in the scheduler code. Perhaps we could do this from a page fault
> disable critical section, and accept that this hint may be a no-op if the
> rseq page happens to be swapped out (which is really unlikely). This is
> similar to the "on_cpu" sched state rseq extension RFC I posted a while back,
> which needed to be accessed from the scheduler:
>
> https://lore.kernel.org/lkml/20230517152654.7193-1-mathieu.desnoyers@xxxxxxxxxxxx/
> https://lore.kernel.org/lkml/20230529191416.53955-1-mathieu.desnoyers@xxxxxxxxxxxx/
>
> And we'd leave the delay-resched load in place on return to userspace, so
> in the unlikely scenario where it is swapped out, at least it gets paged
> back at that point.
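To check my understanding of the proposal above, here is a small userspace model of it, a sketch only and not kernel code. The struct user_page type and all function names are hypothetical; the "resident" field just models whether the rseq page is swapped in. The scheduler-side read fails instead of faulting, so the hint becomes a no-op when the page is not resident, and the read on return to userspace brings the page back.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the user page that holds the delay-resched flag. */
struct user_page {
    bool resident;              /* false models "swapped out" */
    uint32_t delay_resched;     /* flag set by userspace */
};

/* Models a read with page faults disabled: it fails rather than faults. */
static bool read_flag_nofault(const struct user_page *pg, uint32_t *val)
{
    if (!pg->resident)
        return false;
    *val = pg->delay_resched;
    return true;
}

/* Models a normal user access: may fault, which pages the memory back in. */
static uint32_t read_flag_faulting(struct user_page *pg)
{
    pg->resident = true;
    return pg->delay_resched;
}

/* Scheduler side: honour the hint only if the flag is readable and set. */
static bool sched_should_delay(const struct user_page *pg)
{
    uint32_t val;

    return read_flag_nofault(pg, &val) && val != 0;
}

/* Return-to-user side: the load stays in place so the page gets paged back. */
static void exit_to_user(struct user_page *pg)
{
    (void)read_flag_faulting(pg);
}
```

With the page swapped out, the first sched_should_delay() call returns false even though the flag is set; after one pass through exit_to_user() the next call sees the flag.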
>
> Feel free to let me know if I'm missing an important point and/or saying
> nonsense here.
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> https://www.efficios.com
>