Re: [PATCH 4/5] sched_ext: Split the global DSQ per NUMA node

From: David Vernet
Date: Thu Sep 26 2024 - 17:56:56 EST

Next message: Miguel Ojeda: "Re: [PATCH v2 1/2] rust: add untrusted data abstraction"
Previous message: Miguel Ojeda: "Re: [PATCH v2 1/2] rust: add untrusted data abstraction"
In reply to: Tejun Heo: "[PATCH 4/5] sched_ext: Split the global DSQ per NUMA node"
Next in thread: Tejun Heo: "[PATCH 5/5] sched_ext: Use shorter slice while bypassing"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Sep 24, 2024 at 02:06:06PM -1000, Tejun Heo wrote:
> In the bypass mode, the global DSQ is used to schedule all tasks in simple
> FIFO order. All tasks are queued into the global DSQ and all CPUs try to
> execute tasks from it. This creates a lot of cross-node cacheline accesses
> and scheduling across the node boundaries, and can lead to livelock
> conditions where the system takes tens of minutes to disable the BPF
> scheduler while executing in the bypass mode.
>
> Split the global DSQ per NUMA node. Each node has its own global DSQ. When a
> task is dispatched to SCX_DSQ_GLOBAL, it's put into the global DSQ local to
> the task's CPU and all CPUs in a node only consume its node-local global
> DSQ.
>
> This resolves a livelock condition which could be reliably triggered on a
> 2x EPYC 7642 system by running `stress-ng --race-sched 1024` together with
> `stress-ng --workload 80 --workload-threads 10` while repeatedly enabling
> and disabling an SCX scheduler.
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>

Acked-by: David Vernet <void@xxxxxxxxxxxxx>
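
For anyone following the thread, the per-node split described in the commit
message can be modelled in userspace roughly as below. This is only a sketch
under stated assumptions, not the kernel implementation: the names (struct
dsq, dispatch_global, consume_global, cpu_to_node), the fixed 2-node/8-CPU
topology, and the array-backed FIFO are all hypothetical and chosen purely
for illustration.

```c
#include <assert.h>

#define NR_NODES  2   /* assumed topology: two NUMA nodes */
#define NR_CPUS   8   /* assumed: CPUs 0-3 on node 0, CPUs 4-7 on node 1 */
#define QUEUE_CAP 16  /* illustrative fixed capacity per queue */

/* Hypothetical FIFO standing in for one node's global DSQ. */
struct dsq {
	int tasks[QUEUE_CAP];
	int head;   /* index of the next task to pop */
	int tail;   /* index of the next free slot */
	int count;  /* number of queued tasks */
};

/* One global DSQ per node instead of a single system-wide queue. */
static struct dsq global_dsqs[NR_NODES];

/* Map a CPU to its node under the assumed contiguous layout. */
static int cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu / (NR_CPUS / NR_NODES);
}

/* Append a task in FIFO order; returns 0 on success, -1 if full. */
static int dsq_push(struct dsq *q, int task)
{
	if (q->count == QUEUE_CAP)
		return -1;
	q->tasks[q->tail] = task;
	q->tail = (q->tail + 1) % QUEUE_CAP;
	q->count++;
	return 0;
}

/* Remove the oldest task; returns the task id, or -1 if empty. */
static int dsq_pop(struct dsq *q)
{
	int task;

	if (q->count == 0)
		return -1;
	task = q->tasks[q->head];
	q->head = (q->head + 1) % QUEUE_CAP;
	q->count--;
	return task;
}

/* A "global" dispatch lands in the DSQ of the node owning the task's CPU. */
static int dispatch_global(int task, int task_cpu)
{
	return dsq_push(&global_dsqs[cpu_to_node(task_cpu)], task);
}

/* A CPU only consumes from its own node's DSQ, never a remote one. */
static int consume_global(int cpu)
{
	return dsq_pop(&global_dsqs[cpu_to_node(cpu)]);
}
```

In this model, a task dispatched from CPU 5 is only ever picked up by CPUs
4-7, so the queue's cachelines stay within one node. The trade-off is that a
node with an empty queue idles rather than pulling work from its neighbour.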

