On Tue, 2025-03-11 at 07:28 +0100, Gabriele Monaco wrote:
This patchset moves the task_mm_cid_work to a preemptible and
migratable context. This reduces the impact of this work on the
scheduling latency of real-time tasks.
The change makes the recurrence of the task a bit more predictable.
The series was reviewed and, in my opinion, is ready for inclusion.
Peter, Ingo, can we merge it?
Thanks,
Gabriele
The behaviour causing latency was introduced in commit 223baf9d17f2
("sched: Fix performance regression introduced by mm_cid"), which
added a task work tied to the scheduler tick.
That approach presents two possible issues:
* the task work runs before returning to user space and, in effect,
  adds scheduling latency (of an order of magnitude that is
  significant on PREEMPT_RT)
* periodic tasks with short runtime are less likely to run during the
  tick, hence they might not run the task work at all
Patch 1 adds support for prev_sum_exec_runtime to the RT, deadline and
sched_ext classes, as is already done for fair. This is required to
avoid calling rseq_preempt on tick if the runtime is below a threshold.
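As a rough userspace illustration of the check patch 1 enables (not the
kernel code from the series): the runtime since the task was last
scheduled in is the difference between sum_exec_runtime and
prev_sum_exec_runtime, and the tick only acts when that difference
exceeds a threshold. The struct, the helper names and the constant below
are hypothetical; the 100ms value is taken from the V8 changelog entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold: "more than 100ms", expressed in nanoseconds. */
#define RUNTIME_THRESHOLD_NS (100ULL * 1000 * 1000)

/* Hypothetical stand-in for the two per-entity runtime counters. */
struct entity_runtime {
	uint64_t sum_exec_runtime;      /* total CPU time consumed, in ns */
	uint64_t prev_sum_exec_runtime; /* snapshot taken at schedule-in */
};

/* Model of what a scheduling class does when the task starts running. */
static void mark_scheduled_in(struct entity_runtime *se)
{
	se->prev_sum_exec_runtime = se->sum_exec_runtime;
}

/* True if the task has been running for longer than the threshold
 * since it was last scheduled in. */
static bool ran_longer_than_threshold(const struct entity_runtime *se)
{
	/* The total can only grow after the snapshot was taken. */
	assert(se->sum_exec_runtime >= se->prev_sum_exec_runtime);

	uint64_t rtime = se->sum_exec_runtime - se->prev_sum_exec_runtime;

	return rtime > RUNTIME_THRESHOLD_NS;
}
```

In this model a short periodic task never crosses the threshold, so the
tick leaves it alone, while a long-running task does once it has
accumulated more than 100ms since its last schedule-in.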
Patch 2 contains the main changes, removing the task_work on the
scheduler tick and using a work_struct scheduled more reliably during
__rseq_handle_notify_resume.
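The flow in patch 2 can be pictured with a small userspace model (a
sketch under stated assumptions, not the kernel implementation): the
return-to-user path only checks whether a scan is due and queues it at
most once, while the scan itself runs later in a separate context and
re-arms the deadline. All names below (mm_model, notify_resume,
run_scan_work, tick_before) are hypothetical; the wrap-safe comparison
is written in the style of the kernel's time_before().

```c
#include <assert.h>
#include <stdbool.h>

/* Wrap-safe "a is before b" on a free-running tick counter. The signed
 * cast relies on two's-complement conversion, as kernel code does. */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Hypothetical per-address-space state for this model. */
struct mm_model {
	unsigned long next_scan;   /* tick at which the next scan is due */
	bool work_pending;         /* a scan is queued but has not run yet */
	unsigned int scans_queued; /* how many scans were queued in total */
};

static bool needs_scan(const struct mm_model *mm, unsigned long now)
{
	return !tick_before(now, mm->next_scan);
}

/* Modelled return-to-user hook: cheap check, queue at most one scan.
 * Returns true if a scan was queued by this call. */
static bool notify_resume(struct mm_model *mm, unsigned long now)
{
	if (!needs_scan(mm, now) || mm->work_pending)
		return false;

	mm->work_pending = true;
	mm->scans_queued++;
	return true;
}

/* Modelled worker: runs later, outside the return-to-user path, and
 * re-arms the deadline for the next scan. */
static void run_scan_work(struct mm_model *mm, unsigned long now,
			  unsigned long period)
{
	/* The actual scan (compacting the IDs) would happen here. */
	mm->next_scan = now + period;
	mm->work_pending = false;
}
```

The point of the split is that notify_resume() stays cheap and bounded
for the task returning to user space, and the expensive part is left to
a context that can be preempted and migrated.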
Patch 3 adds a selftest to validate the functionality of the
task_mm_cid_work (i.e. to compact the mm_cids).
Changes since V11:
* Remove variable to make mm_cid_needs_scan more compact
* All patches reviewed
Changes since V10:
* Fix compilation errors with RSEQ and/or MM_CID disabled
Changes since V9:
* Simplify and move checks from task_queue_mm_cid to its call site
Changes since V8 [1]:
* Add support for prev_sum_exec_runtime to RT, deadline and sched_ext
* Avoid rseq_preempt on ticks unless executing for more than 100ms
* Queue the work on the unbound workqueue
Changes since V7:
* Schedule mm_cid compaction and update at every tick too
* mmgrab before scheduling the work
Changes since V6 [2]:
* Switch to a simple work_struct instead of a delayed work
* Schedule the work_struct in __rseq_handle_notify_resume
* Asynchronously disable the work but make sure mm is there while we
  run
* Remove first patch as merged independently
* Fix commit tag for test
Changes since V5:
* Punctuation
Changes since V4 [3]:
* Fixes on the selftest
* Polished memory allocation and cleanup
* Handle the test failure in main
Changes since V3 [4]:
* Fixes on the selftest
* Minor style issues in comments and indentation
* Use of perror where possible
* Add a barrier to align threads execution
* Improve test failure and error handling
Changes since V2 [5]:
* Change the order of the patches
* Merge patches changing the main delayed_work logic
* Improved self-test to spawn one fewer thread and use the main one
  instead
Changes since V1 [6]:
* Re-arm the delayed_work at each invocation
* Cancel the work synchronously at mmdrop
* Remove next scan fields and completely rely on the delayed_work
* Shrink mm_cid allocation with nr thread/affinity (Mathieu Desnoyers)
* Add self test
[1] - https://lore.kernel.org/lkml/20250220102639.141314-1-gmonaco@xxxxxxxxxx
[2] - https://lore.kernel.org/lkml/20250210153253.460471-1-gmonaco@xxxxxxxxxx
[3] - https://lore.kernel.org/lkml/20250113074231.61638-4-gmonaco@xxxxxxxxxx
[4] - https://lore.kernel.org/lkml/20241216130909.240042-1-gmonaco@xxxxxxxxxx
[5] - https://lore.kernel.org/lkml/20241213095407.271357-1-gmonaco@xxxxxxxxxx
[6] - https://lore.kernel.org/lkml/20241205083110.180134-2-gmonaco@xxxxxxxxxx
To: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
To: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
To: Ingo Molnar <mingo@xxxxxxxxxx>
To: Paul E. McKenney <paulmck@xxxxxxxxxx>
To: Shuah Khan <shuah@xxxxxxxxxx>
Gabriele Monaco (3):
sched: Add prev_sum_exec_runtime support for RT, DL and SCX classes
sched: Move task_mm_cid_work to mm work_struct
selftests/rseq: Add test for mm_cid compaction
include/linux/mm_types.h | 17 ++
include/linux/rseq.h | 13 ++
include/linux/sched.h | 7 +-
kernel/rseq.c | 2 +
kernel/sched/core.c | 43 ++--
kernel/sched/deadline.c | 1 +
kernel/sched/ext.c | 1 +
kernel/sched/rt.c | 1 +
kernel/sched/sched.h | 2 -
tools/testing/selftests/rseq/.gitignore | 1 +
tools/testing/selftests/rseq/Makefile | 2 +-
.../selftests/rseq/mm_cid_compaction_test.c | 200 ++++++++++++++++++
12 files changed, 258 insertions(+), 32 deletions(-)
create mode 100644 tools/testing/selftests/rseq/mm_cid_compaction_test.c
base-commit: 80e54e84911a923c40d7bee33a34c1b4be148d7a