Re: [RFC PATCH v5 20/29] sched/deadline: Allow deeper hierarchies of RT cgroups

From: Yuri Andriaccio

Date: Mon May 18 2026 - 11:31:44 EST


Hi Tejun,

Luca and I discussed the HCBS interface further, and we agree that the discussion seems to be converging on something stable. To clarify things, I'm writing a recap here, together with some visual examples, to make sure every "corner case" we might encounter can be handled.

Interface:
- File: cpu.rt.max
- Format: <runtime>|"max" <period>
- Default value:
    "max" <parent period> - if the parent schedules on the root runqueue.
    0 <parent period> - if the parent is instead using HCBS.
- Meaning (incomplete/dubious):
    The bandwidth allocated to the specific cgroup and all of its children.
    Since sum(children bw) <= own bw, a cgroup's servers will be configured
    with (own bw - sum(children bw)) bandwidth.
    A cgroup set to "max" whose ancestors are all set to "max" (root cgroup excluded)
    will run its tasks in the root runqueue.
    A cgroup set to "max" whose parent has a non-zero reservation will
    inherit the parent's configuration.
    The root cgroup's cpu.rt.max file reserves the maximum HCBS bandwidth for
    the whole hierarchy. Setting the root to "max" disables HCBS (as if set with a zero runtime).
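
To make the file format concrete, here is a small user-space sketch in Python of how a cpu.rt.max value could be parsed. It only illustrates the format listed above and is not the kernel implementation: the function name parse_rt_max and the choice to represent "max" as None are assumptions made for this example, and no upper bound on runtime is enforced because none has been agreed in this thread.

```python
from typing import Optional, Tuple


def parse_rt_max(value: str) -> Tuple[Optional[int], int]:
    """Parse '<runtime>|"max" <period>' into (runtime, period).

    Hypothetical helper: runtime is returned as None when the first
    field is the literal string "max".
    """
    fields = value.split()
    if len(fields) != 2:
        raise ValueError('expected two fields: <runtime>|"max" <period>')

    runtime_field, period_field = fields
    period = int(period_field)
    if period < 0:
        raise ValueError("period must not be negative")

    if runtime_field == "max":
        return None, period

    runtime = int(runtime_field)
    if runtime < 0:
        raise ValueError("runtime must not be negative")
    return runtime, period


print(parse_rt_max("max 100"))  # (None, 100)
print(parse_rt_max("50 100"))   # (50, 100)
```

Returning None for "max" keeps it distinct from a zero runtime, which matters because the two defaults above ("max" versus 0) mean different things.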


Corner Cases with Examples:
                Root (70 100)
                      |
                  G2(50 100)
                /     |      \
   G3("max" 100)  G4(20 100)  ---G6---
                      |
                   ---G5---

In this example, only the root cgroup may run tasks in the root runqueue; groups G2, G3 and G4 use HCBS instead. G5 and G6 are freshly allocated.
- Is this a valid configuration?
- What bandwidth does G3 take?
    - Do tasks in G3 share the same runqueue as G2?
- Are the defaults mentioned above suitable?
  Otherwise, what default settings do G5 and G6 take?

          Root (70 100)
         /             \
    G0 ("max" 0)    G2(50 100)
       /           /          \
G1 ("max" 0)  G3(10 100)   G4(20 100)

Groups Root, G0 and G1 may run tasks in the root runqueue; groups G2, G3 and G4 use HCBS instead.
- Is this a valid configuration?
- Is it possible to set G0 to (20 100)?
- Is it possible to set G2 to ("max" 0)?
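
For reference, the arithmetic of this second example can be checked with a short Python sketch of the rule sum(children bw) <= own bw. It is only a model of the numbers in the diagram, not of the scheduler: the Group class and the helper names are hypothetical, a group set to "max" is counted as holding no reservation of its own (an assumption), and the root is treated like any other group purely for the arithmetic.

```python
from dataclasses import dataclass, field
from fractions import Fraction
from typing import List, Optional


@dataclass
class Group:
    """Hypothetical model of one cgroup's cpu.rt.max setting."""
    name: str
    runtime: Optional[int]  # None stands for "max"
    period: int
    children: List["Group"] = field(default_factory=list)


def bw(g: Group) -> Fraction:
    # Assumption: a "max" group holds no reservation of its own.
    if g.runtime is None or g.period == 0:
        return Fraction(0)
    return Fraction(g.runtime, g.period)


def children_bw(g: Group) -> Fraction:
    return sum((bw(c) for c in g.children), Fraction(0))


def admissible(g: Group) -> bool:
    # The rule from the recap: sum(children bw) <= own bw.
    return children_bw(g) <= bw(g)


def server_bw(g: Group) -> Fraction:
    # Bandwidth left for the group's own servers: own bw - sum(children bw).
    return bw(g) - children_bw(g)


# The second example, exactly as drawn.
g1 = Group("G1", None, 0)
g0 = Group("G0", None, 0, [g1])
g3 = Group("G3", 10, 100)
g4 = Group("G4", 20, 100)
g2 = Group("G2", 50, 100, [g3, g4])
root = Group("Root", 70, 100, [g0, g2])

print(admissible(root), admissible(g2))  # True True
print(server_bw(g2))    # 1/5
print(server_bw(root))  # 1/5
```

With these numbers, G2's servers would be configured with 20/100, and 20/100 of the root's 70/100 is not handed to any child.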


           Root (70 100)
                |
           G0 ("max" 0)
         /             \
    G1 ("max" 0)    G2(50 100)
                   /          \
              G3(10 100)   G4(20 100)

Groups Root, G0 and G1 may run tasks in the root runqueue; groups G2, G3 and G4 use HCBS instead.
- Is this a valid configuration?
- Is it possible to set G1 to (20 100)?
- What happens when we set G2 to "max"?
    - Shall this be disallowed, as G2's children have a reservation?
    - Are they silently reverted to "max"?
    - Or are G3 and G4 kept with their reservations?
- What happens when we set G0 to (70 100)?
    - If allowed, what if we set G0 to (40 100)?


I think this covers most of the corner cases we are still unsure about for the final implementation.

Thanks in advance,
Yuri