Re: [PATCH v8 08/25] x86/resctrl: Introduce interface to display number of monitoring counters

From: Tony Luck
Date: Fri Oct 11 2024 - 17:37:03 EST


On Fri, Oct 11, 2024 at 03:49:48PM -0500, Moger, Babu wrote:
> > Is there some hardware limitation that would prevent
> > re-using domain 1 counters 2 & 3 for some other group (RMID)?
> >
> > Or is this just a s/w implementation detail because
> > you have a system wide allocator for counters?
> >
>
> There is no hardware limitation. It is how resctrl is designed.
> In case of Intel(with two sockets, 16 CLOSIDs), You can only create 16
> groups. Each group will have two domains(domain 0 for socket 0 and domain 1
> for socket 1).
>
> # cat schemata
> MB:0=100;1=100
> L3:0=ffff;1=ffff;
>
>
> We may have to think of addressing this sometime in the future.

In this example, the hardware would support using the instances
of counters 2 & 3 on socket 1 for a different group (RMID). But
your code doesn't allow it because the instances of counters
2 & 3 are active on socket 0.

If you had a separate counter allocation pool for each domain
you would not have this limitation. When counters 2 & 3 are
freed on domain 1, they could be allocated to the domain 1
element of some other group.
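
To make the difference concrete, here is a rough user-space model
of the two allocation schemes. It is only a sketch, not kernel code:
the class names, the four-counter limit and the demo() scenario are
all made up for illustration and do not mirror anything in the
patch series.

```python
# Toy model only: GlobalPool and PerDomainPool are hypothetical names.

class GlobalPool:
    """One system-wide pool: a counter ID is reserved in all domains at once."""

    def __init__(self, num_counters):
        self.free = set(range(num_counters))
        self.active = {}  # counter ID -> set of domains still using it

    def alloc(self, domains):
        """Reserve one counter ID for use in the given domains."""
        if not self.free:
            return None
        cntr = min(self.free)
        self.free.remove(cntr)
        self.active[cntr] = set(domains)
        return cntr

    def release(self, cntr, domain):
        """Stop using cntr in one domain. The ID only returns to the
        pool once no domain is using it any more."""
        self.active[cntr].discard(domain)
        if not self.active[cntr]:
            del self.active[cntr]
            self.free.add(cntr)

    def available(self, domain):
        # The pool is shared, so the answer is the same for every domain.
        return len(self.free)


class PerDomainPool:
    """A separate free list per domain."""

    def __init__(self, num_counters, domains):
        self.free = {d: set(range(num_counters)) for d in domains}

    def alloc(self, domain):
        if not self.free[domain]:
            return None
        cntr = min(self.free[domain])
        self.free[domain].remove(cntr)
        return cntr

    def release(self, cntr, domain):
        self.free[domain].add(cntr)

    def available(self, domain):
        return len(self.free[domain])


def demo():
    """Use all 4 counters on both domains, then free 2 & 3 on domain 1.
    Returns how many counters domain 1 can hand out under each scheme."""
    g = GlobalPool(4)
    for _ in range(4):
        g.alloc({0, 1})
    g.release(2, 1)
    g.release(3, 1)

    p = PerDomainPool(4, [0, 1])
    for dom in (0, 1):
        for _ in range(4):
            p.alloc(dom)
    p.release(2, 1)
    p.release(3, 1)

    return g.available(1), p.available(1)
```

In this model the global pool still reports nothing available on
domain 1 after the release, because counters 2 & 3 remain active on
domain 0, while the per-domain pool reports two free counters there.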

Maybe that isn't an interesting use case, so not worth doing?

But if a single global pool is the intended design, then there is
no benefit in having
/sys/fs/resctrl/info/L3_MON/mbm_assign_control allow different
domains to choose different counter allocation policies.

E.g. in this example from Documentation:

/child_default_mon_grp/0=tl;1=l;

This group allocated two counters (because domain 0 is counting
both total and local). Domain 1 is only counting local, but
that means a counter on domain 1 is sitting idle. It can't
be used because the matching counter is active on domain 0.

I.e. the user who chose this simply gave up being able to
read total bandwidth on domain 1, but didn't get an extra
counter in exchange for this sacrifice. That doesn't seem
like a good deal.
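
The accounting behind that claim can be sketched as below. This is
an illustration under my own assumptions, not the parser from the
patch: idle_counters() is a hypothetical helper, it only looks at
the per-domain part of the line (after the group path), and it only
understands the "t" and "l" flags used in the example above.

```python
def parse_domains(spec):
    """Parse a string like '0=tl;1=l;' into {0: {'t', 'l'}, 1: {'l'}}."""
    result = {}
    for item in spec.strip().split(";"):
        if not item:
            continue  # skip the empty piece after the trailing ';'
        dom, flags = item.split("=")
        # Only the total ('t') and local ('l') flags are modelled here.
        result[int(dom)] = set(flags) & {"t", "l"}
    return result


def idle_counters(spec):
    """Return {domain: number of reserved-but-unused counters}."""
    domains = parse_domains(spec)
    # With a global allocator, the group holds one counter ID for each
    # event that is enabled in at least one domain.
    reserved = set().union(*domains.values())
    # A reserved ID whose event is not enabled in a domain sits idle there.
    return {dom: len(reserved - flags) for dom, flags in domains.items()}
```

For the Documentation example, idle_counters("0=tl;1=l;") gives
{0: 0, 1: 1}: one counter is held on domain 1 without counting
anything.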

I see two options for improvement:

1) Implement per-domain allocation of counters. Then a counter
freed in a domain becomes available for use in that domain
for other groups.

2) Go all-in on the global counter model and simplify the
syntax of mbm_assign_control to allocate the same counters
in all domains. That would simplify the parsing code.

-Tony