Re: [PATCH v11 22/23] x86/resctrl: Introduce interface to list assignment states of all the groups
From: Dave Martin
Date: Mon Feb 24 2025 - 12:20:23 EST

On Fri, Feb 21, 2025 at 12:10:44PM -0800, Reinette Chatre wrote:
> Hi Dave,
>
> On 2/21/25 8:00 AM, Dave Martin wrote:
> > On Thu, Feb 20, 2025 at 03:29:12PM -0600, Moger, Babu wrote:
> >> Hi Dave,
> >>
> >> On 2/20/25 09:44, Dave Martin wrote:
[...]
> >>> But mbm_assign_control data is dynamically generated and potentially
> >>> much bigger than a typical sysfs file.
> >>
> >> I have no idea how to handle this case. We may have to live with this
> >> problem. Let us know if there are any ideas.
> >
> > I think the current implication is that this will work for now provided
> > that the generated text fits in a page.
> >
> >
> > Reinette, what's your view on accepting this limitation in the interest
> > of stabilising this series, and tidying up this corner case later?
> >
> > As for writes to this file, we're unlikely to hit the limit unless
> > there are a lot of RMIDs available and many groups with excessively
> > long names.
>
> I am more concerned about reads to this file. If only 4K writes are
> supported then user space can reconfigure the system in page sized
> portions. It may not be efficient if the user wants to reconfigure the
> entire system but it will work. The problem with reads is that if
> larger than 4K reads are required but not supported then it will be
> impossible for user space to learn state of the system.
>
> We may already be at that limit. Peter described [1] how AMD systems
> already have 32 L3 monitoring domains and 256 RMIDs. With conservative
> resource group names of 10 characters I see one line per monitoring group
> that could look like below and thus easily be above 200 characters:
>
> resgroupAA/mongroupAA/0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;8=tl;9=tl;10=tl;11=tl;12=tl;13=tl;14=tl;15=tl;16=tl;17=tl;18=tl;19=tl;20=tl;21=tl;22=tl;23=tl;24=tl;25=tl;26=tl;27=tl;28=tl;29=tl;30=tl;31=tl;32=tl
>
> Multiplying that with the existing possible 256 monitor groups the limit
> is exceeded today.

That's useful to know. I guess we probably shouldn't just kick this
issue down the road, then -- at least on the read side (as you say).

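For what it's worth, here is a rough sanity check of that estimate. It
is only a sketch: the 4K page size, the 10-character names and the
"tl" flags come from the example quoted above, the helper names are
made up for illustration, and I use exactly 32 domains (numbered 0-31).

```python
# Back-of-envelope size of one full read of mbm_assign_control.
# Assumptions (from the quoted example, not from the kernel source):
# 4K pages, 10-character group names, 32 domains, 256 monitor groups.

PAGE_SIZE = 4096


def group_line(ctrl_name, mon_name, num_domains, flags="tl"):
    """Format one line like the quoted example: ctrl/mon/dom=flags;..."""
    domains = ";".join(f"{d}={flags}" for d in range(num_domains))
    return f"{ctrl_name}/{mon_name}/{domains}\n"


def total_size(num_groups, num_domains, flags="tl", name_len=10):
    """Bytes needed if every group prints a line of the same shape."""
    line = group_line("r" * name_len, "m" * name_len, num_domains, flags)
    return num_groups * len(line)


if __name__ == "__main__":
    line_len = len(group_line("resgroupAA", "mongroupAA", 32))
    print("bytes per line:      ", line_len)
    print("groups per 4K page:  ", PAGE_SIZE // line_len)
    print("256 groups, 'tl':    ", total_size(256, 32))
    print("256 groups, '_':     ", total_size(256, 32, flags="_"))
```

With those assumptions a single line comes to 204 bytes, so only about
20 such groups fit in one page, and 256 groups overshoot it by more
than a factor of ten, even with "_" in place of "tl".
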
> I understand that all domains having "tl" flags are not possible today, but
> even if that is changed to "_" the resulting display still looks to
> easily exceed a page when many RMIDs are in use.
>
> >
> > This looks perfectly fixable, but it might be better to settle the
> > design of this series first before we worry too much about it.
>
> I think it fair to delay support of writing more than a page of
> data but it looks to me like we need a solution to support displaying
> more than a page of data to user space.
>
> Reinette
>
> [1] https://lore.kernel.org/lkml/20241106154306.2721688-2-peternewman@xxxxxxxxxx/

Ack; if I can't find an off-the-shelf solution for this, I'll try to
hack something as minimal as possible to provide the required
behaviour, but I won't try to make it optimal or pretty for now.

It has just occurred to me that ftrace has large, multi-line text files
in sysfs, so I'll try to find out how they handle that there. Maybe
there is some infrastructure we can re-use.

Either way, hopefully that will move the discussion forward (unless
someone else comes up with a better idea first!)

Cheers
---Dave