Re: [PATCH v10 16/24] x86/resctrl: Add interface to the assign counter

From: Moger, Babu
Date: Thu Dec 19 2024 - 16:38:50 EST


Hi Reinette,

On 12/19/2024 3:12 PM, Reinette Chatre wrote:
Hi Babu,

On 12/19/24 11:45 AM, Moger, Babu wrote:
Hi Reinette,

On 12/18/2024 4:01 PM, Reinette Chatre wrote:


On 12/13/24 8:54 AM, Moger, Babu wrote:
On 12/13/2024 10:24 AM, Luck, Tony wrote:
It is the right thing to continue assignment if one of the domains is out of
counters. In that case how about we save the error (say error_domain) and
continue. And finally return success if both ret and error_domain are zero.

     return ret ? ret : error_domain;

If there are many domains, then you might have 3 succeed and 5 fail.

I think the best you can do is return success if everything succeeded
and an error if any failed.

Yes. The above check should take care of this case.


If I understand correctly "error_domain" can capture the ID of
a single failing domain. If there are multiple failing domains like
in Tony's example then "error_domain" will not be accurate and thus
can never be trusted. Instead of a single check for a failure, user
space is then forced to parse the more complex "mbm_assign_control"
file to learn what succeeded and failed.

Would it not be simpler to process sequentially and then fail on
first error encountered with a detailed error message? With that
user space can determine exactly which portion of request
succeeded and which portion failed.
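
A minimal user-space sketch of the "fail on first error" approach described above. The names (struct mon_domain, assign_counter(), assign_first_error()) and the message buffer are hypothetical stand-ins for illustration, not the resctrl implementation:

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a monitoring domain; not the resctrl struct. */
struct mon_domain {
	int id;
	int free_counters;
};

/* Hypothetical helper: take one counter in a domain, or fail with -ENOSPC. */
static int assign_counter(struct mon_domain *d)
{
	if (d->free_counters <= 0)
		return -ENOSPC;
	d->free_counters--;
	return 0;
}

/*
 * Process domains in order and stop at the first failure. On error,
 * msg names the failing domain: every domain before it was assigned,
 * and that domain and all later ones were left untouched.
 */
static int assign_first_error(struct mon_domain *doms, int n,
			      char *msg, size_t len)
{
	int i, ret;

	if (len)
		msg[0] = '\0';
	for (i = 0; i < n; i++) {
		ret = assign_counter(&doms[i]);
		if (ret) {
			snprintf(msg, len, "domain:%d Out of MBM counters",
				 doms[i].id);
			return ret;
		}
	}
	return 0;
}
```

With this shape, a single return code plus one message is enough for a tool to know where processing stopped, at the cost of leaving later domains unassigned.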

One more option is to print the error for each failure and continue, and finally return an error.

"Group mon1, domain:1 Out of MBM counters"

We have the error information as well as the convenience of assignment on domains where counters are available when the user is working with "*" (all domains).
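
A minimal user-space sketch of this "report each failure and continue" option. The struct, the function name assign_all_report_each(), and the log buffer (standing in for something like "last_cmd_status") are assumptions made for illustration only:

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a monitoring domain; not the resctrl struct. */
struct mon_domain {
	int id;
	int free_counters;
};

/*
 * Try every domain. Assign a counter where one is free, append one
 * line to log for each domain that is out of counters, and return
 * an error at the end if any domain failed.
 */
static int assign_all_report_each(const char *group,
				  struct mon_domain *doms, int n,
				  char *log, size_t len)
{
	size_t used = 0;
	int i, w, ret = 0;

	if (len)
		log[0] = '\0';
	for (i = 0; i < n; i++) {
		if (doms[i].free_counters > 0) {
			doms[i].free_counters--;
			continue;
		}
		ret = -ENOSPC;
		if (used < len) {
			w = snprintf(log + used, len - used,
				     "Group %s, domain:%d Out of MBM counters\n",
				     group, doms[i].id);
			if (w > 0)
				used += (size_t)w;
		}
	}
	return ret;
}
```

Here the return value only says that something failed; which domains failed has to be recovered from the per-domain lines, so their format would effectively become something tools depend on.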

This may be possible. Please keep in mind that any errors have to be
easily consumed in an automated way to support the user space tools
that interact with resctrl. I do not think we have thus far focused
on the "last_cmd_status" buffer as part of the user space ABI so this opens
up more considerations.

At this time the error handling of "all domains" does not seem to be
consistent and obvious to user space. From what I can tell the
implementation continues on to the next domain if one domain is out
of counters but it exits immediately if a counter cannot be configured
on a particular domain.

Yes. We can handle both errors in the same way.



Note: I will be out of office starting next week until Jan 10.

Thank you for letting me know. I am currently reviewing this series
and will post feedback by tomorrow.

Sure. Thanks. I will try to get to some of it at least. The review comments which need investigation may have to wait. Let's see.

Thanks
Babu