Re: [PATCH v6 5/8] x86/resctrl: Unwind the errors inside rdt_enable_ctx

From: Reinette Chatre
Date: Fri Aug 04 2023 - 16:42:01 EST


Hi Babu,

On 7/19/2023 4:22 PM, Babu Moger wrote:
> rdt_enable_ctx() takes care of enabling the features provided during

"rdt_enable_ctx() takes care of enabling" can just be "rdt_enable_ctx()
enables"

> resctrl mount. The error unwinding of rdt_enable_ctx is done from the
> caller rdt_get_tree. This is not ideal and can cause some error unwinding
> to be omitted.
>

Please consistently use () to indicate function names (in
changelog and subject).

"This is not ideal and can cause some error unwinding to be omitted."
is a bit vague. How about (in a new paragraph):
"Additions to rdt_enable_ctx() are required to also modify error paths
of rdt_enable_ctx() callers to ensure correct unwinding if errors
are encountered after calling rdt_enable_ctx(). This is error prone."

> Fix this by moving all the error unwinding inside rdt_enable_ctx.

"Fix" creates expectation for a "fixes" tag which is not needed here. This
refactors code to simplify future additions.

Even so, I do not think this solution addresses the stated problem (more
below).

>
> Suggested-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
> ---
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 31 +++++++++++++++++++++++--------
> 1 file changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 3010e3a1394d..9a7204f71d2d 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2381,15 +2381,31 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
> {
> int ret = 0;
>
> - if (ctx->enable_cdpl2)
> + if (ctx->enable_cdpl2) {
> ret = resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, true);
> + if (ret)
> + goto out;
> + }
>
> - if (!ret && ctx->enable_cdpl3)
> + if (ctx->enable_cdpl3) {
> ret = resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, true);
> + if (ret)
> + goto out_cdpl2;
> + }
>
> - if (!ret && ctx->enable_mba_mbps)
> + if (ctx->enable_mba_mbps) {
> ret = set_mba_sc(true);
> + if (ret)
> + goto out_cdpl3;
> + }
>
> + return 0;
> +
> +out_cdpl3:
> + resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, false);
> +out_cdpl2:
> + resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, false);

Be careful here. There is no dependency between L3 and L2 CDP ...
if L3 CDP was enabled it does not mean that L2 CDP was enabled also.
Similarly, if the software controller was enabled it does not mean that
CDP was also enabled.
Since resctrl_arch_set_cdp_enabled() does much more than just change
a flag value I think these should first check if it was enabled
before disabling the feature.

> +out:
> return ret;
> }
>
> @@ -2497,13 +2513,13 @@ static int rdt_get_tree(struct fs_context *fc)
> }
>
> ret = rdt_enable_ctx(ctx);
> - if (ret < 0)
> - goto out_cdp;
> + if (ret)
> + goto out;
>
> ret = schemata_list_create();
> if (ret) {
> schemata_list_destroy();
> - goto out_mba;
> + goto out_ctx;
> }
>
> closid_init();
> @@ -2562,10 +2578,9 @@ static int rdt_get_tree(struct fs_context *fc)
> kernfs_remove(kn_info);
> out_schemata_free:
> schemata_list_destroy();
> -out_mba:
> +out_ctx:
> if (ctx->enable_mba_mbps)
> set_mba_sc(false);
> -out_cdp:
> cdp_disable_all();
> out:
> rdt_last_cmd_clear();
>

The problem statement in the changelog was that rdt_get_tree() is
doing error unwinding of rdt_enable_ctx(). Looking at the above it
seems that the problem remains ... callers of rdt_enable_ctx()
still need to know all internals of that function to do error
unwind correctly. Could it perhaps be made simpler with a new
rdt_disable_ctx() that undoes rdt_enable_ctx()? New additions
to rdt_enable_ctx() would have more clarity where changes are
needed and callers only need to call a single rdt_disable_ctx().

Reinette