Re: [PATCH v3 3/3] x86/resctrl: Implement rename op for mon groups

From: Peter Newman
Date: Fri Mar 03 2023 - 10:10:18 EST


Hi Reinette,

On Thu, Mar 2, 2023 at 11:27 PM Reinette Chatre
<reinette.chatre@xxxxxxxxx> wrote:
> On 3/2/2023 6:26 AM, Peter Newman wrote:
> > On Sat, Feb 11, 2023 at 12:21 AM Reinette Chatre
> > <reinette.chatre@xxxxxxxxx> wrote:
> >
> >> On 1/25/2023 2:13 AM, Peter Newman wrote:
> >>> + for_each_process_thread(p, t) {
> >>> + if (is_closid_match(t, prdtgrp) && is_rmid_match(t, rdtgrp))
> >>> + rdt_move_one_task(t, new_prdtgrp->closid, t->rmid,
> >>> + cpus);
> >>> + }
> >>> + read_unlock(&tasklist_lock);
> >>
> >> Can rdt_move_group_tasks() be used here?
> >
> > As it is today, rdt_move_group_tasks() would move too many tasks.
> > mongrp_move() needs both the CLOSID and RMID to match, while
> > rdt_move_group_tasks() only needs 0-1 of the two to match.
> >
> > I tried adding more parameters to rdt_move_group_tasks() to change the
> > move condition, but I couldn't make the resulting code not look gross
> > and after factoring the tricky stuff into rdt_move_one_task(),
> > rdt_move_group_tasks() didn't look interesting enough to reuse.
>
> Could it be made readable by adding a compare function as parameter to
> rdt_move_group_tasks() that is used to determine if a task should be moved?

Yes, I think that would be ok in this case. That shouldn't have any
cost if these are all static functions.

As long as we have an rdt_move_group_tasks() function, it's a liability
to have a separate task-moving loop for someone to miss in the future.

Should I still bother with factoring out rdt_move_one_task() in the
parent patch? It was motivated by my creating a new task-moving loop
in this patch.


> >>> +
> >>> + rdtgrp = kernfs_to_rdtgroup(kn);
> >>> + new_prdtgrp = kernfs_to_rdtgroup(new_parent);
> >>> + if (!rdtgrp || !new_prdtgrp) {
> >>> + free_cpumask_var(tmpmask);
> >>> + return -EPERM;
> >>> + }
> >>> +
> >>
> >> How robust is this against user space attempting to move files?
> >
> > I'm not sure I understand the question. Can you be more specific?
>
> This commit adds support for rename to resctrl. I thus expect this
> function to be called when user space attempts to move _any_ of
> the files and/or directories within resctrl. This could be out of
> curiosity, buggy, or maliciousness. I would like to understand how
> robust this code would be against such attempts because I do not see
> much checking before the preparation to do the move is started.

Now I see, thanks.

kernfs_to_rdtgroup() will return the parent rdtgroup when
kn or new_parent is a file, which will lead to kernfs_rename() moving
a file out of a group or clobbering another file node. I'll need to
enforce that kn and new_parent are rdtgroup directories and not file
nodes.

Assuming that the paths of kn and new_parent exactly match their
rdtgroup directories, I believe the checks below are sufficient to
constrain the operation to only moving MON groups to existing
mon_groups directories.


> >> Should in-place moves be allowed?
> >
> > I don't think it's useful in the context of the intended use case.
> > Also, it looks like kernfs_rename() would fail with EEXIST if I tried.
> >
>
> From what I can tell kernfs_rename() will return EEXIST if there
> is an attempt to create file/directory with the same name at the same place.
> What I am asking about is if user space requests to change the name
> of a monitoring group without moving it to a new resource group. This seems
> to be supported by this code but if it is supported it could likely be
> done more efficiently since no tasks need to be moved because neither
> closid nor rmid needs to change.

Yes, I see now. I'll try skipping the mongrp_move() call below when
new_parent is already the parent of rdtgrp to optimize the simple
rename use case.


> >> Can tmpmask allocation/free be done in mongrp_move()?
> >
> > Yes, but it looked like most other functions in this file allocate the
> > cpumask up front before validating parameters. If you have a preference
> > for internalizing its allocation within mongrp_move(), I can try it.
>
> Could you please elaborate what the concern is? From what I can tell
> mongrp_move() is the only user of the cpumask so it is not clear to
> me why it cannot take care of the allocation and free.
>
> When referring to existing code I assume you mean rdtgroup_rmdir(). This
> is the only code I could find in this file with this pattern. I looked
> back at the history and indeed the cpumask was allocated where it was
> used but the flow was changed to the current when support for monitoring
> groups were added. See f9049547f7e7 ("x86/intel_rdt: Separate the ctrl bits from rmdir")
> I do not see a requirement for doing the allocations in that way.

I looked over this again in more detail...

I need to choose whether to call kernfs_rename() or mongrp_move() first.
If the second call fails, the first needs to be reverted. It's feasible
to ensure that a mongrp_move() call will be successful before calling
kernfs_rename(), but not the other way around.

If I allow mongrp_move() to fail, kernfs_rename() should be reversible
thanks to the prior checks validating this use case, but I would prefer
to eliminate the need for a revert on cleanup entirely.

-Peter