Re: [RFC PATCH bpf-next 0/3] bpf: freeze a task cgroup from bpf
From: Djalal Harouni
Date: Tue Apr 02 2024 - 13:41:49 EST
Hello,
On 3/30/24 00:04, Alexei Starovoitov wrote:
> On Fri, Mar 29, 2024 at 2:39 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
>>
>> Hello,
>>
>> On Fri, Mar 29, 2024 at 02:22:28PM +0100, Djalal Harouni wrote:
>>> It would be easy at least for me if I just start with cgroupv2 and
>>> ensure that it has same available filenames as if we go through kernfs.
>>> Not a root cgroup node and maybe only freeze and kill for now that are
>>> part of cgroup_base_files.
>>>
>>> So if I get it right, somewhat like what I did but we end up with:
>>>
>>> In bpf, cgroup was already acquired.
>>>
>>> bpf_cgroup_knob_write(cgroup, "freeze", buf)
>>> |_ parse params -> lock cgroup_mutex -> cgroup_freeze() -> unlock
>>>
>>>
>>> cgroup_freeze_write(struct kernfs_open_file *of, char *buf,...)
>>> |_ parse params -> cgroup_ref++ -> kernfs_active_ref-- ->
>>> -> lock cgroup_mutex -> cgroup_freeze() -> unlock + kernfs_active_ref++ ...
>>>
>>> Please let me know if I missed something.
>>
>> I've thought about it a bit and I wonder whether a better way to do this is
>> implementing this at the kernfs layer. Something like (hopefully with a
>> better name):
>>
>> s32 bpf_kernfs_knob_write(struct kernfs_node *dir, const char *knob, char *buf);
>>
>> So, about the same, but takes kernfs_node directory instead of cgroup. This
>> would make the interface useful for accessing sysfs knobs too which use
>> similar conventions. For cgroup, @dir is just cgrp->kn and for sysfs it'd be
>> kobj->sd. This way we can avoid the internal object -> path -> internal
>> object ping-ponging while keeping the interface a lot more generic. What do
>> you think?
>
> And helpers like cgroup_freeze_write() will be refactored
> to take kernfs_node directly instead of kernfs_open_file?
> Makes sense to me.
> Sounds like a minimal amount of changes and flexible enough.
Thank you Alexei and Tejun for the feedback. I will try to get back with a v2.
One thing to sort out is that kernfs_open_file->mutex nests outside of the
kernfs_node refcounting; let's see.
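
For reference, the two flows quoted above (direct call from bpf vs. write
through the kernfs file) can be modeled roughly in userspace as below. This is
a sketch only, not kernel code: every mock_* name is made up for illustration,
cgroup_mutex is reduced to a flag, and the refcount and active-reference steps
simply follow the order given in the quoted description.

```c
/*
 * Userspace model only: every mock_* name is hypothetical and none of
 * this is kernel code. It mirrors the two call flows quoted above.
 */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

struct mock_cgroup {
	int refcnt;	/* stands in for the cgroup reference count */
	bool frozen;	/* stands in for the effect of cgroup_freeze() */
};

struct mock_kernfs_node {
	int active;			/* stands in for the kernfs active reference */
	struct mock_cgroup *cgrp;	/* the object the node belongs to */
};

/* Stand-in for cgroup_mutex; a flag is enough to check lock pairing. */
static bool mock_cgroup_mutex_held;

static void mock_lock(void)
{
	assert(!mock_cgroup_mutex_held);
	mock_cgroup_mutex_held = true;
}

static void mock_unlock(void)
{
	assert(mock_cgroup_mutex_held);
	mock_cgroup_mutex_held = false;
}

/* Accept "0" or "1", optionally followed by a single newline. */
static int mock_parse_freeze(const char *buf, bool *freeze)
{
	size_t len = strlen(buf);

	if (len > 0 && buf[len - 1] == '\n')
		len--;
	if (len != 1 || (buf[0] != '0' && buf[0] != '1'))
		return -EINVAL;
	*freeze = (buf[0] == '1');
	return 0;
}

/* Direct path: the caller already holds a reference on the cgroup. */
static int mock_knob_write_direct(struct mock_cgroup *cgrp,
				  const char *knob, const char *buf)
{
	bool freeze;
	int ret;

	if (strcmp(knob, "freeze") != 0)
		return -ENOENT;	/* only "freeze" is modeled here */
	ret = mock_parse_freeze(buf, &freeze);
	if (ret)
		return ret;

	mock_lock();
	cgrp->frozen = freeze;
	mock_unlock();
	return 0;
}

/* File path: pin the cgroup and drop the active reference around the lock. */
static int mock_freeze_write_file(struct mock_kernfs_node *kn, const char *buf)
{
	struct mock_cgroup *cgrp = kn->cgrp;
	bool freeze;
	int ret;

	ret = mock_parse_freeze(buf, &freeze);
	if (ret)
		return ret;

	cgrp->refcnt++;		/* cgroup_ref++ */
	kn->active--;		/* kernfs active reference dropped */
	mock_lock();
	cgrp->frozen = freeze;
	mock_unlock();
	kn->active++;		/* kernfs active reference restored */
	cgrp->refcnt--;		/* pin released */
	return 0;
}
```

In this model both paths leave the counts unchanged and the lock released once
they return. Only the file path touches the kernfs active reference, which is
presumably the bookkeeping a kernfs_node-based helper would need to account for.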
Thanks!