Re: [RESEND PATCH v7 1/8] kernfs: Introduce interface to access global kernfs_open_file_mutex.

From: Imran Khan
Date: Wed Apr 13 2022 - 20:01:58 EST


Hello Al, Hello Tejun,

I have sent v8 of the patchset at [1]. I have incorporated your
suggestions and have also addressed the issue of not locking the correct
nodes during kernfs_walk_ns.
I have not yet made the change to have kernfs_find_ns use qstr because
this part is not clear to me. My understanding is that kernfs_find_ns
looks for a node of the given name under a parent, so we need a buffer
in kernfs_walk_ns to hold the full path and then use strsep to take each
path component and look for it under the parent (the node obtained during
the previous iteration). I am surely missing something in your
suggestion about using qstr and removing strsep, but I am not sure what.
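
To make the question concrete, here is a minimal userspace sketch of how I
currently read the suggestion: walk the path in place and pass each
component as a (pointer, length) pair, so that no scratch buffer and no
strsep are needed. Everything in it is hypothetical and only for
illustration: struct name_ref is a stand-in loosely modeled on qstr's
name/len pair, and struct node, find_child() and walk() are toy names,
not the kernfs code or API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a (pointer, length) name, loosely like qstr. */
struct name_ref {
	const char *name;
	size_t len;
};

/* Toy node: a NUL-terminated name and a NULL-terminated child array. */
struct node {
	const char *name;
	struct node **children;
};

/* Look up a child by a length-delimited name; the name need not be
 * NUL-terminated at name + len. */
static struct node *find_child(struct node *parent, struct name_ref n)
{
	struct node **c;

	if (!parent->children)
		return NULL;
	for (c = parent->children; *c; c++) {
		if (strlen((*c)->name) == n.len &&
		    memcmp((*c)->name, n.name, n.len) == 0)
			return *c;
	}
	return NULL;
}

/* Walk "a/b/c" from root without copying or modifying the path. */
static struct node *walk(struct node *root, const char *path)
{
	struct node *cur = root;
	const char *p = path;

	while (cur && *p) {
		struct name_ref n;

		while (*p == '/')	/* skip separators, so "a//b" works */
			p++;
		if (!*p)
			break;
		n.name = p;
		n.len = strcspn(p, "/");	/* component ends at '/' or NUL */
		cur = find_child(cur, n);
		p += n.len;
	}
	return cur;
}
```

In this sketch the path string is never written to, so the caller can pass
a const path directly. Whether this matches what you had in mind for
kernfs_find_ns is exactly what I am unsure about.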

Could you please have a look at the current version and let me know your
feedback?

Thanks
-- Imran

[1]
https://lore.kernel.org/lkml/20220410023719.1752460-1-imran.f.khan@xxxxxxxxxx/

On 6/4/22 2:54 pm, Imran Khan wrote:
> Hello Al,
>
> On 6/4/22 12:24 am, Al Viro wrote:
> [...]
>>
>> What for? Again, have kernfs_drain_open_files() do this:
>> {
>> 	struct kernfs_open_node *on;
>> 	struct kernfs_open_file *of;
>>
>> 	if (!(kn->flags & (KERNFS_HAS_MMAP | KERNFS_HAS_RELEASE)))
>> 		return;
>> 	if (rcu_dereference(kn->attr.open) == NULL)
>> 		return;
>> 	mutex_lock(&kernfs_open_file_mutex);
>> 	// now ->attr.open is stable (all stores are under kernfs_open_file_mutex)
>> 	on = rcu_dereference(kn->attr.open);
>> 	if (!on) {
>> 		mutex_unlock(&kernfs_open_file_mutex);
>> 		return;
>> 	}
>> 	// on->files contents is stable
>> 	list_for_each_entry(of, &on->files, list) {
>> 		struct inode *inode = file_inode(of->file);
>>
>> 		if (kn->flags & KERNFS_HAS_MMAP)
>> 			unmap_mapping_range(inode->i_mapping, 0, 0, 1);
>>
>> 		if (kn->flags & KERNFS_HAS_RELEASE)
>> 			kernfs_release_file(kn, of);
>> 	}
>> 	mutex_unlock(&kernfs_open_file_mutex);
>> }
>>
>
> I did something similar in [1], except that I was traversing
> on->files under rcu_read_lock and this was a source of confusion.
>
>> What's the problem? The caller has already guaranteed that no additions will
>> happen. Once we'd grabbed kernfs_open_file_mutex, we know that
>> * kn->attr.open value won't change until we drop the mutex
>> * nothing gets removed from kn->attr.open->files until we drop the mutex
>> so we can bloody well walk that list, blocking as much as we want.
>>
>> We don't need rcu_read_lock() there - we are already holding the mutex used
>> by writers for exclusion among themselves. RCU *allows* lockless readers,
>> it doesn't require all readers to be such. kernfs_notify() can be made
>> lockless, this one can't and that's fine.
>>
>
> Thanks for explaining this. I missed the exclusiveness being provided by
> kernfs_open_file_mutex in this case.
>
>> BTW, speaking of kernfs_notify() - can calls of that come from NMI handlers?
>> If not, I'd consider using llist for kernfs_notify_list...
>
> I see it gets invoked from 3 places only: cgroup_file_notify,
> sysfs_notify and sysfs_notify_dirent. So kernfs_notify should not be
> getting invoked in NMI context. I will make the llist transition in the
> next version.
>
> Thanks,
> -- Imran
>
> [1]
> https://lore.kernel.org/lkml/20220324103040.584491-3-imran.f.khan@xxxxxxxxxx/