Re: [PATCH] kernfs: fix the race in the creation of negative dentry

From: Hou Tao
Date: Thu Sep 23 2021 - 00:34:59 EST


Hi,

On 9/23/2021 10:50 AM, Ian Kent wrote:
> On Thu, 2021-09-23 at 09:52 +0800, Hou Tao wrote:
>> Hi,
>>
>> On 9/15/2021 10:09 AM, Ian Kent wrote:
>>> On Wed, 2021-09-15 at 09:35 +0800, Ian Kent wrote:
>>>
>> Sorry for the late reply.
>>> I think something like this is needed (not even compile tested):
>>>
>>> kernfs: dont create a negative dentry if node exists
>>>
>>> From: Ian Kent <raven@xxxxxxxxxx>
>>>
>>> In kernfs_iop_lookup() a negative dentry is created if associated
>>> kernfs
>>> node is incative which makes it visible to lookups in the VFS path
>>> walk.
>>>
>>> But inactive kernfs nodes are meant to be invisible to the VFS and
>>> creating a negative for these can have unexpetced side effects.
>>>
>>> Signed-off-by: Ian Kent <raven@xxxxxxxxxx>
>>> ---
>>>  fs/kernfs/dir.c |    9 ++++++++-
>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
>>> index ba581429bf7b..a957c944cf3a 100644
>>> --- a/fs/kernfs/dir.c
>>> +++ b/fs/kernfs/dir.c
>>> @@ -1111,7 +1111,14 @@ static struct dentry
>>> *kernfs_iop_lookup(struct inode *dir,
>>>  
>>>         kn = kernfs_find_ns(parent, dentry->d_name.name, ns);
>>>         /* attach dentry and inode */
>>> -       if (kn && kernfs_active(kn)) {
>>> +       if (kn) {
>>> +               /* Inactive nodes are invisible to the VFS so don't
>>> +                * create a negative.
>>> +                */
>>> +               if (!kernfs_active(kn)) {
>>> +                       up_read(&kernfs_rwsem);
>>> +                       return NULL;
>>> +               }
>>>                 inode = kernfs_get_inode(dir->i_sb, kn);
>>>                 if (!inode)
>>>                         inode = ERR_PTR(-ENOMEM);
>>>
>>>
>>> Essentially, the definition a kernfs negative dentry, for the
>>> cases it is meant to cover, is one that has no kernfs node, so
>>> one that does have a node should not be created as a negative.
>>>
>>> Once activated a subsequent ->lookup() will then create a
>>> positive dentry for the node so that no invalidation is
>>> necessary.
>> I'm fine with the fix which is much simpler.
> Great, although I was hoping you would check it worked as expected.
> Did you check?
> If not could you please do that check?
Yes, I will test whether or not it fixes the race.
>>> This distinction is important because we absolutely do not want
>>> negative dentries created that aren't necessary. We don't want to
>>> leave any opportunities for negative dentries to accumulate if
>>> we don't have to.
>>>    
>>> I am still thinking about the race you have described.
>>>
>>> Given my above comments that race might have (maybe probably)
>>> been present in the original code before the rwsem change but
>>> didn't trigger because of the serial nature of the mutex.
>> I don't think there is such race before the enabling of negative
>> dentry,
>> but maybe I misunderstanding something.
> No, I think you're probably right, it's the introduction of using
> negative dentries to prevent the expensive dentry alloc/free cycle
> of frequent lookups of non-existent paths that's exposed the race.
>
>>> So it may be wise (perhaps necessary) to at least move the
>>> activation under the rwsem (as you have done) which covers most
>>> of the change your proposing and the remaining hunk shouldn't
>>> do any harm I think but again I need a little more time on that.
>> After above fix, doing sibling tree operation and activation
>> atomically
>> will reduce the unnecessary lookup, but I don't think it is necessary
>> for the fix of race.
> Sorry, I don't understand what your saying.
>
> Are you saying you did check my suggested patch alone and it
> resolved the problem. And that you also think the small additional
> dentry churn is ok too.
I haven't tested it, but I think it is OK. And moving the activation under
the rwsem is not necessary for the problem.

Regards,
Tao
>
> If so I agree, and I'll forward the patch to Greg, ;)
>
> Ian
>> Regards,
>> Tao
>>> I'm now a little concerned about the invalidation that should
>>> occur on deactivation so I want to have a look at that too but
>>> it's separate to this proposal.
>>> Greg, Tejun, Hou, any further thoughts on this would be most
>>> welcome.
>>>
>>> Ian
>>> .
>
> .