Re: [PATCH] fuse: fix inode initialization race

From: Bernd Schubert

Date: Thu Mar 26 2026 - 15:17:59 EST




On 3/26/26 20:00, Joanne Koong wrote:
> On Thu, Mar 26, 2026 at 11:16 AM Bernd Schubert <bernd@xxxxxxxxxxx> wrote:
>>
>>
>> On 3/26/26 19:00, Joanne Koong wrote:
>>> On Thu, Mar 26, 2026 at 10:54 AM Horst Birthelmer <horst@xxxxxxxxxxxxx> wrote:
>>>>
>>>> On Thu, Mar 26, 2026 at 09:43:00AM -0700, Joanne Koong wrote:
>>>>> On Thu, Mar 26, 2026 at 8:48 AM Horst Birthelmer <horst@xxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> On Thu, Mar 26, 2026 at 04:19:24PM +0100, Miklos Szeredi wrote:
>>>>>>> On Thu, 26 Mar 2026 at 16:13, Bernd Schubert <bernd@xxxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/26/26 15:26, Christian Brauner wrote:
>>>>>>>>> On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/18/26 14:43, Horst Birthelmer wrote:
>>>>>>>>>>> From: Horst Birthelmer <hbirthelmer@xxxxxxx>
>>>>>>>
>>>>>>>>>>> fi->attr_version = atomic64_inc_return(&fc->attr_version);
>>>>>>>>>>> + wake_up_all(&fc->attr_version_waitq);
>>>>>>>>>>> fi->i_time = attr_valid;
>>>>>>>>
>>>>>>>>
>>>>>>>> While I'm looking at this again, wouldn't it make sense to make this
>>>>>>>> conditional? Because we wake this queue on every attr change for every
>>>>>>>> inode. And the conditional in fuse_iget() based on I_NEW?
>>>>>>>
>>>>>>> Right, should only wake if fi->attr_version old value was zero.
>>>>>>>
>>>>>>> BTW I have a hunch that there are better solutions, but it's simple
>>>>>>> enough as a stopgap measure.
>>>>>>
>>>>>> OK, I'll send a new version.
>>>>>>
>>>>>> Just out of curiosity, what would be a better solution?
>>>>>
>>>>> I'm probably missing something here but why can't we just call the
>>>>>
>>>>> fi = get_fuse_inode(inode);
>>>>> spin_lock(&fi->lock);
>>>>> fi->nlookup++;
>>>>> spin_unlock(&fi->lock);
>>>>> fuse_change_attributes_i(inode, attr, NULL, attr_valid, attr_version,
>>>>> evict_ctr);
>>>>>
>>>>> logic before releasing the inode lock (the unlock_new_inode() call) in
>>>>> fuse_iget() to address the race? unlock_new_inode() clears I_NEW so
>>>>> fuse_reverse_inval_inode()'s fuse_ilookup() would only get the inode
>>>>> after the attributes initialization has finished.
>>>>>
>>>>> As I understand it, fuse_change_attributes_i() would be pretty
>>>>> straightforward / fast for I_NEW inodes, as it doesn't send any
>>>>> synchronous requests and for the I_NEW case the
>>>>> invalidate_inode_pages2() and truncate_pagecache() calls would get
>>>>> skipped. (truncate_pagecache() getting skipped because inode->i_size
>>>>> is already attr->size from fuse_init_inode(), so "oldsize !=
>>>>> attr->size" is never true; and invalidate_inode_pages2() getting
>>>>> skipped because "oldsize != attr->size" is never true and "if
>>>>> (!timespec64_equal(&old_mtime, &new_mtime))" is never true because
>>>>> fuse_init_inode() initialized the inode's mtime to attr->mtime).
>>>>
>>>> You understand the pretty well, I think.
>>>> The problem I have there is that fuse_change_attributes_i() takes
>>>> its own lock.
>>>> That would be a pretty big operation to split that function.
>>>
>>> I believe fuse_change_attribtues_i() takes the fi lock, not the inode
>>> lock, so this should be fine.
>>>
>>
>>
>> Ah, you want to update the attributes before unlock_new_inode()? The
>> risk I see is that I don't imediately see all code paths of
>> truncate_pagecache and invalidate_inode_pages2. Even if there is no
>
> We never call into truncate_pagecache() or invalidate_inode_pages2()
> from fuse_change_attributes_i():
> truncate_pagecache() is gated by oldsize != attr->size:
> for I_NEW inodes, fuse_init_inode() sets inode->i_size to
> attr->size, which means "oldsize != attr->size" is alwyas going to be
> false
>
> invalidate_inode_pages2() is gated by oldsize != attr->size and
> "!timespec64_equal(&old_mtime, &new_mtime)" (where new_mtime ses the
> mtime values from attr):
> for I_NEW inodes, fuse_init_inode() calls inode_set_mtime(inode,
> attr->mtime, attr->mtimensec);, which means
> "!timespec64_equal(&old_mtime, &new_mtime)" is always going to be
> false
>
>> issue right, who would easily notice the fuse behavior in the future.
>> I kind of agree with that method if the condition would be
>>
>> if (oldsize > attr->size) {
>> truncate_pagecache(inode, attr->size);
>>
>> and I don't understand why it is '!='. I.e. for a new inode oldsize
>> would be 0 - the condition would never trigger and my concern would
>
> For a new inode, oldsize is attr->size, not 0. The condition never triggers.
>
> fuse_iget() calls fuse_init_inode() (which sets inode->i_size to
> attr->size) before it calls fuse_change_attributes_i().


Yeah, I just see it, had missed that before. Your patch version sounds
good to me then.


Thanks,
Bernd