Re: [PATCH v5 2/2] fuse: add new function to invalidate cache for all inodes

From: Bernd Schubert
Date: Mon Feb 17 2025 - 05:32:20 EST


On 2/17/25 11:07, Luis Henriques wrote:
> On Mon, Feb 17 2025, Bernd Schubert wrote:
>
>> On 2/16/25 17:50, Luis Henriques wrote:
>>> Currently userspace is able to notify the kernel to invalidate the cache
>>> for an inode. This means that, if all the inodes in a filesystem need to
>>> be invalidated, then userspace needs to iterate through all of them and do
>>> this kernel notification separately.
>>>
>>> This patch adds a new option that allows userspace to invalidate all the
>>> inodes with a single notification operation. In addition to invalidate
>>> all the inodes, it also shrinks the sb dcache.
>>>
>>> Signed-off-by: Luis Henriques <luis@xxxxxxxxxx>
>>> ---
>>> fs/fuse/inode.c | 33 +++++++++++++++++++++++++++++++++
>>> include/uapi/linux/fuse.h | 3 +++
>>> 2 files changed, 36 insertions(+)
>>>
>>> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
>>> index e9db2cb8c150..01a4dc5677ae 100644
>>> --- a/fs/fuse/inode.c
>>> +++ b/fs/fuse/inode.c
>>> @@ -547,6 +547,36 @@ struct inode *fuse_ilookup(struct fuse_conn *fc, u64 nodeid,
>>> return NULL;
>>> }
>>>
>>> +static int fuse_reverse_inval_all(struct fuse_conn *fc)
>>> +{
>>> + struct fuse_mount *fm;
>>> + struct inode *inode;
>>> +
>>> + inode = fuse_ilookup(fc, FUSE_ROOT_ID, &fm);
>>> + if (!inode || !fm)
>>> + return -ENOENT;
>>> +
>>> + /* Remove all possible active references to cached inodes */
>>> + shrink_dcache_sb(fm->sb);
>>> +
>>> + /* Remove all unreferenced inodes from cache */
>>> + invalidate_inodes(fm->sb);
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +/*
>>> + * Notify to invalidate inodes cache. It can be called with @nodeid set to
>>> + * either:
>>> + *
>>> + * - An inode number - Any pending writebacks within the rage [@offset @len]
>>> + * will be triggered and the inode will be validated. To invalidate the whole
>>> + * cache @offset has to be set to '0' and @len needs to be <= '0'; if @offset
>>> + * is negative, only the inode attributes are invalidated.
>>> + *
>>> + * - FUSE_INVAL_ALL_INODES - All the inodes in the superblock are invalidated
>>> + * and the whole dcache is shrinked.
>>> + */
>>> int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid,
>>> loff_t offset, loff_t len)
>>> {
>>> @@ -555,6 +585,9 @@ int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid,
>>> pgoff_t pg_start;
>>> pgoff_t pg_end;
>>>
>>> + if (nodeid == FUSE_INVAL_ALL_INODES)
>>> + return fuse_reverse_inval_all(fc);
>>> +
>>> inode = fuse_ilookup(fc, nodeid, NULL);
>>> if (!inode)
>>> return -ENOENT;
>>> diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
>>> index 5e0eb41d967e..e5852b63f99f 100644
>>> --- a/include/uapi/linux/fuse.h
>>> +++ b/include/uapi/linux/fuse.h
>>> @@ -669,6 +669,9 @@ enum fuse_notify_code {
>>> FUSE_NOTIFY_CODE_MAX,
>>> };
>>>
>>> +/* The nodeid to request to invalidate all inodes */
>>> +#define FUSE_INVAL_ALL_INODES 0
>>> +
>>> /* The read buffer is required to be at least 8k, but may be much larger */
>>> #define FUSE_MIN_READ_BUFFER 8192
>>>
>>
>>
>> I think this version might end up in
>>
>> static void fuse_evict_inode(struct inode *inode)
>> {
>> struct fuse_inode *fi = get_fuse_inode(inode);
>>
>> /* Will write inode on close/munmap and in all other dirtiers */
>> WARN_ON(inode->i_state & I_DIRTY_INODE);
>>
>>
>> if the fuse connection has writeback cache enabled.
>>
>>
>> Without having it tested, reproducer would probably be to run
>> something like passthrough_hp (without --direct-io), opening
>> and writing to a file and then sending FUSE_INVAL_ALL_INODES.
>
> Thanks, Bernd. So far I couldn't trigger this warning. But I just found
> that there's a stupid bug in the code: a missing iput() after doing the
> fuse_ilookup().
>
> I'll spend some more time trying to understand how (or if) the warning you
> mentioned can triggered before sending a new revision.
>

Maybe I'm wrong, but it calls

invalidate_inodes()
dispose_list()
evict(inode)
fuse_evict_inode()

and if at the same time something writes to inode page cache, the
warning would be triggered?
There are some conditions in evict, like inode_wait_for_writeback()
that might protect us, but what is if it waited and then just
in the right time the another write comes and dirties the inode
again?


Thanks,
Bernd