Re: [PATCH] fs/namespace: notify pollers of legacy propagation changes
From: Guopeng Zhang
Date: Sun May 31 2026 - 22:22:26 EST
On 2026/5/29 18:23, Christian Brauner wrote:
> On Fri, May 29, 2026 at 05:54:41PM +0800, Guopeng Zhang wrote:
>> From: Guopeng Zhang <zhangguopeng@xxxxxxxxxx>
>>
>> Changing mount propagation through the legacy mount API changes
>> user-visible mountinfo contents, including the shared: and master:
>> optional fields.
>>
>> The mount_setattr() path already touches the mount namespace after
>> change_mnt_propagation(), so pollers of /proc/<pid>/mountinfo are woken
>> when the namespace event changes.
>>
>> The legacy mount --make-* path also changes propagation through
>> change_mnt_propagation(), and MOVE_MOUNT_SET_GROUP updates the
>> propagation relationship of the target mount. Both paths currently
>> return without touching the affected mount namespace.
>>
>> As a result, userspace polling /proc/<pid>/mountinfo can miss these
>> propagation-only changes even though mountinfo has changed.
>>
>> A simple reproducer that polls /proc/self/mountinfo while changing
>> propagation shows the inconsistency.
>>
>> Before this change:
>>
>> legacy MS_SHARED: poll ret=0 revents=0x0
>> mount_setattr MS_SHARED: poll ret=1 revents=0xa
>>
>> After this change:
>>
>> legacy MS_SHARED: poll ret=1 revents=0xa
>> mount_setattr MS_SHARED: poll ret=1 revents=0xa
>>
>> Touch the affected mount namespace after successfully changing
>> propagation state in do_change_type() and do_set_group(). Take the
>> vfsmount lock for write around touch_mnt_namespace(), as required by
>> its locking rules.
>>
>> Signed-off-by: Guopeng Zhang <zhangguopeng@xxxxxxxxxx>
>> ---
>> fs/namespace.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/fs/namespace.c b/fs/namespace.c
>> index 9a66a806a9b8..f871c7bf3bc8 100644
>> --- a/fs/namespace.c
>> +++ b/fs/namespace.c
>> @@ -2908,6 +2908,10 @@ static int do_change_type(const struct path *path, int ms_flags)
>> 	for (m = mnt; m; m = (recurse ? next_mnt(m, mnt) : NULL))
>> 		change_mnt_propagation(m, type);
>>
>> +	lock_mount_hash();
>> +	touch_mnt_namespace(mnt->mnt_ns);
>> +	unlock_mount_hash();
>> +
>> 	return 0;
>> }
>>
>> @@ -3479,6 +3483,11 @@ static int do_set_group(const struct path *from_path, const struct path *to_path
>> 		list_add(&to->mnt_share, &from->mnt_share);
>> 		set_mnt_shared(to);
>> 	}
>> +
>> +	lock_mount_hash();
>> +	touch_mnt_namespace(to->mnt_ns);
>> +	unlock_mount_hash();
>
> Doing this would cause seqcount readers to retry on mount propagation
> changes when all of them really only care about mount topology changes.
> So this can likely use:
>
> guard(mount_locked_reader)();
> touch_mnt_namespace(mnt_ns);
>
> Even today, observing an unchanged seqcount across mnt->mnt_flags reads
> doesn't guarantee that it really wasn't changed.

Hi Christian,

Thanks for the review and explanation.
I will send a v2 with the suggested changes.

Thanks,
Guopeng
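
For readers following the thread: the reproducer mentioned in the commit
message was not posted, so the following is only a minimal sketch of what
such a test might look like. It opens /proc/self/mountinfo, changes
propagation with the legacy mount(2) flag MS_SHARED, and then polls the
descriptor for POLLPRI. The function names (format_poll_result,
poll_mountinfo, check_legacy_shared) and the one-second timeout are
assumptions made for illustration, not the author's code.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <unistd.h>

/* Format a poll() result in the style of the output quoted in the commit
 * message, e.g. "legacy MS_SHARED: poll ret=1 revents=0xa". */
int format_poll_result(char *buf, size_t len, const char *label,
		       int ret, short revents)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%s: poll ret=%d revents=0x%x",
			label, ret, (unsigned int)(unsigned short)revents);
}

/* Poll a descriptor for POLLPRI and report what poll() returned.
 * POLLERR is always reported by poll() and need not be requested. */
int poll_mountinfo(int fd, int timeout_ms, short *revents)
{
	struct pollfd pfd;
	int ret;

	memset(&pfd, 0, sizeof(pfd));
	pfd.fd = fd;
	pfd.events = POLLPRI;

	ret = poll(&pfd, 1, timeout_ms);
	*revents = pfd.revents;
	return ret;
}

/* Hypothetical driver: mark `target` shared through the legacy API and
 * check whether the mountinfo poller is woken. `target` must already be a
 * mount point and the caller needs CAP_SYS_ADMIN. Returns 0 on success. */
int check_legacy_shared(const char *target, char *out, size_t len)
{
	short revents = 0;
	int fd, ret;

	/* Open first, so the change happens after the descriptor exists. */
	fd = open("/proc/self/mountinfo", O_RDONLY);
	if (fd < 0)
		return -1;

	if (mount(NULL, target, NULL, MS_SHARED, NULL) < 0) {
		close(fd);
		return -1;
	}

	ret = poll_mountinfo(fd, 1000, &revents);
	close(fd);
	if (ret < 0)
		return -1;

	format_poll_result(out, len, "legacy MS_SHARED", ret, revents);
	return 0;
}
```

Calling check_legacy_shared() on a private test mount as root and printing
the resulting string should give a line comparable to the "legacy
MS_SHARED" lines quoted above. The 0xa in that output corresponds to
POLLPRI | POLLERR on Linux. A mount_setattr() variant would follow the
same pattern with the propagation change swapped out.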