Re: [PATCH] mm: do file ownership checks with the proper mount idmap

From: Pedro Falcato

Date: Mon Jun 29 2026 - 14:31:04 EST


On Mon, Jun 29, 2026 at 02:15:19PM +0200, Christian Brauner wrote:
> On 2026-06-26 16:19:18+02:00, Jan Kara wrote:
> > On Thu 25-06-26 16:38:53, Pedro Falcato wrote:
> >
> > > Ever since idmapped mounts were introduced, inode ownership checks
> > > (for side-channel protection) in mincore() and madvise(MADV_PAGEOUT) were
> > > done against the nop_mnt_idmap, which completely ignores the file's mount's
> > > idmap. This results in odd edgecases like:
> > >
> > > 1) mount/bind-mount with an idmap userA:userB:1
> > > 2) userB runs an owner_or_capable() check on file that is owned by userA
> > > on-disk/in-memory, but owned by userB after idmap translation
> > > 3) owner_or_capable() mysteriously fails as the correct idmap wasn't supplied
> > >
> > > In the case of mincore/madvise MADV_PAGEOUT, this is usually benign, because
> > > file_permission(file, MAY_WRITE) will probably succeed, as it uses the proper
> > > idmap internally, but it does not need to be the case on e.g a 0444 file
> > > where even the owner itself doesn't have permissions to write to it.
> > >
> > > Since this is clearly not trivial to get right, introduce a
> > > file_owner_or_capable() that can carry the correct semantics, and switch
> > > the various users in mm to it.
> > >
> > > The issue was found by manual code inspection & an off-list discussion with
> > > Jan Kara.
> > >
> > > Fixes: 9caccd41541a ("fs: introduce MOUNT_ATTR_IDMAP")
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > Signed-off-by: Pedro Falcato <pfalcato@xxxxxxx>
> >
> > This looks good to me. I'm a bit curious why Christian initially (in 2021)
> > used init_user_ns here instead of the file namespace... Anyway feel free to
> > add:
>
> Back when this was added only the do_mincore() codepath existed and that
> was intentionally left unconverted because it exposes the cache
> residency status. So it was effectively a massive side-channel.

Hmm. I'm not sure what you mean by this. Wouldn't it be more correct to respect
the mount idmap (given that a mount-ns-capable user mounted it with an idmap for
someone else, or itself) for mincore? Am I missing something? Or maybe I'm
misunderstanding that paragraph.

>
> Both fd3b1bc3c86e ("mm/madvise: fix madvise_pageout for private file mappings")
> and specifically cachestat() came way after all that.
>
> I'm otherwise fine with the change.
>
> Reviewed-by: Christian Brauner (Amutable) <brauner@xxxxxxxxxx>

Thanks!

--
Pedro