[PATCH] mm: do file ownership checks with the proper mount idmap
From: Pedro Falcato
Date: Thu Jun 25 2026 - 11:42:39 EST
Ever since idmapped mounts were introduced, inode ownership checks
(for side-channel protection) in mincore() and madvise(MADV_PAGEOUT) were
done against the nop_mnt_idmap, which completely ignores the file's mount's
idmap. This results in odd edgecases like:
1) mount/bind-mount with an idmap userA:userB:1
2) userB runs an owner_or_capable() check on file that is owned by userA
on-disk/in-memory, but owned by userB after idmap translation
3) owner_or_capable() mysteriously fails as the correct idmap wasn't supplied
In the case of mincore/madvise MADV_PAGEOUT, this is usually benign, because
file_permission(file, MAY_WRITE) will probably succeed, as it uses the proper
idmap internally, but it does not need to be the case on e.g a 0444 file
where even the owner itself doesn't have permissions to write to it.
Since this is clearly not trivial to get right, introduce a
file_owner_or_capable() that can carry the correct semantics, and switch
the various users in mm to it.
The issue was found by manual code inspection & an off-list discussion with
Jan Kara.
Fixes: 9caccd41541a ("fs: introduce MOUNT_ATTR_IDMAP")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Pedro Falcato <pfalcato@xxxxxxx>
---
I noticed there are a couple of call sites in fs/ that could perhaps be
cleaned up with the added helper, but I'm skipping that for now for brevity's
sake.
include/linux/fs.h | 5 +++++
mm/filemap.c | 2 +-
mm/madvise.c | 3 +--
mm/mincore.c | 3 +--
4 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d10897b3a1e3..50ce731a2b78 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2444,6 +2444,11 @@ static inline struct mnt_idmap *file_mnt_idmap(const struct file *file)
return mnt_idmap(file->f_path.mnt);
}
+static inline bool file_owner_or_capable(const struct file *file)
+{
+ return inode_owner_or_capable(file_mnt_idmap(file), file_inode(file));
+}
+
/**
* is_idmapped_mnt - check whether a mount is mapped
* @mnt: the mount to check
diff --git a/mm/filemap.c b/mm/filemap.c
index 5af62e6abca5..58eb9d240643 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -4704,7 +4704,7 @@ static inline bool can_do_cachestat(struct file *f)
{
if (f->f_mode & FMODE_WRITE)
return true;
- if (inode_owner_or_capable(file_mnt_idmap(f), file_inode(f)))
+ if (file_owner_or_capable(f))
return true;
return file_permission(f, MAY_WRITE) == 0;
}
diff --git a/mm/madvise.c b/mm/madvise.c
index cd9bb077072c..77552b03d318 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -336,8 +336,7 @@ static inline bool can_do_file_pageout(struct vm_area_struct *vma)
* otherwise we'd be including shared non-exclusive mappings, which
* opens a side channel.
*/
- return inode_owner_or_capable(&nop_mnt_idmap,
- file_inode(vma->vm_file)) ||
+ return file_owner_or_capable(vma->vm_file) ||
file_permission(vma->vm_file, MAY_WRITE) == 0;
}
diff --git a/mm/mincore.c b/mm/mincore.c
index 296f2e3922b5..c8757c5085bf 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -227,8 +227,7 @@ static inline bool can_do_mincore(struct vm_area_struct *vma)
* for writing; otherwise we'd be including shared non-exclusive
* mappings, which opens a side channel.
*/
- return inode_owner_or_capable(&nop_mnt_idmap,
- file_inode(vma->vm_file)) ||
+ return file_owner_or_capable(vma->vm_file) ||
file_permission(vma->vm_file, MAY_WRITE) == 0;
}
--
2.54.0