Re: [PATCH 0/4] exportfs: add flag to allow marking export operations as only supporting file handles

From: Christian Brauner
Date: Tue Dec 10 2024 - 05:13:32 EST


On Mon, Dec 09, 2024 at 12:20:10PM -0500, Chuck Lever wrote:
> On 12/9/24 12:15 PM, Jeff Layton wrote:
> > On Mon, 2024-12-09 at 11:35 -0500, Chuck Lever wrote:
> > > On 12/9/24 11:30 AM, Amir Goldstein wrote:
> > > > On Mon, Dec 9, 2024 at 2:46 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, Dec 09, 2024 at 09:58:58AM +0100, Amir Goldstein wrote:
> > > > > > To be clear, exporting pidfs or internal shmem via an anonymous fd is
> > > > > > probably not possible with existing userspace tools, but with all the new
> > > > > > mount_fd and magic link apis, I can never be sure what can be made possible
> > > > > > to achieve when the user holds an anonymous fd.
> > > > > >
> > > > > > The thinking behind adding the EXPORT_OP_LOCAL_FILE_HANDLE flag
> > > > > > was that when kernfs/cgroups was added exportfs support with commit
> > > > > > aa8188253474 ("kernfs: add exportfs operations"), there was no intention
> > > > > > to export cgroupfs over nfs, only local to uses, but that was never enforced,
> > > > > > so we thought it would be good to add this restriction and backport it to
> > > > > > stable kernels.
> > > > >
> > > > > Can you please explain what the problem with exporting these file
> > > > > systems over NFS is? Yes, it's not going to be very useful. But what
> > > > > is actually problematic about it? Any why is it not problematic with
> > > > > a userland nfs server? We really need to settle that argumet before
> > > > > deciding a flag name or polarity.
> > > > >
> > > >
> > > > I agree that it is not the end of the world and users do have to explicitly
> > > > use fsid= argument to be able to export cgroupfs via nfsd.
> > > >
> > > > The idea for this patch started from the claim that Jeff wrote that cgroups
> > > > is not allowed for nfsd export, but I couldn't find where it is not allowed.
> > > >
> >
> > I think that must have been a wrong assumption on my part. I don't see
> > anything that specifically prevents that either. If cgroupfs is mounted
> > and you tell mountd to export it, I don't see what would prevent that.
> >
> > To be clear, I don't see how you would trick bog-standard mountd into
> > exporting a filesystem that isn't mounted into its namespace, however.
> > Writing a replacement for mountd is always a possibilty.
> >
> > > > I have no issue personally with leaving cgroupfs exportable via nfsd
> > > > and changing restricting only SB_NOUSER and SB_KERNMOUNT fs.
> > > >
> > > > Jeff, Chuck, what is your opinion w.r.t exportability of cgroupfs via nfsd?
> > >
> > > We all seem to be hard-pressed to find a usage scenario where exporting
> > > pseudo-filesystems via NFS is valuable. But maybe someone has done it
> > > and has a good reason for it.
> > >
> > > The issue is whether such export should be consistently and actively
> > > prevented.
> > >
> > > I'm not aware of any specific security issues with it.
> > >
> > >
> >
> > I'm not either, but we are in new territory here. nfsd is a network
> > service, so it does present more of an attack surface vs. local access.
> >
> > In general, you do have to take active steps to export a filesystem,
> > but if someone exports / with "crossmnt", everything mounted is
> > potentially accessible. That's obviously a dumb thing to do, but people
> > make mistakes, and it's possible that doing this could be part of a
> > wider exploit.
> >
> > I tend to think it safest to make exporting via nfsd an opt-in thing on
> > a per-fs basis (along the lines of this patchset). If someone wants to
> > allow access to more "exotic" filesystems, let them argue their use-
> > case on the list first.
>
> If we were starting from scratch, 100% agree.
>
> The current situation is that these file systems appear to be exportable
> (and not only via NFS). The proposal is that this facility is to be
> taken away. This can easily turn into a behavior regression for someone
> if we're not careful.

So I'm happy to drop the exportfs preliminary we have now preventing
kernfs from being exported but then Christoph and you should figure out
what the security implications of allowing kernfs instances to be
exported areare because I'm not an NFS export expert.

Filesystems that fall under kernfs that are exportable by NFS as I
currently understand it are at least:

(1) sysfs
(2) cgroupfs

Has anyone ever actually tried to export the two and tested what
happens? Because I wouldn't be surprised if this ended in tears but
maybe I'm overly pessimistic.

Both (1) and (2) are rather special and don't have standard filesystem
semantics in a few places.

- cgroupfs isn't actually namespace aware. Whereas most filesystems like
tmpfs and ramfs that are mountable inside unprivileged containers are
multi-instance filesystems, aka allocate a new superblock per
container cgroupfs is single-instance with a nasty implementation to
virtualize the per-container view via cgroup namespaces. I wouldn't be
surprised if that ends up being problematic.

- Cgroupfs has write-time permission checks as the process that is moved
into a cgroup isn't known at open time. That has been exploitable
before this was fixed.

- Even though it's legacy cgroup has a v1 and v2 mode where v1 is even
more messed up than v2 including the release-agent logic which ends up
issuing a usermode helper to call a binary when a cgroup is released.

- sysfs potentially exposes all kinds of extremly low-level information
to a remote machine.

None of this gives me the warm and fuzzy. But that's just me.

Otherwise, I don't understand what it means that a userspace NFS server
can export kernfs instances. I don't know what that means and what the
contrast to in-kernel NFS server export is and whether that has the same
security implications. If so it's even scary that some random userspace
NFS server can just expose guts like kernfs.

But if both of you feel that this is safe to do and there aren't any
security issues lurking that have gone unnoticed simply because no one
has really ever exported sysfs or cgroupfs then by all means continue
allowing that. I'm rather skeptical.