Re: Upcoming: Notifications, FS notifications and fsinfo()
From: Miklos Szeredi
Date: Tue Mar 31 2020 - 11:10:49 EST
On Tue, Mar 31, 2020 at 2:25 PM Lennart Poettering <mzxreary@xxxxxxxxxxx> wrote:
>
> On Tue, 31.03.20 10:56, Miklos Szeredi (miklos@xxxxxxxxxx) wrote:
>
> > On Tue, Mar 31, 2020 at 10:34 AM Karel Zak <kzak@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Mar 31, 2020 at 07:11:11AM +0200, Miklos Szeredi wrote:
> > > > On Mon, Mar 30, 2020 at 11:17 PM Christian Brauner
> > > > <christian.brauner@xxxxxxxxxx> wrote:
> > > >
> > > > > Fwiw, putting down my kernel hat and speaking as someone who maintains
> > > > > two container runtimes and various other low-level bits and pieces in
> > > > > userspace who'd make heavy use of this stuff I would prefer the fd-based
> > > > > fsinfo() approach especially in the light of across namespace
> > > > > operations, querying all properties of a mount atomically all-at-once,
> > > >
> > > > fsinfo(2) doesn't meet the atomically all-at-once requirement.
> > >
> > > I guess your /proc based idea has exactly the same problem...
> >
> > Yes, that's exactly what I wanted to demonstrate: there's no
> > fundamental difference between the two APIs in this respect.
> >
> > > I see two possible ways:
> > >
> > > - after open("/mnt", O_PATH) create a copy-on-write object in the kernel
> > > to represent the mount node -- the kernel will be able to modify it, but
> > > userspace will get unchanged data from the FD until close()
> > >
> > > - improve fsinfo() to provide a set (list) of the attributes in one call
> >
> > I think we are approaching this from the wrong end. Let's just
> > ignore all of the proposed interfaces for now and only concentrate on
> > what this will be used for.
> >
> > Start with a set of use cases by all interested parties. E.g.
> >
> > - systemd wants to keep track of attached mounts in a namespace, as well
> > as new detached mounts created by fsmount()
> >
> > - systemd needs to keep information (such as parent, children, mount
> > flags, fs options, etc) up to date on any change of topology or
> > attributes.
>
> - We also have code that recursively remounts r/o or unmounts some
> directory tree (with filters),

Recursive remount-ro is clear. What is not clear is whether you need
to do this for hidden mounts (not possible from userspace without a
way to disable mount following on path lookup). Would it make sense
to add a kernel API for recursive setting of mount flags?

What exactly is this unmount with filters? Can you give examples?
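
To make the recursive case concrete, here is a minimal sketch of how userspace might work out which mounts belong to a tree from mountinfo-style text. It uses only the mount ID and parent ID columns; the function names and the sample data are made up for illustration and are not taken from systemd. It does nothing about hidden mounts: acting on a mount still needs a path that reaches it.

```python
from collections import defaultdict

# Hypothetical sample in /proc/self/mountinfo format (IDs and devices invented).
SAMPLE = """\
21 1 8:1 / / rw - ext4 /dev/sda1 rw
30 21 0:25 / /srv rw - tmpfs tmpfs rw
31 30 0:26 / /srv/a rw - tmpfs tmpfs rw
32 30 0:27 / /srv/b rw - tmpfs tmpfs rw
33 21 0:28 / /home rw - tmpfs tmpfs rw
"""


def parse_mountinfo(text):
    """Return {mount_id: (parent_id, mount_point)} from mountinfo-style text."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        mounts[int(fields[0])] = (int(fields[1]), fields[4])
    return mounts


def subtree(mounts, root_id):
    """List root_id and all of its descendants, parents before children."""
    children = defaultdict(list)
    for mount_id, (parent_id, _) in mounts.items():
        if mount_id != parent_id:  # guard against a self-parented root
            children[parent_id].append(mount_id)
    order, stack = [], [root_id]
    while stack:
        mount_id = stack.pop()
        order.append(mount_id)
        # Push in reverse so that lower IDs are visited first.
        stack.extend(sorted(children[mount_id], reverse=True))
    return order
```

Walking `subtree(parse_mountinfo(SAMPLE), 30)` front to back gives an order usable for a recursive flag change; walking it back to front puts children before parents, which is what an unmount would need.
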
> - We also have code that needs to check if /dev/ is plain tmpfs or
> devtmpfs. We cannot use statfs for that, since in both cases
> TMPFS_MAGIC is reported, hence we currently parse
> /proc/self/mountinfo for that to find the fstype string there, which
> is different for both cases.

Okay.
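
For reference, a minimal sketch of the mountinfo lookup described above: find the fstype string for a given mount point. The helper names and sample lines are invented for illustration, and the sketch assumes that when several entries share a mount point the last one listed is the one to report, which may not hold in every case.

```python
import re

# Hypothetical sample in /proc/self/mountinfo format (IDs and options invented).
SAMPLE = (
    "22 21 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,size=4096k\n"
    "23 22 0:21 / /dev/shm rw,nosuid,nodev shared:3 - tmpfs tmpfs rw\n"
)


def unescape(field):
    """Decode the three-digit octal escapes (e.g. \\040 for a space)."""
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def fstype_at(mountinfo_text, mount_point):
    """Return the fstype string of the last entry mounted at mount_point."""
    fstype = None
    for line in mountinfo_text.splitlines():
        fields = line.split(" ")
        try:
            # Six fixed fields, then optional fields, then a lone "-".
            sep = fields.index("-", 6)
        except ValueError:
            continue  # malformed line, no separator found
        if unescape(fields[4]) == mount_point and sep + 1 < len(fields):
            fstype = fields[sep + 1]
    return fstype
```

On a live system this would be called as `fstype_at(open("/proc/self/mountinfo").read(), "/dev")`, and the result compared against "devtmpfs" or "tmpfs".
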
Thanks,
Miklos