Re: [RFC][PATCH 0/5] Mount, Filesystem and Keyrings notifications

From: Ian Kent
Date: Wed Jul 25 2018 - 21:18:44 EST


On Wed, 2018-07-25 at 08:48 -0700, Casey Schaufler wrote:
> On 7/24/2018 10:39 PM, Ian Kent wrote:
> > On Tue, 2018-07-24 at 11:57 -0700, Casey Schaufler wrote:
> > > On 7/24/2018 9:00 AM, David Howells wrote:
> > > > Casey Schaufler <casey@xxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > > > (1) Mount topology and reconfiguration change events.
> > > > >
> > > > > With the possibility of unprivileged mounting you're going to have to
> > > > > address access control on events. If root in a user namespace mounts a
> > > > > filesystem you may have a case where the "real" user wouldn't want the
> > > > > listener to receive a notification.
> > > >
> > > > Can you clarify who the listener is in this case?
> > >
> > > That would be anyone with a watchpoint set.
> >
> > And that process would have had the privilege to do so ...
>
> Which isn't the point. The access control isn't on the watchpoint,
> it's on delivering the event to the watchpoint. The access control
> needs to be based on the process that created the event and the
> process receiving the event.
>
> >
> > > > Note that mount topology events don't leak outside of the mount
> > > > namespace they're generated in.
> > > >
> > > > That said, if you, a random user, put a watchpoint on "/" you can see
> > > > the mount events triggered by another random user in the same mount
> > > > namespace. I don't see a way to control this except by resorting to
> > > > the LSM since UNIX doesn't have 'notify' permission bits.
> > >
> > > I would call that a write operation from the process that triggered
> > > the watchpoint to the one watching it. Like a signal. Signals have a
> > > rudimentary DAC policy (write only to the same UID) that could be
> > > your model.
> >
> > I'm not sure signals are a good comparison.
>
> In both cases you have one process sending information
> to another. If you use killpg() instead of kill() you can
> send to a number of processes without knowing them individually.
> A process can choose what to do with (most) signals, including
> ignore them.

I'm just saying that the analogy doesn't quite hold.

In this case a notification can't simply be sent by a process;
it's entirely controlled by the VFS, so the notion of some process
doing something out-of-band doesn't apply.

>
> > They can affect a process in significant ways whereas triggering
> > a notification is less invasive so the security requirements
> > should take that into consideration.
>
> I'm looking at this from a security model viewpoint. If process
> A sends information to process B that's a write operation with
> A as the subject and B as the object.

Perhaps, but again there's no process A deciding to send something.

It's the VFS saying to itself, I see this thing has changed, someone
that had appropriate privilege asked me to tell it so ...
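
To make the model under discussion concrete, here is a minimal sketch
of a signal-style check done at delivery time rather than when the
watchpoint is set. It is Python purely for illustration and nothing in
it comes from the patch set: Cred, Event, may_deliver() and deliver()
are hypothetical names, and the "same UID or privileged" rule is a
simplification of the signal DAC policy mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cred:
    """Hypothetical stand-in for a task's credentials."""
    uid: int
    privileged: bool = False  # assumption: models a capability-style override


@dataclass(frozen=True)
class Event:
    """Hypothetical event carrying the creds of the task that caused it."""
    what: str
    source: Cred


def may_deliver(event: Event, watcher: Cred) -> bool:
    """Treat delivery as a 'write' from the triggering task to the watcher.

    Simplified signal-like rule: allow when the UIDs match or the
    triggering task is privileged. The check runs once per event.
    """
    return event.source.privileged or event.source.uid == watcher.uid


def deliver(events, watcher):
    """Return only the events the watcher is allowed to see."""
    return [e for e in events if may_deliver(e, watcher)]
```

Whether the triggering task is really the right "subject" here is
exactly the open question, since it is the VFS that generates the event.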

I may be wrong, but there isn't a similar access control mechanism
on the proc file system mount table (pseudo) files. There is only the
usual access control on opening them, and what is seen is based on
the mount name space of the opening process.

That's not quite the same as what we have here, but it is similar.
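
For comparison, the proc based approach amounts to roughly the
following sketch: re-open /proc/self/mountinfo each time and parse it.
The helper names are made up for illustration (this is not libmount
code), and only a few of the fields described in proc(5) are picked
out.

```python
def parse_mountinfo_line(line):
    """Split one mountinfo line into the handful of fields used here.

    Layout (see proc(5)): mount ID, parent ID, major:minor, root,
    mount point, mount options, optional fields, a "-" separator,
    then filesystem type, mount source and super options.
    """
    left, _, right = line.rstrip("\n").partition(" - ")
    fields = left.split(" ")
    fstype, source = right.split(" ")[:2]
    return {
        "mount_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "mount_point": fields[4],
        "fstype": fstype,
        "source": source,
    }


def read_mount_table(path="/proc/self/mountinfo"):
    """Open the file afresh on every call and parse every line.

    Because a new file handle is used each time, the result reflects
    the mount name space of the caller at the time of the call.
    """
    with open(path) as f:
        return [parse_mountinfo_line(line) for line in f]
```

The only access control involved is the open() itself; there is no
per-entry check against whoever performed the mount.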

>
> > But there is a problem here I think.
> >
> > How about the case where a user name space is created or entered
> > without a newly created mount name space and mounts and umounts
> > are done, the user name space necessarily expects the table of
> > mounts it sees to be up to date.
> >
> > But, if the methods here are used by user space, say libmount
> > was updated to use it, to gain the efficiency of not constantly
> > re-reading the proc mount table then restrictions on notifications
> > would mean the mount table seen in the user name space might not
> > be updated and would no longer be correct.
>
> Hey, I'm not the one saying that containers don't need/want kernel
> manifestation. Of course you may have troubles with access consistency
> if you go around changing the access control attributes with namespaces.
> One of the problems with signals is that you can't chmod() your process
> to allow signals from other specified users. If you don't design an
> access control scheme into the watchpoint mechanism you aren't going to
> be able to deal with situations like this.

Don't get me wrong, I'm not criticizing your suggestion that
some sort of security model is needed.

I'm saying it's not straightforward to work out what's actually
needed and, in thinking about it, I ended up describing an additional
potential problem that's not even (necessarily) security related.
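
A toy model of that problem, with hypothetical names and no relation
to how libmount is actually structured: a consumer keeps a cached
mount table and updates it only from notifications. If delivery is
restricted, the cache silently diverges from the real table.

```python
def apply_events(table, events, allowed=lambda event: True):
    """Update a cached set of mount points from (op, mount_point, uid) events.

    'allowed' models whatever delivery restriction ends up being
    applied; events it rejects never reach the consumer.
    """
    cached = set(table)
    for event in events:
        if not allowed(event):
            continue
        op, mount_point, _uid = event
        if op == "mount":
            cached.add(mount_point)
        elif op == "umount":
            cached.discard(mount_point)
    return cached


events = [
    ("mount", "/mnt/a", 1000),
    ("mount", "/mnt/b", 0),
    ("umount", "/mnt/a", 1000),
]

# Every event delivered: the cache matches the real mount table.
unfiltered = apply_events({"/"}, events)

# Only events from uid 1000 delivered: the mount of /mnt/b is never seen.
filtered = apply_events({"/"}, events, allowed=lambda e: e[2] == 1000)
```

Here unfiltered ends up containing /mnt/b and filtered does not, even
though both consumers are looking at the same mount name space.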

>
> > The converse is more interesting, where the user name space does
> > create or enter a new mount name space, then libmount would see ???,
> > probably not the updated mount information ... unless it opens a
> > new file handle to get mount update information ... a long running
> > daemon that uses libmount and dispenses or uses mount information
> > would very likely have a problem ...
> >
> > The current proc file system method of providing the mount table
> > forces a new file handle to be opened whenever getting the mount
> > table so it always sees only the current mount name space mount
> > table.
> >
> > At the very least I need to think more about this ...
> >
> > > > But for each event, I can associate an object label, derived from the
> > > > source, and use f_cred on the notification queue to provide a subject
> > > > label.
> > >
> > > ... or UID or groups.
> > >
> > > > > > (2) Superblocks EIO, ENOSPC and EDQUOT events (not complete yet).
> > > > >
> > > > > Here, too. If SELinux (for example) policy says you can't see
> > > > > anything on a filesystem you shouldn't get notifications about
> > > > > things that happen to that filesystem.
> > > >
> > > > Yep. Sounds like I need to refer that to the LSM as above.
> > > >
> > > > It's a bit easier for specifically nominated sb sources since you might
> > > > only need to do the check once at sb_notify() time. If there's a general
> > > > queue that all sbs contribute to, however, then things become more
> > > > complicated as the checks have to be done at
> > > > do-we-write-into-this-queue? time.
> > > >
> > > > > > (3) Key/keyring changes events
> > > > >
> > > > > And again, I should only get notifications about keys and
> > > > > keyrings I have access to.
> > > >
> > > > Currently, you can only watch keys that grant you View permission, which
> > > > might suffice.
> > >
> > > That seems appropriate.
> > >
> > > > > I expect that you intentionally left off
> > > > >
> > > > > (4) User injected events
> > > > >
> > > > > at this point, but it's an obvious extension. That is going
> > > > > to require access controls (remember kdbus) so I think you'd
> > > > > do well to design them in now rather than have some security
> > > > > module hack like me come along later and "fix" it.
> > > >
> > > > Yeah - the thought had occurred to me, but there needs to be some way to
> > > > define a 'source' and a way to connect them. Also, would you want a
> > > > general source that anyone can contribute through, specific sources
> > > > where you have to directly connect or namespace-restricted sources?
> > >
> > > My guess is that the consensus would be "Yes" to all the above.
> > >
> > > > David
> > > >