Re: [PATCH 2/2] net: Implement SO_PASSCGROUP to enable passing cgroup path

From: Andy Lutomirski
Date: Wed Apr 16 2014 - 10:36:04 EST


On Wed, Apr 16, 2014 at 3:17 AM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> On Tue, Apr 15, 2014 at 08:47:54PM -0700, Andy Lutomirski wrote:
>> On Apr 15, 2014 5:20 PM, "Vivek Goyal" <vgoyal@xxxxxxxxxx> wrote:
>> >
>> > On Tue, Apr 15, 2014 at 02:53:13PM -0700, Andy Lutomirski wrote:
>> > > On Tue, Apr 15, 2014 at 2:15 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
>> > > > This patch implements socket option SO_PASSCGROUP along the lines of
>> > > > SO_PASSCRED.
>> > > >
>> > > > If SO_PASSCGROUP is set, then recvmsg() will get a control message
>> > > > SCM_CGROUP which will contain the cgroup path of sender. This cgroup
>> > > > belongs to first mounted hierarchy in the system.
>> > > >
>> > > > SCM_CGROUP control message can only be received and sender can not send
>> > > > a SCM_CGROUP message. Kernel automatically generates one if receiver
>> > > > chooses to receive one.
>> > > >
>> > > > This works both for unix stream and datagram sockets.
>> > > >
>> > > > cgroup information is passed only if either the sender or receiver has
>> > > > SO_PASSCGROUP option set. This means for existing workloads they should
>> > > > not see any significant performance impact of this change.
>> > >
>> > > This is odd. Shouldn't an SCM_CGROUP cmsg be generated when the
>> > > receiver has SO_PASSCGROUP set and the sender passes SCM_CGROUP to
>> > > sendmsg?
>> >
>> > How can receiver trust the cgroup info generated by sender. It needs to
>> > be generated by kernel so that receiver can trust it.
>> >
>> > And if receiver needs to know cgroup of sender, receiver can just set
>> > SO_PASSCGROUP on socket and receiver should get one SCM_CGROUP message
>> > with each message received.
>>
>> I think the kernel should validate the data.
>>
>> Here's an attack against SO_PEERCGROUP: if you create a container with
>> a super secret name, then every time you connect to any unix socket,
>> you leak the name.
>
> One should be able to do that already today with SO_PASSCRED option and
> then map pid to cgroup. Or if one is using user namespaces then go
> through uid mappings and figure out which container sent message.

Not if you've locked down proc, perhaps by using hidepid.

>
>>
>> Here's an attack against SO_PASSCGROUP, as you implemented it: connect
>> a socket and get someone else to write(2) to it. This isn't very
>> hard. Now you've impersonated.
>
> If you can get another process to write to your socket and impersonate,
> then what will stop that process from also sending an SCM_CGROUP
> message? So I don't see how SCM_CGROUP from client will solve this problem.
>

I can easily get other processes to write to my socket. Passing that
socket as stderr to a setuid program is probably the easiest way.
Finding a service that accepts a socket using SCM_RIGHTS and writes to
it is another. It is supposed to be safe to write(2) to an untrusted
file descriptor, or, at the very least, it is supposed to be a DoS at
worst. In this case, it's also an information leak.

It's true that SO_PASSCRED has the same problem. I consider that to
be a mistake, and I suspect that there are a large number of
longstanding security problems caused by it. Regardless, we shouldn't
exacerbate this problem. There is no legacy code using SCM_CGROUP at
all right now, because the option has never been in a released kernel.
So let's get the interface right the first time around.
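
To make the SO_PASSCRED comparison concrete, here is a minimal
user-space sketch. It assumes Linux with glibc; the function name is
made up for illustration. Only the receiving end sets SO_PASSCRED, the
other end does a plain write(2), and the receiver still gets the
writer's pid in an SCM_CREDENTIALS control message.

```c
#define _GNU_SOURCE             /* struct ucred, SCM_CREDENTIALS */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one byte into a socketpair with plain
 * write(2) and return the pid the kernel attached for the receiver,
 * or -1 on failure.
 */
static pid_t pid_attached_to_plain_write(void)
{
    int sv[2];
    int on = 1;
    pid_t result = -1;
    char data;
    union {
        char buf[CMSG_SPACE(sizeof(struct ucred))];
        struct cmsghdr align;   /* forces correct alignment of buf */
    } ctrl;
    struct iovec iov = { .iov_base = &data, .iov_len = 1 };
    struct msghdr msg = {
        .msg_iov = &iov,
        .msg_iovlen = 1,
        .msg_control = ctrl.buf,
        .msg_controllen = sizeof(ctrl.buf),
    };
    struct cmsghdr *c;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;

    /* Only the receiving end opts in. */
    if (setsockopt(sv[0], SOL_SOCKET, SO_PASSCRED, &on, sizeof(on)) != 0)
        goto out;

    /* The writer does nothing special: no cmsg, no stated intent. */
    if (write(sv[1], "x", 1) != 1)
        goto out;

    if (recvmsg(sv[0], &msg, 0) != 1)
        goto out;

    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET &&
            c->cmsg_type == SCM_CREDENTIALS) {
            struct ucred cred;

            memcpy(&cred, CMSG_DATA(c), sizeof(cred));
            result = cred.pid;
        }
    }
out:
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Here both ends live in one process, so the returned pid is simply the
caller's own. In the setuid-stderr scenario above, the writing end
would be held by the privileged program instead.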

If I find some time later today, I can try to write a variant of the
patch that only sends SCM_CGROUP when the sender requests it.

On an unrelated note, what happens when there are multiple cgroup hierarchies?

> Kernel cgroup verification will also not help in this case as sender
> is sending his own cgroup.

Sure it will -- the sender sticks a string into SCM_CGROUP and the
kernel checks it.

>
>>
>> I advocate for the following semantics: if sendmsg is passed a
>> SCM_CGROUP cmsg, and that cmsg has the right cgroup, and the receiver
>> has SO_PASSCGROUP set, then the receiver gets SCM_CGROUP. If you try
>> to lie using SCM_CGROUP, you get -EPERM. If you set SO_PASSCGROUP,
>> but your peer doesn't send SCM_CGROUP, you get nothing.
>>
>> This is immune to both attacks. It should be cheaper, too, since
>> there's no overhead for people who don't use it.
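
The semantics proposed above can be written down as a small decision
function. This is a user-space model only, not kernel code and not the
patch under discussion. The function name and return convention are
hypothetical, and returning -EPERM for a false claim even when the
receiver has not opted in is an assumption the proposal leaves open.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the proposed sendmsg() behaviour.
 *
 * claimed:             cgroup path the sender put in SCM_CGROUP, or
 *                      NULL if the sender attached no SCM_CGROUP cmsg
 * actual:              the sender's real cgroup path, as the kernel
 *                      would know it
 * receiver_passcgroup: whether the receiver set SO_PASSCGROUP
 *
 * Returns 1 if the receiver gets an SCM_CGROUP cmsg, 0 if the message
 * is delivered without one, and -EPERM if the sender lied.
 */
static int model_scm_cgroup(const char *claimed, const char *actual,
                            bool receiver_passcgroup)
{
    assert(actual != NULL);

    /* Sender made no claim: nothing is revealed or asserted. */
    if (claimed == NULL)
        return 0;

    /* Sender made a false claim: reject the send (assumed to apply
     * whether or not the receiver opted in). */
    if (strcmp(claimed, actual) != 0)
        return -EPERM;

    /* Valid claim: pass it on only if the receiver asked for it. */
    return receiver_passcgroup ? 1 : 0;
}
```

The property that matters is the first branch: a plain write(2) carries
no claim, so a process tricked into writing to someone else's socket
never asserts its cgroup.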
>
> I think you seem to be saying that a client's credentials should not be
> visible to receiver until and unless client himself wants to reveal
> those. IOW, it kind of looks like an anonymous mode of operation where
> client connects to a socket but the client does not want to reveal any of
> the information about itself to receiver.
>
> I am not sure how useful that mode really is. If it is really useful, I
> think one could implement another socket option on client side to
> deny passing cgroup information to receiver. Say SO_NOPASSCGROUP.

This won't help -- an attacker will simply not set that option, and
the program being attacked is certainly not going to set
SO_NOPASSCGROUP right before calling write.

>
> Before we even get there, I will question that what's so secret about
> cgroup information that one would like to hide it from receiver. We don't
> hide uid, pid, gid.
>

I think we should hide uid, gid, and pid too, but that ship has sailed.

The bigger issue isn't hiding so much as accidental assertions of
authority. If a program accidentally leaks its uid, that's one thing.
If a program, by accidentally leaking its uid, causes another program
to think that the sender wants some action to be taken on behalf of
its uid, then there's a real security problem.

> Secondly, how would client know when to send SCM_CGROUP to receiver. For
> the use case I mentioned that init wants to log cgroup of every message
> going into journal. How would client know that every message needs to
> have SCM_CGROUP. By automatically getting client information when receiver
> needs it, simplifies the things a lot without any client modification.

I think that the client *should* be modified. What if there's an
existing program that runs as a container's root but does not intend
to sign off on a message with its cgroup?

In any event, I still think that the journald case has no need for any
kernel changes at all. From a very cursory inspection, the journal
code expects to find a socket in /run/systemd/journal/socket. It
should be enough to stick a different socket into that location in
each container. This will work on all kernels and may even work
without modifying any code in the container.
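
A rough sketch of the one-socket-per-container idea, under stated
assumptions: the helper names and the temporary paths are hypothetical,
and this is not how journald is actually structured. The logger binds
one datagram socket per container and learns the origin of a message
from which socket it arrived on, with no cmsg involved. In a real setup
each socket would be made visible inside its container at the usual
path, for example with a bind mount (an assumption on my part).

```c
#define _GNU_SOURCE             /* mkdtemp */
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Bind a datagram socket at `path`; returns the fd or -1. */
static int bind_log_socket(const char *path)
{
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    int fd;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;
    strcpy(addr.sun_path, path);

    fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* What an unmodified client does: one datagram to a well-known path. */
static int send_log_line(const char *path, const char *line)
{
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    ssize_t n;
    int fd;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;
    strcpy(addr.sun_path, path);

    fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    n = sendto(fd, line, strlen(line), 0,
               (struct sockaddr *)&addr, sizeof(addr));
    close(fd);
    return n == (ssize_t)strlen(line) ? 0 : -1;
}

/*
 * Two hypothetical containers, A (index 0) and B (index 1), one socket
 * each. A client in B logs a line; return the index of the socket that
 * became readable, or -1 on failure.
 */
static int which_container_logged(void)
{
    char dir[] = "/tmp/cgsock-XXXXXX";
    char path_a[64], path_b[64];
    struct pollfd pfd[2];
    int result = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path_a, sizeof(path_a), "%s/a.sock", dir);
    snprintf(path_b, sizeof(path_b), "%s/b.sock", dir);

    pfd[0].fd = bind_log_socket(path_a);
    pfd[1].fd = bind_log_socket(path_b);
    pfd[0].events = pfd[1].events = POLLIN;
    pfd[0].revents = pfd[1].revents = 0;

    if (pfd[0].fd >= 0 && pfd[1].fd >= 0 &&
        send_log_line(path_b, "hello from B") == 0 &&
        poll(pfd, 2, 1000) == 1)
        result = (pfd[1].revents & POLLIN) ? 1 : 0;

    if (pfd[0].fd >= 0)
        close(pfd[0].fd);
    if (pfd[1].fd >= 0)
        close(pfd[1].fd);
    unlink(path_a);
    unlink(path_b);
    rmdir(dir);
    return result;
}
```

The client side is an ordinary sendto(2) to a fixed path, which is why
this approach needs no new socket option.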

--Andy