Re: [GIT PULL] kdbus for 4.1-rc1
From: Andy Lutomirski
Date: Tue Apr 14 2015 - 14:57:55 EST
On Tue, Apr 14, 2015 at 10:50 AM, Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Apr 13, 2015 at 02:01:21PM -0700, Andy Lutomirski wrote:
>> On Mon, Apr 13, 2015 at 1:45 PM, Greg Kroah-Hartman
>> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>> > On Mon, Apr 13, 2015 at 01:13:26PM -0700, Andy Lutomirski wrote:
>> >> On Mon, Apr 13, 2015 at 12:03 PM, Greg Kroah-Hartman
>> >> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>> >> > The following changes since commit 9eccca0843205f87c00404b663188b88eb248051:
>> >> >
>> >> > Linux 4.0-rc3 (2015-03-08 16:09:09 -0700)
>> >> >
>> >> > are available in the git repository at:
>> >> >
>> >> > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git/ tags/kdbus-4.1-rc1
>> >> >
>> >> > for you to fetch changes up to 9fb9cd0f4434a23487b6ef3237e733afae90e336:
>> >> >
>> >> > kdbus: avoid the use of struct timespec (2015-04-10 14:34:53 +0200)
>> >> >
>> >> > ----------------------------------------------------------------
>> >> > kdbus for 4.1-rc1
>> >> >
>> >> > Here's the kdbus pull request for 4.1-rc1.
>> >> >
>> >> > It's been under development for many years now, and been in linux-next
>> >> > for many months, and has undergone loads of testing a review and even a few
>> >> > good arguments. It comes with full documentation and tests.
>> >> >
>> >> > There has been a few complaints about the code, notably from people who
>> >> > don't like the use of metadata in the bus messages. That is actually
>> >> > one of the main features here, as we can get this data in a secure and
>> >> > reliable way, and it's something that userspace requires today. So
>> >> > while it does look "odd" to people who are not familiar with dbus, this
>> >> > is something that finally fixes a number of almost unfixable races in
>> >> > the current dbus implementations.
>> >>
>> >> While I generally like the concept of having a better in-kernel IPC
>> >> mechanism, after some consideration I don't think this belongs in the
>> >> kernel in its current form. Here's why.
>> >>
>> >> First, the naming is counterintuitive. There are "endpoints", but you
>> >> don't send messages to endpoints. In fact, an basic kdbus setup will
>> >> have exactly one endpoint AFAICT. Wtf? This makes talking about it
>> >> awkward.
>> >
>> > Did you read the documentation? We've been over this before, and it
>> > should all be addressed in the documentation based on this coming up.
>> >
>> >> A lot of the design seems to be to violate the concept of "mechanism,
>> >> not policy". Kdbus is very much a port of userspace dbus to the
>> >> kernel, and it appears to be a port designed to preserve some
>> >> questionable design decisions instead of learning from them.
>> >>
>> >> For example, kdbus sticks a whole policy database in the kernel, but
>> >> that policy database (AFAICT -- holy crap it's overcomplicated) is
>> >> *not* a simple set of rules like "if A then allow B". Instead it has
>> >> really weird dependencies not on what name you're sending to but on
>> >> what *other* names the thing you're sending to has. Sorry, but this
>> >> way lies (a) the inability for a large set of developers to understand
>> >> what's going on and (b) security bugs. Also, the result probably
>> >> can't be reused as part of a non-legacy-filled sensible design
>> >
>> > What policy database? Matching messages to subscribers? That's the
>> > same type of "database" that other ipc subsystems need/want, there's
>> > nothing radical here.
>>
>> Let me quote from the latest version of the kdbus docs:
>>
>> Note that TALK access is checked against all names of a connection. For
>> example, if a connection owns both <constant>'org.foo.bar'</constant> and
>> <constant>'org.blah.baz'</constant>, and the policy database allows
>> <constant>'org.blah.baz'</constant> to be talked to by WORLD, then this
>> permission is also granted to <constant>'org.foo.bar'</constant>. That
>> might sound illogical, but after all, we allow messages to be directed to
>> either the ID or a well-known name, and policy is applied to the
>> connection, not the name. In other words, the effective TALK policy for a
>> connection is the most permissive of all names the connection owns.
>>
>> In my humble opinion, this paragraph speaks for itself. The design is
>> bad, full stop.
>
> First off, thanks for reading the docs, I appreciate that. But realize
> also, that this is straight from the D-Bus spec. We aren't doing
> anything "radical" here, this is what your desktop uses that you are
> typing your email from.
>
> Yes, it's an unfortunate design, but one that we are all stuck with
> (think of it as having to implement code for horrid hardware that you
> have to get to work properly.)
I agree. You've sent a pull request for an unfortunate design. I
don't think that unfortunate design belongs in the kernel. If it says
in userspace, then user programmers could potentially fix it some day.
> There are many applications out there
> which don't address messages to their well-known name destination but
> to the ID which they looked up earlier and cached. In fact, that
> behavior is the default in the gdbus library implementation.
>
> If a connection owns two names, and one is more permissive than the
> other one, an attacker could as well choose the more openly configured
> name to get a message delivered. That's nothing we can protect from
> really. So ideally you never do that, just like you shouldn't do that
> in an network configuration with DNS, if you want to manage access
> properly.
>
> The logic here is comparable to IP vs. DNS
[snip some]
It's comparable to someone trying to write a firewall that filters on
DNS names. There's a good reason that people don't do that.
[snip]
>>
>> Then I'll have to find a way to embolden my NACK further. My point is
>> that capturing garbage like cmdline and capabilities (again, that
>> latter part is completely unacceptable under any circumstances
>> whatsoever) on behalf of *all* senders is a disaster. If it's
>> optional, then I can at least hope that userspace will honor the
>> optionality and let everything turn it off. If it's mandatory, then
>> kdbus is just unsafe to use to send messages to untrusted parties.
>
> It's opted in by the receiving peer if the task implementing a service
> wants to access these pieces of information. It is optional, and the
> documentation clearly states that userspace should cope with this, and
> also, when they are available we make sure to provide the correct
> race-free information.
>
> As said many times before, an application can do so already today with
> information from other API file systems, so why is this suddenly a
> problem when kdbus optionally offers the exact same information along
> with each transmitted message? Yes, we all "hate" capabilities, but
> userspace uses them, and gets access to them all the time through the
> POSIX apis (capget(), cap_get_pid(), capgetp(), etc.) and through
> /proc/pid/status. They are something that we have to support and handle
> properly.
>
> In the very first submission of kdbus, we stated that we want to allow
> userspace methods to access these same bits to be able to make decisions
> about permissions. And to do so in a race-free manner, which is very
> hard, if not almost impossible, to do so from userspace alone.
>
> For instance, if a task has CAP_NET_ADMIN set, we can use that
> information in order to allow or disallow certain actions to be taken by
> a privileged process. Or, if a client that has the capability to call
> reboot (i.e. have CAP_SYS_REBOOT) makes the D-Bus call to reboot the
> system, the system daemon listening for that message knows that yes, at
> the time that the client made that call, it really did have that
> capability so it is ok to actually reboot the system.
>
> Instead of trying to use SCM_CREDENTIALS to get the pid and another
> round of cap_get_pid() and the like, all of which are susceptable to
> racing and all sorts of other horrors, that are insecure, we can provide
> this information in an atomic, and secure way.
/me suppresses a long string of expletives.
Please point me at the code that does this with caps. It's WRONG in
userspace and it's WRONG in the kernel. I want to know what code that
runs on my system does this so I can send the appropriate bug reports
and get it fixed. I think the RHEL crowd at least will take it
seriously when I tell them that this is a security hole.
>
> The kernel today, and userspace, relies on capabilities all the time
> (i.e. almost every syscall), how are they something that is somehow not
> valid to use and support?
No. The *kernel* relies on caps. Userspace should not.
>
>
> And of course, as Eric will point out, capabailities are not
> translatable across user namespaces, which is a problem. Because of
> this, we dispose of that piece of metadata information when a message
> crosses a user namespace boundry. This is the right thing to do, which
> is not the case for almost all other kernel apis which report bogus
> capabilies when user namespaces are crossed.
The right thing to do is to not use capabilities for userspace stuff.
>
> So we implemented this correctly, and somehow that is a feature so bad
> that both you and Eric think the whole baby should be thrown out? How
> else should this be implemented?
It shouldn't be implemented.
>
> As documented in the original email on this thread, Tizen wants to use
> this, as it solves a real need that they have. Their workarounds
> involve using custom UDS sockets, but the latency involved is horrid and
> unacceptable. Using a kdbus message solves this issue for them,
> allowing UI rendering to work properly/quickly.
>
> Again, capabilities are something we all require and rely on today,
> passing the current capability on to a recipient isn't a way to raise
> privileges at all, but rather, properly determine if they are present
> at sending time, if wanted. How does that create an insecure system?
> What am I missing that is so bad here with the design we have?
That, even if the implementation could be made to be useful and
correct, capabilities refer to privileges wrt the kernel, not
userspace. They're not the right bit of policy to look at here.
For example, the thing that should make it possible to run 'systemctl
reboot' or whatever is not CAP_SYS_BOOT, because CAP_SYS_BOOT is the
permission to hard reboot the system immediately, and that's not what
'systemctl reboot' is for.
I find myself comparing kdbus to win32k, and that's not a good sign...
--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/