Re: [GIT PULL] kdbus for 4.1-rc1

From: Greg Kroah-Hartman
Date: Tue Apr 14 2015 - 13:50:34 EST

On Mon, Apr 13, 2015 at 02:01:21PM -0700, Andy Lutomirski wrote:
> On Mon, Apr 13, 2015 at 1:45 PM, Greg Kroah-Hartman
> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > On Mon, Apr 13, 2015 at 01:13:26PM -0700, Andy Lutomirski wrote:
> >> On Mon, Apr 13, 2015 at 12:03 PM, Greg Kroah-Hartman
> >> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >> > The following changes since commit 9eccca0843205f87c00404b663188b88eb248051:
> >> >
> >> > Linux 4.0-rc3 (2015-03-08 16:09:09 -0700)
> >> >
> >> > are available in the git repository at:
> >> >
> >> > git:// tags/kdbus-4.1-rc1
> >> >
> >> > for you to fetch changes up to 9fb9cd0f4434a23487b6ef3237e733afae90e336:
> >> >
> >> > kdbus: avoid the use of struct timespec (2015-04-10 14:34:53 +0200)
> >> >
> >> > ----------------------------------------------------------------
> >> > kdbus for 4.1-rc1
> >> >
> >> > Here's the kdbus pull request for 4.1-rc1.
> >> >
> >> > It's been under development for many years now, and been in linux-next
> >> > for many months, and has undergone loads of testing a review and even a few
> >> > good arguments. It comes with full documentation and tests.
> >> >
> >> > There has been a few complaints about the code, notably from people who
> >> > don't like the use of metadata in the bus messages. That is actually
> >> > one of the main features here, as we can get this data in a secure and
> >> > reliable way, and it's something that userspace requires today. So
> >> > while it does look "odd" to people who are not familiar with dbus, this
> >> > is something that finally fixes a number of almost unfixable races in
> >> > the current dbus implementations.
> >>
> >> While I generally like the concept of having a better in-kernel IPC
> >> mechanism, after some consideration I don't think this belongs in the
> >> kernel in its current form. Here's why.
> >>
> >> First, the naming is counterintuitive. There are "endpoints", but you
> >> don't send messages to endpoints. In fact, an basic kdbus setup will
> >> have exactly one endpoint AFAICT. Wtf? This makes talking about it
> >> awkward.
> >
> > Did you read the documentation? We've been over this before, and it
> > should all be addressed in the documentation based on this coming up.
> >
> >> A lot of the design seems to be to violate the concept of "mechanism,
> >> not policy". Kdbus is very much a port of userspace dbus to the
> >> kernel, and it appears to be a port designed to preserve some
> >> questionable design decisions instead of learning from them.
> >>
> >> For example, kdbus sticks a whole policy database in the kernel, but
> >> that policy database (AFAICT -- holy crap it's overcomplicated) is
> >> *not* a simple set of rules like "if A then allow B". Instead it has
> >> really weird dependencies not on what name you're sending to but on
> >> what *other* names the thing you're sending to has. Sorry, but this
> >> way lies (a) the inability for a large set of developers to understand
> >> what's going on and (b) security bugs. Also, the result probably
> >> can't be reused as part of a non-legacy-filled sensible design
> >
> > What policy database? Matching messages to subscribers? That's the
> > same type of "database" that other ipc subsystems need/want, there's
> > nothing radical here.
> Let me quote from the latest version of the kdbus docs:
> Note that TALK access is checked against all names of a connection. For
> example, if a connection owns both <constant>''</constant> and
> <constant>'org.blah.baz'</constant>, and the policy database allows
> <constant>'org.blah.baz'</constant> to be talked to by WORLD, then this
> permission is also granted to <constant>''</constant>. That
> might sound illogical, but after all, we allow messages to be directed to
> either the ID or a well-known name, and policy is applied to the
> connection, not the name. In other words, the effective TALK policy for a
> connection is the most permissive of all names the connection owns.
> In my humble opinion, this paragraph speaks for itself. The design is
> bad, full stop.

First off, thanks for reading the docs, I appreciate that. But realize
also, that this is straight from the D-Bus spec. We aren't doing
anything "radical" here, this is what your desktop uses that you are
typing your email from.

Yes, it's an unfortunate design, but one that we are all stuck with
(think of it as having to implement code for horrid hardware that you
have to get to work properly.) There are many applications out there
which don't address messages to their well-known name destination but
to the ID which they looked up earlier and cached. In fact, that
behavior is the default in the gdbus library implementation.

If a connection owns two names, and one is more permissive than the 
other one, an attacker could as well choose the more openly configured 
name to get a message delivered. That's nothing we can protect from 
really. So ideally you never do that, just like you shouldn't do that
in an network configuration with DNS, if you want to manage access

The logic here is comparable to IP vs. DNS
- A host may have multiple DNS names assigned, just like a service may
be the owner of multiple well-known names
- Clients can talk to a service using its unique ID (uint64_t) or its
well known name.
- Clients can as well look up the ID of a well-known name and address
messages to it directly
- Hence, we cannot make decisions based on the well-known name that has
been used to send the message
- Instead, we have to fall back to the logic described in the docs
- Firewall rules are applied to IPs, _not_ DNS names!

D-Bus is a specification that has been out there for over a decade, and
we are not designing anything new here, but rather implementing it as
designed. We have to be compatible to the existing users of the DBus
system, and don't have the luxury of being able to change core things
like this and expect the world to be able to change just because the
design is not as clean as it should/could be.

Again, just like getting horrid hardware to work properly, sometimes we
have to write odd code. Or having to implement a network protocol that
doesn't seem to be designed "perfectly", yet is used by a few hundred
million systems so we have to remain compatible. This is all that we
are doing here for stuff like this.

Remember, this is called kDBUS, not kGENERICIPC, no matter how much we
would have liked that to happen from a kernel standpoint. :)

> > And the benchmarks and source were posted by David previously, with full
> > details, this is the first time I've heard you could not reproduce them
> > using that code.
> No it's not. But I got bored and didn't try again.

Sorry, I was not aware of that.

> >> The metadata thing is problematic. It seems to be intended to serve
> >> two purposes: data gathering for logging and authentication.

You forgot about introspection, more on that below.

> >> Unfortunately, it has issues. There are no fewer than *three*
> >> metadata capture points: creation of a bus, connection to a bus, and
> >> sending of a message. The kdbus authors like to point out that these
> >> are all optional, but IMO that's bunk. Someone will write a userspace
> >> library that rejects messages from people who don't enable all of
> >> them, then then we're screwed.
> >
> > Remember, you asked for it to be optional, it wasn't in the beginning :)
> >
> > So let's make it not optional, great. And the capture points are in
> > different places as it is different data and entry points.
> Then I'll have to find a way to embolden my NACK further. My point is
> that capturing garbage like cmdline and capabilities (again, that
> latter part is completely unacceptable under any circumstances
> whatsoever) on behalf of *all* senders is a disaster. If it's
> optional, then I can at least hope that userspace will honor the
> optionality and let everything turn it off. If it's mandatory, then
> kdbus is just unsafe to use to send messages to untrusted parties.

It's opted in by the receiving peer if the task implementing a service
wants to access these pieces of information. It is optional, and the
documentation clearly states that userspace should cope with this, and
also, when they are available we make sure to provide the correct
race-free information.

As said many times before, an application can do so already today with
information from other API file systems, so why is this suddenly a
problem when kdbus optionally offers the exact same information along
with each transmitted message? Yes, we all "hate" capabilities, but
userspace uses them, and gets access to them all the time through the
POSIX apis (capget(), cap_get_pid(), capgetp(), etc.) and through
/proc/pid/status. They are something that we have to support and handle

In the very first submission of kdbus, we stated that we want to allow
userspace methods to access these same bits to be able to make decisions
about permissions. And to do so in a race-free manner, which is very
hard, if not almost impossible, to do so from userspace alone.

For instance, if a task has CAP_NET_ADMIN set, we can use that
information in order to allow or disallow certain actions to be taken by
a privileged process. Or, if a client that has the capability to call
reboot (i.e. have CAP_SYS_REBOOT) makes the D-Bus call to reboot the
system, the system daemon listening for that message knows that yes, at
the time that the client made that call, it really did have that
capability so it is ok to actually reboot the system.

Instead of trying to use SCM_CREDENTIALS to get the pid and another
round of cap_get_pid() and the like, all of which are susceptable to
racing and all sorts of other horrors, that are insecure, we can provide
this information in an atomic, and secure way.

The kernel today, and userspace, relies on capabilities all the time
(i.e. almost every syscall), how are they something that is somehow not
valid to use and support?

And of course, as Eric will point out, capabailities are not
translatable across user namespaces, which is a problem. Because of
this, we dispose of that piece of metadata information when a message
crosses a user namespace boundry. This is the right thing to do, which
is not the case for almost all other kernel apis which report bogus
capabilies when user namespaces are crossed.

So we implemented this correctly, and somehow that is a feature so bad
that both you and Eric think the whole baby should be thrown out? How
else should this be implemented?

As for the command line information, yes, it is "unsafe", and we clearly
state taht in the documentation. However, it is still a very valid
piece of information. For example, when a service is activated by a
method, getting to know which binary caused that to happen is very
usefull when debugging. It's also very useful when debugging multi-call
binaries because the command line actually tells you argv[0] correctly.

Because of this, that's why lots of userspace tools use the command line
information today, again, providing that information is a help to them,
why wouldn't we provide them that help when we have access to it?

Metadata attachment has always been optional, based on the setting of
the receiving peer, but we have added, at your request, the ability to
globally limit what kdbus is able to transport for that metadata,
regardless of the settings on both sides. It sounds like this option
isn't liked, and I'll be glad to revert it as I do think the metadata is
useful and wanted.

> >> Why are we screwed? Because any kdbus client *won't know which
> >> metadata matters*. That means that we automatically have the worst of
> >> all worlds, not the best. Also, the bus creation metadata is
> >> completely worthless for anything other than logging, but someone will
> >> use it for something other than logging, at which point it's
> >> vulnerable to a DoS. No one has explained to my satisfaction why this
> >> isn't a problem.

Metadata is gathered for logging, authentication and introspection. Bus
creator metadata is not used for logging or authentication, but for
introspection only. It could be really useful for a service that has a
bus handle to actually know which bus it is connected to, but it's not
supposed to be used as authentication measure. So I was wrong to think
that the SELinux people use it, sorry about that.

Remember there are three different places that metadata is collected,
for three different things. Yeah, we call them all "metadata", which is
probably why the confusion here, but these all are different "things"
entirely, and the documentation does describe this really well. If not,
please let me know and I will work on it to make it more clear.

The important point here is that you cannot look at this concept without
keeping the dbus spec in scope. Nobody is supposed to write native kdbus
clients directly. you can, of course, but the entire concept of how
services are implemented follows a higer-level logic which is supposed
to be implemented by high-level libraries.

Yes, this isn't the best argument for why you might feel more
comforatable about merging this code, as us kernel developers are used
to stand-alone apis that they can use without library helpers, but it is
common and needed. But really, when was the last time you wrote an ALSA
library from scratch? :)

Again, remember the compatibility requirements for your userspace D-Bus
clients today, we have to ensure this, or this code is pointless.

A word about introspection. In talking about this with Daniel on IRC
today, he came up with this good example to explain it better to me, as
I didn't quite understand it well. I'll paraphrase it here, keeping
with the "bus" metaphor that D-Bus requires:

Imagine we're all taking a little tour, out to the nature, a
lake or something. We're taking a bus to get there. The bus can
accommodate a large number of people, and we don't know yet who
will join. Everybody who enters the bus has to show their
passport to the conductor (refrain from calling it driver,
because hell no, it clearly isn't a driver!! ;)).

The conductor makes a copy of each of the people entering the
bus, because it wants to know who's on the bus. One property of
that strange bunch of programmers on the bus is that they don't
necessarily respond to anything, but whenever anyone in the bus
talks to another person, they show their passport in order to
identify themselves, because you know you can't trust anyone.

Next, the police stops the bus and wants to know who's on it.
As the programmers usually don't respond when being spoken to,
especially if it is the police, the bus conductor hands out a
list of all the passport copies he gathered. That is called
introspection that is not backed by cooperative bus members.
The conductor makes a copy of each OF THE PASSPORT of the people
entering the bus, to help the police (i.e. debuggers) determine
who is on the bus.

It is a property of the bus itself which describes which
personal data you have to give to the conductor in order to be
allowed in. If you're not willing to give out all the bits the
bus requires you to, you have to stay out. That's not a problem
of the system, but rather something to discuss with the owner of
the bus. This way, it is totally possible to have a bus that
does not require anything from its passengers, and passengers
that do not allow any personal information to be revealed, but
then the police can't do much of course when it stops the bus in
order to introspect it.

Then there is a set of global laws in that world in which all
the busses live. These laws define which data is allowed to be
passed around at all in general. When a bus requires its
passengers to reveal their hair color, for instance, but passing
that information around is forbidden by global law. This
requirement is ignored when buses are created or anyone enters
any of those buses.

And to complete the story and outline the differences of the
passports that were used to make a copy from and the one that is
used during communication, we'd have to a add story about people
changing their hair color constantly in the washroom on the back
of the bus, out of sight of the conductor, but this metaphor is
getting quite long enough already...

Does that help explain introspection and the need for it here?

> >> Also, the metadata code captures things that are, in my book
> >> completely unacceptable, such as cmdline and (!) capabilities. I bet
> >> that the cmdline capture is extra special fscked up when cgroups and
> >> such are in play because *it reads from the sender's VM*. IOW it's
> >> insecure and pointless. (OK, it has a point: logging. But I really
> >> don't think that belongs in the kernel.)
> >
> > The sender's vm is what is wanted here. And cmdline is something that
> > userspace gets today, and does things with, as does SELinux, and
> > auditing. Same for capabilities, it's not insecure and pointless, it's
> > the same thing that is provided to userspace, and userspace makes
> > decisions on today, independent of kdbus/dbus.
> Is there anything that userspace makes decisions on based on
> capabilities? If so, please tell me and I'll entertain myself by
> writing exploits for them.
> The fact that some existing userspace does awful things does *not*
> justify adding new kernel mechanisms with which to repeat those
> mistakes.

polkit used to do something like this, but the obvious race conditions
that you know about prevented it from working properly, so other odd
work-arounds had to be created. However, if we can provide this in a
race free manner, those work-arounds are no longer needed.

As documented in the original email on this thread, Tizen wants to use
this, as it solves a real need that they have. Their workarounds
involve using custom UDS sockets, but the latency involved is horrid and
unacceptable. Using a kdbus message solves this issue for them,
allowing UI rendering to work properly/quickly.

Again, capabilities are something we all require and rely on today,
passing the current capability on to a recipient isn't a way to raise
privileges at all, but rather, properly determine if they are present
at sending time, if wanted. How does that create an insecure system?
What am I missing that is so bad here with the design we have?


greg k-h
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at