Re: [PATCH 01/13] kdbus: add documentation

From: Eric W. Biederman
Date: Wed Feb 04 2015 - 01:34:13 EST

In reply to: Greg Kroah-Hartman: "Re: [PATCH 01/13] kdbus: add documentation"
Next in thread: Andy Lutomirski: "Re: [PATCH 01/13] kdbus: add documentation"

Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> writes:

> On Tue, Feb 03, 2015 at 08:47:51PM -0600, Eric W. Biederman wrote:
>> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>>
>> > On Tue, Feb 3, 2015 at 2:09 AM, Daniel Mack <daniel@xxxxxxxxxx> wrote:
>> >> Hi Andy,
>> >>
>> >> On 02/02/2015 09:12 PM, Andy Lutomirski wrote:
>> >>> On Feb 2, 2015 1:34 AM, "Daniel Mack" <daniel@xxxxxxxxxx> wrote:
>> >>
>> >>>> That's right, but again - if an application wants to gather this kind of
>> >>>> information about tasks it interacts with, it can do so today by looking
>> >>>> at /proc or similar sources. Desktop machines do exactly that already,
>> >>>> and the kernel code executed in such cases very much resembles that in
>> >>>> metadata.c, and is certainly not cheaper. kdbus just makes such
>> >>>> information more accessible when requested. Which information is
>> >>>> collected is defined by bit-masks on both the sender and the receiver
>> >>>> connection, and most applications will effectively only use a very
>> >>>> limited set by default if they go through one of the more high-level
>> >>>> libraries.
>> >>>
>> >>> I should rephrase a bit. Kdbus doesn't require use of send-time
>> >>> metadata. It does, however, strongly encourage it, and it sounds like
>> >>
>> >> On the kernel level, kdbus just *offers* that, just like sockets offer
>> >> SO_PASSCRED. On the userland level, kdbus helps applications get that
>> >> information race-free, easier and faster than they would otherwise.
>> >>
>> >>> systemd and other major users will use send-time metadata. Once that
>> >>> happens, it's ABI (even if it's purely in userspace), and changing it
>> >>> is asking for security holes to pop up. So you'll be mostly stuck
>> >>> with it.
>> >>
>> >> We know we can't break the ABI. At most, we could deprecate item types
>> >> and introduce new ones, but we want to avoid that by all means of
>> >> course. However, I fail to see how that is related to send time
>> >> metadata, or even to kdbus in general, as all ABIs have to be kept stable.
>> >
>> > I should have said it differently. ABI is the wrong term -- it's more
>> > of a protocol issue.
>> >
>> > It looks like, with the current code, the kernel will provide
>> > (optional) send-time metadata, and the sd-bus library will use it.
>> > The result will be that the communication protocol between clients and
>> > udev, systemd, systemd-logind, g-s-d, etc, will likely involve
>> > send-time metadata. This may end up being a bottleneck.
>>
>> A quick note on a couple of things I have seen in this conversation.
>>
>> - The reason for kdbus is performance.
>
> No, that's not the only reason for kdbus, don't focus only on this. I
> set out a long list of things for why we created kdbus, speed was only
> one of the things. Security is also one, and the ability to gather
> these attributes in an atomic and secure way is very important as
> userspace wants this.

Perhaps I should have said the predominant reason. Certainly that seems
to be most of what I have seen talked about.

Regardless, it is worth looking at performance in the design and
removing any substantial obstacle to making things go fast.

Further, I had this conversation in an earlier round of the review, and
I was told that existing dbus applications do not in fact want or need
these attributes. I think I heard journald wants them for pretty
printing things.

If security is your concern, I really think per-message attributes that
are collected and sent when a message is sent are a bad idea. It has
been a nasty anti-pattern in the kernel code. Copying lots and lots of
metadata from a task and sending it to someone else has significant
performance, maintenance, and security impacts.

Code written in that pattern is complex, hard to analyze, and hard to
think about. Consider debugging why a message does not get the expected
treatment from your suid application because someone changed the euid
over that particular call and had not thought about its consequences.
Frankly, I have been there and done that, and it is a mess.

So no, I do not think breaking encapsulation and having weird side
effects affecting your new primitive will have any security benefits
whatsoever. It will just result in brittle, complex code.

If you want to avoid the races, my constructive suggestion from earlier
was to make a send through a file descriptor fail when the sender does
not have the expected attributes. That is a very different thing from a
performance and maintenance standpoint. It does not increase the code
complexity nearly as much in the implementation or in use, and
unexpected failures happen right away.
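
To make the fail-on-send idea concrete, here is a minimal user-space
sketch in Python. It is only a model and not kdbus code: the names
Creds and Connection are hypothetical, and it assumes "expected
attributes" means the euid/egid captured when the connection was
opened.

```python
import os
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Creds:
    """Hypothetical credential snapshot (only euid/egid for brevity)."""
    euid: int
    egid: int


def current_creds() -> Creds:
    # Read the calling process's effective ids (Unix only).
    return Creds(os.geteuid(), os.getegid())


class Connection:
    """Model of a connection whose credentials are fixed at open time."""

    def __init__(self, get_creds: Callable[[], Creds] = current_creds):
        self._get_creds = get_creds
        self.creds = get_creds()      # captured once, when opened
        self.sent: List[bytes] = []   # stands in for the receiver's queue

    def send(self, payload: bytes) -> None:
        # Nothing is copied per message. The send fails immediately if
        # the sender no longer matches what the receiver was told.
        now = self._get_creds()
        if now != self.creds:
            raise PermissionError(
                f"credentials changed since open: {self.creds} -> {now}"
            )
        self.sent.append(payload)
```

With this shape a suid program that changes its euid around a call gets
an error at the send itself, rather than a message that is silently
treated differently by the receiver.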

>> - pipes rather than unix domain sockets are likely the standard to meet.
>> If you can't equal unix domain sockets for simple things you are
>> likely leaving a lot of stops in. Last I looked pipes in general were
>> notably faster than unix domain sockets.
>>
>> The performance numbers I saw posted up-thread were horrible. I have
>> seen faster numbers across a network of machines. If your ping-pong
>> latency isn't measured in nano-seconds you are probably doing
>> something wrong.
>
> It all depends on what you are passing on that "ping-pong", a real
> D-Bus connection has real data and meta data that has to be sent.
> Trying to make a fake benchmark number isn't going to show anything.

All that I was intending to convey is that the numbers I have seen have
been orders of magnitude slower than I would expect. And 10x to 100x
slower than the code should be is a reason to ask why.
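
For anyone who wants a baseline to compare against, a ping-pong test
over pipes and over a unix domain socketpair can be sketched as below.
This is an illustration under stated assumptions, not the benchmark
that produced the numbers posted in this thread: the function names are
hypothetical, the payload is a single byte, and the Python interpreter
adds its own overhead, so only the relative figures are meaningful. A C
version would be needed for absolute numbers.

```python
import os
import socket
import time


def pingpong_pipe(rounds: int) -> float:
    """Mean round-trip time in seconds over two pipes (Unix only)."""
    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: echo each byte back, then exit without cleanup handlers.
        try:
            os.close(to_child_w)
            os.close(to_parent_r)
            for _ in range(rounds):
                os.write(to_parent_w, os.read(to_child_r, 1))
        finally:
            os._exit(0)
    os.close(to_child_r)
    os.close(to_parent_w)
    start = time.perf_counter()
    for _ in range(rounds):
        os.write(to_child_w, b"x")   # ping
        os.read(to_parent_r, 1)      # wait synchronously for the pong
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    os.close(to_child_w)
    os.close(to_parent_r)
    return elapsed / rounds


def pingpong_socketpair(rounds: int) -> float:
    """Mean round-trip time in seconds over an AF_UNIX socketpair."""
    parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pid = os.fork()
    if pid == 0:
        try:
            parent.close()
            for _ in range(rounds):
                child.sendall(child.recv(1))
        finally:
            os._exit(0)
    child.close()
    start = time.perf_counter()
    for _ in range(rounds):
        parent.sendall(b"x")
        parent.recv(1)
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    parent.close()
    return elapsed / rounds


if __name__ == "__main__":
    n = 10000
    print(f"pipe:       {pingpong_pipe(n) * 1e6:.2f} us per round trip")
    print(f"socketpair: {pingpong_socketpair(n) * 1e6:.2f} us per round trip")
```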

In my experience being efficient with small messages is important
because (a) they are the hardest to make go fast and (b) they are
surprisingly common. Remote X application start-up times are very slow
because of these.

People have a distressing habit of writing applications that send a
small message and synchronously wait for the reply. Over time these
small IPC calls build up and you are limited by how fast they will go.

>> - syscalls remove overhead. So since performance is kdbus's reason for existence
>> let's remove some ridiculous stops, and get a fast path into the kernel.
>
> Again, not the only reason, see my first post in this thread for
> details.

But performance is important, and performance is a good reason to use
system calls.

Security is another reason to have real system calls, as there is less
going on (compared to an ioctl multiplexer) so the code is easier to
audit.

Eric
