Re: [PATCH 00/11] RFC: KBUS messaging subsystem

From: James Chapman
Date: Thu Mar 24 2011 - 14:12:42 EST


On 23/03/2011 23:13, Tony Ibbs wrote:
>
> On 22 Mar 2011, at 19:36, Jonathan Corbet wrote:
>
>> On Fri, 18 Mar 2011 17:21:09 +0000
>> Tony Ibbs <tibs@xxxxxxxxxxxxxx> wrote:
>>
>>> KBUS is a lightweight, Linux kernel mediated messaging system,
>>> particularly intended for use in embedded environments.
>
>> - Why kbus over, say, a user-space daemon and unix-domain sockets? I'm
>> not sure I see the advantage that comes with putting this into kernel
>> space.
>
> Mostly, a kernel module gives us reliability.
>
> In particular, a kernel module allows us to guarantee that a replier
> that "goes away" (including crashing) will be detected by KBUS, and
> cause a synthetic reply to be sent, so that the sender can know that it
> will not get a real reply.
>
> This same guarantee means that the sender end of a stateful dialogue can
> be reliably told if the replier end disconnects and (some new version of
> it) reconnects - in which case state presumably needs to be
> reestablished.
>
> Doing this in userspace would be difficult and unreliable.
>
> There are other problems with userspace daemons, including setting up
> many-to-many messaging, message atomicity, and so on. Our past
> experience of other people's solutions (previous customers in
> particular) is that it is perilously easy to get it wrong in userspace,
> and especially to end up with race conditions.

I don't understand what Kbus really brings either.

With good sockets programming, it is possible to avoid most of the
issues mentioned above. Frameworks like Glib and DBus can also help.

Have you considered other kernel messaging mechanisms such as netlink
sockets, connectors, or POSIX message queues, if you don't want DBus?

>
>> - The interface is ... creative.
>
> That's very tactfully put.
>
>> If you have to do this in kernel space,
>> it would be nice to do away with the split write()/ioctl() API for
>> reading or writing messages. It seems like either a write(), OR an
>> ioctl() with a message data pointer would suffice; that would cut the
>> number of syscalls the applications need to make too.
>
> When the reader is reading a message, using 'read' seems very natural,
> and is simple to explain. Because we always return an "entire" message
> (i.e., one in which all the message data is in one chunk, rather than
> a header pointing to message name and/or data), it also means that
> memory handling on return to user space is much simplified. Doing an
> ioctl first to find out the length of the message to come is also
> simple to explain.

Eh? Network protocols routinely do this sort of thing with regular
sockets. Read the message header, then read the rest once you know how
big the rest is. Or, where you know the max size of all possible
messages, do one read into a fixed-size buffer.
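
A minimal sketch of the header-then-body pattern over a Unix stream
socket. It assumes a hypothetical header that carries only the body
length; msg_hdr, MAX_BODY, read_full, send_msg, recv_msg and
demo_roundtrip are names invented for this illustration, not part of
KBUS or of any library:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wire header for this sketch: just the body length. */
struct msg_hdr {
    uint32_t len;
};

/* Assumed upper bound, so a bad header cannot force a huge allocation. */
#define MAX_BODY (1024u * 1024u)

/* Read exactly n bytes, looping over short reads. 0 on success, -1 on EOF or error. */
int read_full(int fd, void *buf, size_t n)
{
    char *p = buf;

    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r < 0 && errno == EINTR)
            continue;
        if (r <= 0)
            return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Write the header, then the body. 0 on success, -1 on error. */
int send_msg(int fd, const void *data, uint32_t len)
{
    struct msg_hdr h = { .len = len };

    if (write(fd, &h, sizeof(h)) != (ssize_t)sizeof(h))
        return -1;
    if (len > 0 && write(fd, data, len) != (ssize_t)len)
        return -1;
    return 0;
}

/* Read the header first, then a body of exactly the advertised size.
 * On success the caller owns *body (NUL-terminated for convenience). */
int recv_msg(int fd, char **body, uint32_t *len)
{
    struct msg_hdr h;
    char *b;

    assert(body != NULL && len != NULL);
    if (read_full(fd, &h, sizeof(h)) < 0 || h.len > MAX_BODY)
        return -1;
    b = malloc((size_t)h.len + 1);
    if (b == NULL)
        return -1;
    if (read_full(fd, b, h.len) < 0) {
        free(b);
        return -1;
    }
    b[h.len] = '\0';
    *body = b;
    *len = h.len;
    return 0;
}

/* Round trip over a socketpair; returns 0 if the message survives intact. */
int demo_roundtrip(void)
{
    int sv[2];
    char *body = NULL;
    uint32_t len = 0;
    int rc;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    rc = send_msg(sv[0], "hello", 5);
    if (rc == 0)
        rc = recv_msg(sv[1], &body, &len);
    if (rc == 0 && (len != 5 || strcmp(body, "hello") != 0))
        rc = -1;
    free(body);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

The receiver sizes its buffer from the header, so no separate "how long
is the next message" call is needed in this arrangement.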

> Also, in the case of reading a message, I can see clear advantage
> in being able to "stream" the reading of the message data (for a
> long and appropriately structured message).
>
> Writing a message *could* be done with 'write' alone. I must admit that
> having 'write' detect the end of the message by looking at it feels
> wrong, somehow, but that's not a very compelling answer. It is,
> however, definitely easier for the user to understand the error if
> they try to <send> and get told they haven't written enough data
> yet, rather than just waiting for the 'write' to magically complete.

A write to a socket would do the same. I don't get the bit about
detecting the end of the message. I think this complexity is coming from
using char devices for message passing.
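
To illustrate the point about message ends: on Linux, an AF_UNIX socket
of type SOCK_SEQPACKET preserves record boundaries, so each send() is
one message and each recv() returns one message. A minimal sketch
(demo_boundaries is an invented name, and the payloads are arbitrary):

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send two records back to back, then check that each recv() returns
 * exactly one of them, even though the buffer could hold both.
 * Returns 0 if the boundaries were preserved, -1 otherwise.
 */
int demo_boundaries(void)
{
    int sv[2];
    char buf[64];
    int rc = -1;

    if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
        return -1;

    if (send(sv[0], "ab", 2, 0) == 2 && send(sv[0], "cde", 3, 0) == 3) {
        /* First recv(): only the first record, not "abcde". */
        ssize_t n1 = recv(sv[1], buf, sizeof(buf), 0);
        int first_ok = (n1 == 2 && memcmp(buf, "ab", 2) == 0);

        /* Second recv(): the second record on its own. */
        ssize_t n2 = recv(sv[1], buf, sizeof(buf), 0);
        int second_ok = (n2 == 3 && memcmp(buf, "cde", 3) == 0);

        if (first_ok && second_ok)
            rc = 0;
    }

    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

With this socket type the sender never has to mark the end of a message
in the data itself; the boundary is the send() call.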

> There is also a certain symmetry to using <nextmsg>/'read' and
> 'write'/<send>, but as you said at the start, it's a bit unusual.
>
> Using an ioctl instead of 'write' would involve a more complex ioctl
> than we're otherwise commonly using, would lose the symmetry, and just
> didn't feel right. It also means pointer handling for even the simplest
> message.
>
>> Even better might be to just use the socket API.

Agreed.

> Whilst the current API is a bit odd, trying to use the socket API looked
> to us as if it would be a worse fit.
>
> The socket API doesn't seem to match what we wanted KBUS to do
> particularly well. It's not, for instance, obvious how to do a 'recv' of
> a variable length message that might be quite short or several hundred
> KB long - does one 'recv' the header first, and then the body (which
> isn't very nice)? Doing a 'next message' ioctl as current KBUS does
> would feel really alien in a socket environment.
>
> Of course, we'd still have to invent our own addressing scheme, and our
> own ``struct *addr``, and appropriate socket options, and also decide
> how the common options should apply or not (for instance, SO_ACCEPTCONN,
> SO_BROADCAST). And how to work with accept/listen/bind and all the other
> common calls.
>
> Also, lazily on my part, it's fairly obvious how to write a file
> interface for the kernel, but the socket API (from the inside) appears
> to be more complex, and to have fewer examples with training wheels.
>
> We *could* reimplement in terms of sockets, but I think the code would
> get a lot bigger, and I think using the system would be a lot harder to
> explain (I don't think the current message name binding mechanisms would
> get any clearer, for instance).

Why would you need a new socket family?

> And some of the semantics of KBUS (the sending of a message to say that
> the expected replier has been replaced by a new one, for instance) seem
> to fit oddly with how people expect sockets to work. Or being told that
> the far end has gone away, or is not who one expected it to be.

Not really. It is what DBus is all about.

Perhaps KBUS is intended for uses where DBus is too big? Or is it to
help port legacy RTOS apps to Linux? Shudder. :-)

Perhaps I misunderstand what KBUS might do for me. It might be useful to
present two simple apps implementing the same thing with, say, a unix
socket and KBUS, e.g. sending a message reliably to another process and
handling possible errors.
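
As a starting point for the unix socket half of such a comparison, here
is a minimal sketch of how a sender can notice that its peer has gone
away on a connected AF_UNIX stream socket. It assumes Linux
(MSG_NOSIGNAL), and the function names are invented for illustration.
The kernel closes a process's descriptors when it exits or crashes, so
the same indications appear in that case. It does not attempt the
"replier was replaced by a new one" notification described above.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the peer has closed its end, 0 if data arrived, -1 on error. */
int peer_closed_on_recv(int fd)
{
    char buf[64];
    ssize_t n = recv(fd, buf, sizeof(buf), 0);

    if (n == 0)
        return 1;           /* orderly EOF: the far end is gone */
    return n > 0 ? 0 : -1;
}

/* Returns 1 if the send fails with EPIPE, 0 if it succeeds, -1 otherwise.
 * MSG_NOSIGNAL turns the SIGPIPE into a plain error return. */
int peer_closed_on_send(int fd)
{
    if (send(fd, "ping", 4, MSG_NOSIGNAL) == 4)
        return 0;
    return errno == EPIPE ? 1 : -1;
}

/* Simulate a replier going away and check both indications. */
int demo_peer_gone(void)
{
    int sv[2];
    int r, s;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    close(sv[1]);           /* the "replier" end disappears */
    r = peer_closed_on_recv(sv[0]);
    s = peer_closed_on_send(sv[0]);
    close(sv[0]);

    return (r == 1 && s == 1) ? 0 : -1;
}
```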

> Also, I'm afraid my experience is that people find sockets hard to
> understand (not necessarily justifiably), whereas explaining KBUS to its
> intended users is fairly simple - one can assume they know about file
> interfaces, and people fairly easily accept a few "odd" extra calls. But
> that may not be a very compelling reason from the inside of the
> kernel...
>
>> - Does anything bound the size of a message fed into the kernel with
>> write()? I couldn't find it. It seems like an application could
>> consume arbitrary amounts of kernel memory.
>
> That is indeed a misfeature. There should be a default limit, and some
> way of changing it.
>
>> - It would be good to use the kernel's dynamic debugging and tracing
>> facilities rather than rolling your own.
>
> Mea culpa. KBUS's debug support grew rather erratically, and only
> recently got converted to at least using dev_dbg and friends.
> Also, I'm not at all sure what the current kernel mechanisms are
> (pointers are welcomed, since this is a clear case where normal
> kernel conventions should be followed, and I don't know what they are).
>
>> - There's lots of kmalloc()/memset() pairs that could be kzalloc().
>
> And I just missed that.
>
>> That's as far as I could get for now.
>
> Thanks, it's all appreciated, and all makes sense.
>
> (and I should say thank you since I started out writing KBUS with a copy
> of Linux Device Drivers beside me, and bookmarks for various LWN
> articles. It would all be a lot worse without those).
>
> Hope this all makes sense - it's late here but I shan't have a chance to
> reply tomorrow.
>
> Tibs


--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development
