Re: [GIT PULL] kdbus for 4.1-rc1

From: Havoc Pennington
Date: Thu Apr 16 2015 - 15:02:15 EST


On Thu, Apr 16, 2015 at 9:13 AM, Tom Gundersen <teg@xxxxxxx> wrote:
> All types of messages (unicast and broadcast) are directly stored into
> a pool slice of the receiving connection, and this slice is not reused
> by the kernel until userspace is finished with it and frees it. Hence,
> a client which doesn't process its incoming messages will, at some
> point, run out of pool space. If that happens for unicast messages,
> the sender will get an EXFULL error. If it happens for a multicast
> message, all we can do is drop the message, and tell the receiver how
> many messages have been lost when it issues KDBUS_CMD_RECV the next
> time. There's more on that in kdbus.message(7).
>

Have you guys already grappled with what libraries/apps should do with
this information?

To handle the knowledge that "N messages have been lost," it seems
like the client must answer "are there any messages that, if lost,
would put any code using this connection into a confused state" and
then the client has to recover from said confused state.
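
To make the recovery problem concrete, here is a minimal sketch of the
simplest strategy I can think of: treat any nonzero drop count as "my
cached state is stale" and refetch everything. The names (CachedState,
fetch_full_state, on_receive) are hypothetical and not part of any
kdbus or libdbus API, and the sketch assumes the service offers a
method call that returns a complete snapshot at least as new as the
signals delivered so far.

```python
class CachedState:
    """Local mirror of a service's state, kept current by broadcast signals."""

    def __init__(self, fetch_full_state):
        # fetch_full_state is a hypothetical callable that performs a
        # (reliable) method call returning the full state as a mapping.
        self._fetch = fetch_full_state
        self.state = dict(fetch_full_state())
        self.resyncs = 0

    def on_receive(self, dropped_count, updates):
        """Handle one receive: a drop count plus (key, value) updates."""
        if dropped_count > 0:
            # We cannot know which updates were lost, so the cache is
            # untrustworthy. Discard it and refetch; the snapshot is
            # assumed to already include this batch of updates.
            self.state = dict(self._fetch())
            self.resyncs += 1
            return
        for key, value in updates:
            self.state[key] = value
```

This only works when the state is refetchable at all. Signals that
describe one-shot events rather than state changes have no snapshot to
fall back on, which is the case I'm worried about.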

A library probably can't do this - it doesn't know what state matters
or how to recover it - so each app would have to. And are connections
ever shared between modules of an app? For example, a library such as
GTK+ or pulseaudio could be using the connection while application
code is also using it, so none of those modules has the whole picture
and none of them knows what to do about lost messages. To handle lost
messages within a module, you'd need a private connection(?), which
might be fine as long as each app having a number of connections isn't
too bloated.

How to handle a send error depends a lot on what's being sent... but
if I were writing a general-purpose library wrapper, I'd be very
tempted to hide EXFULL behind an unbounded (or very-high-bounded)
userspace send buffer, which of course is what you were trying to
avoid, but I am skeptical that the average app will handle this error
sensibly.
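
Roughly what I have in mind for such a wrapper, as a sketch only:
PoolFull stands in for the EXFULL error, and BufferedSender, raw_send
and max_pending are made-up names rather than anything kdbus or an
existing library provides. It assumes the caller invokes flush() again
later, e.g. from the main loop.

```python
from collections import deque


class PoolFull(Exception):
    """Stand-in for the send error raised when the receiver's pool is full."""


class BufferedSender:
    """Hide PoolFull from callers by queueing messages in userspace."""

    def __init__(self, raw_send, max_pending=None):
        self._raw_send = raw_send        # hypothetical low-level send callable
        self._pending = deque()          # messages not yet accepted
        self._max_pending = max_pending  # None means unbounded

    def send(self, msg):
        if (self._max_pending is not None
                and len(self._pending) >= self._max_pending):
            # Even the "very high" bound was hit; the error finally surfaces.
            raise PoolFull("userspace send buffer is full")
        # Always enqueue first so messages are never reordered.
        self._pending.append(msg)
        self.flush()

    def flush(self):
        """Try to drain the queue in order; return True if it is empty."""
        while self._pending:
            try:
                self._raw_send(self._pending[0])
            except PoolFull:
                return False  # receiver still full; retry on a later flush
            self._pending.popleft()
        return True
```

A single queue like this lets one stuck receiver block messages to
everyone else, so a real wrapper would probably want one queue per
destination. That is more bookkeeping, and more memory, for the same
underlying problem.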

The traditional userspace bus isn't any better than what you've
described here, of course - it's even worse - and it works well
enough. The limits are simply set high enough that they won't be hit
unless someone's broken or evil. That is also the traditional
approach to, say, file descriptor limits or swap space: set the limit
high and hope you won't reach it. For the case of the X server, the
limit on message buffers appears to be "until malloc fails," so they
have the limit quite high, higher than userspace dbus does. "Set high
limits and don't hit them" is a tried-and-true approach.

With either the existing userspace bus or kdbus, I bet you could come
up with ways to use limit exhaustion to get various services and apps
into confused states as they miss messages they were relying on,
simply because this is too hard for apps to reliably get right. The
lower the limits, the easier it would be to cause trouble by forcing
them to be hit.

In a perfect world we could figure out which client is "at fault" for
filling a buffer - the slow receiver or the overzealous sender - so we
could throttle or disconnect the guilty party instead of throwing
errors that won't be handled well... but I'm not sure that's practical.

Havoc