Re: [GIT PULL] kdbus for 4.1-rc1

From: David Lang
Date: Wed Apr 29 2015 - 16:16:33 EST


On Wed, 29 Apr 2015, Andy Lutomirski wrote:

> On Wed, Apr 29, 2015 at 12:30 PM, Austin S Hemmelgarn
> <ahferroin7@xxxxxxxxx> wrote:
>> On 2015-04-29 14:54, Andy Lutomirski wrote:
>>>
>>> On Apr 29, 2015 5:48 AM, "Harald Hoyer" <harald@xxxxxxxxxx> wrote:
>>>>
>>>> * Being in the kernel closes a lot of races which can't be fixed with
>>>> the current userspace solutions. For example, with kdbus, there is a
>>>> way a client can disconnect from a bus, but do so only if no further
>>>> messages are present in its queue, which is crucial for implementing
>>>> race-free "exit-on-idle" services
>>>
>>> This can be implemented in userspace.
>>>
>>> Client to dbus daemon: may I exit now?
>>> Dbus daemon to client: yes (and no more messages) or no
>>
>> Depending on how this is implemented, there would be a potential issue if a
>> message arrived for the client after the daemon told it it could exit, but
>> before it finished shutdown, in which case the message might get lost.
>
> Then implement it the right way? The client sends some kind of
> sequence number with its request.
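
For illustration only, the sequence-number handshake suggested above might look roughly like the sketch below. The Daemon class and its connect/send/receive/may_i_exit methods are hypothetical names invented for this example; they are not part of dbus or kdbus. The assumption is that the daemon numbers each message per client and grants an exit request only if the client's last-seen number matches the last one it queued, disconnecting the client under the same lock.

```python
import threading
from collections import deque


class Daemon:
    """Hypothetical bus daemon keeping one message queue per client."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queues = {}    # client name -> deque of (seq, payload)
        self._last_seq = {}  # client name -> seq of the last queued message

    def connect(self, name):
        with self._lock:
            self._queues[name] = deque()
            self._last_seq[name] = 0

    def send(self, name, payload):
        """Queue a message; return False if the client has disconnected."""
        with self._lock:
            if name not in self._queues:
                return False  # sender can re-activate the service and retry
            self._last_seq[name] += 1
            self._queues[name].append((self._last_seq[name], payload))
            return True

    def receive(self, name):
        """Return the next (seq, payload) for the client, or None if idle."""
        with self._lock:
            q = self._queues[name]
            return q.popleft() if q else None

    def may_i_exit(self, name, last_seen_seq):
        """Grant the exit only if the client has seen every queued message."""
        with self._lock:
            if last_seen_seq != self._last_seq[name]:
                return False  # a message arrived after the client last looked
            # Check and disconnect happen under one lock, so no message
            # can slip in between the "yes" and the disconnect.
            del self._queues[name]
            del self._last_seq[name]
            return True
```

Once may_i_exit() returns True, later send() calls for that client fail immediately instead of queueing into a connection that is going away.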

So any app in the system can prevent any other app from exiting or restarting just by sending it the equivalent of a ping over dbus?

Preventing an app from exiting because there are unhandled messages doesn't mean that those messages are going to be handled, just that they will get read and dropped on the floor by an app that is trying to exit. Sometimes you will end up with a hung app that can't process messages and needs to be restarted, but can't be restarted because there are pending messages.

The problem with "guaranteed delivery" messages is that things _will_ go wrong that will cause the messages to not be received and processed. At that point you have the choice of losing some messages or freezing your entire system (you can buffer them for some time, but eventually you will run out of buffer space).

We see this all the time in the logging world: people configure their systems for reliable delivery of log messages to a remote machine, then when that remote machine goes down and can't receive messages (or a network issue blocks the traffic), the sending machine blocks and causes an outage.

Being too strict about guaranteeing delivery just doesn't work. You must have a mechanism to abort and throw away unprocessed messages. If this means disconnecting the receiver so that it never sees a gap in its message stream, that's a valid choice. But preventing a receiver from exiting because it hasn't processed a message is not a valid choice.
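
As a rough sketch of what "throw away unprocessed messages" can mean in practice, here is a bounded queue that never blocks the sender. The BoundedQueue name, the drop-oldest policy, and the dropped counter are all assumptions made for this example, not something any particular bus or logging daemon is known to do.

```python
from collections import deque


class BoundedQueue:
    """Hypothetical fixed-size queue that drops the oldest message when full."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._items = deque()
        self._capacity = capacity
        self.dropped = 0  # number of messages discarded so far

    def put(self, msg):
        """Never blocks: if the buffer is full, discard the oldest entry."""
        if len(self._items) >= self._capacity:
            self._items.popleft()
            self.dropped += 1
        self._items.append(msg)

    def get(self):
        """Return the oldest buffered message, or None if the queue is empty."""
        return self._items.popleft() if self._items else None
```

A sender using this keeps running when the receiver is hung or unreachable, and the dropped counter at least records that messages were lost. Disconnecting the receiver once the buffer fills is the other policy mentioned above and would replace the drop in put().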

David Lang