Re: kdbus: to merge or not to merge?

From: Daniel Mack
Date: Thu Aug 06 2015 - 03:06:27 EST


Hi Andy,

On 08/05/2015 02:18 AM, Andy Lutomirski wrote:
> I added the missing sd_bus_unref call.
>
> With userspace dbus, my program takes 95% CPU and dbus-daemon takes
> 88% CPU or so.
>
> With kdbus, I see abuse-bus (my test), systemd-journald,
> systemd-bus-proxy, auditd, gnome-shell, mission-control, sedispatch,
> firewalld, polkitd, NetworkManager, systemd, avahi-daemon, audisp,
> abrt-dump-jour* (whatever it's called -- it truncated), upowerd, and
> systemd-logind all taking tons of CPU. I've listed them in decreasing
> order of amount of CPU burned -- the top several are taking about as
> much as is possible. Load average is over 13. That's if I run it
> from a text console while I'm logged in to gnome in a different VT.

That's right, I can reproduce this here. To explain what's going on, let
me provide some background.

Every time a client connects to kdbus, a new ID is assigned to the
connection, and other connections which have previously subscribed to
notifications of type KDBUS_ITEM_ID_ADD or _REMOVE get a notification
and are woken up so they can dispatch it. By default, no such matches
exist; applications have to explicitly opt in if they are interested in
these events.
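
To make the cost of these notifications concrete, here is a rough
model in plain C. It is not kdbus code and says nothing about how the
kernel implements delivery; struct model_conn and both function names
are made up for illustration. It only counts how many connections are
woken up when a client connects and disconnects repeatedly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not kdbus code: one entry per connection on the bus. */
struct model_conn {
	int subscribed;		/* non-zero: opted in to ID add/remove events */
	unsigned long wakeups;	/* how often this connection was woken up */
};

/*
 * Deliver a single ID-added or ID-removed event to all connections.
 * Returns the number of connections that were woken up by it.
 */
size_t model_notify_id_change(struct model_conn *conns, size_t n)
{
	size_t i, woken = 0;

	assert(conns != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!conns[i].subscribed)
			continue;	/* no match installed: stays asleep */
		conns[i].wakeups++;
		woken++;
	}

	return woken;
}

/*
 * A client that connects and disconnects 'cycles' times causes two
 * events per cycle: one for the ID being added, one for its removal.
 * Returns the total number of wakeups across all connections.
 */
size_t model_connect_cycles(struct model_conn *conns, size_t n, size_t cycles)
{
	size_t c, total = 0;

	for (c = 0; c < cycles; c++) {
		total += model_notify_id_change(conns, n);	/* ID added */
		total += model_notify_id_change(conns, n);	/* ID removed */
	}

	return total;
}
```

In this model, two subscribed connections out of three and 1000
connect/disconnect cycles add up to 4000 wakeups, while the connection
that never opted in is not woken at all.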

In DBus (both kdbus and DBus1), such matches are installed on the
NameOwnerChanged signal, and they can be either specific to a single ID,
or broad, which will make them match on any ID. There's actually no
reason for applications to install unspecific matches, but if they do,
they will of course get what they asked for, and are woken up on every
ID that is added to or removed from the bus. What you're seeing in your
system profile is that some applications misbehave and install
unspecific matches when they shouldn't. That's a userspace bug that
needs fixing. Two candidates were actually in the systemd code base
(logind and PID1), and both are now patched.
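
As an illustration of the difference, the following sketch builds both
kinds of match rule in the standard D-Bus match rule syntax. The helper
name build_noc_match() and the example ID ":1.42" are invented here;
this is not taken from the systemd patches, and a real client would
pass the resulting string to whatever match API its bus library offers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a match rule for the NameOwnerChanged signal into 'buf'.
 *
 * If 'name' is NULL or empty, the rule is broad and matches the signal
 * for every name and ID on the bus. Otherwise the rule is restricted
 * to that one name or ID via the first signal argument (arg0).
 *
 * Returns the snprintf() result, i.e. the length the full rule needs.
 */
int build_noc_match(char *buf, size_t len, const char *name)
{
	static const char base[] =
		"type='signal',"
		"sender='org.freedesktop.DBus',"
		"interface='org.freedesktop.DBus',"
		"member='NameOwnerChanged'";

	assert(buf != NULL && len > 0);

	if (name == NULL || strlen(name) == 0)
		return snprintf(buf, len, "%s", base);	/* broad match */

	return snprintf(buf, len, "%s,arg0='%s'", base, name);
}
```

Calling build_noc_match(buf, sizeof(buf), ":1.42") yields a rule that
only fires for that one ID, whereas passing NULL yields the broad rule
that wakes the client up for every ID added to or removed from the bus.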

Note that these applications are actually affected on both DBus1 and
kdbus. The reason you didn't see them trip up in your test is that
sd_bus_open() behaves differently in the two worlds. In kdbus, it will
immediately call into the kernel and register a new connection, hence
triggering the behavior described above. On DBus1, however, the HELLO
message will not be transmitted to the daemon until the first message is
sent, so no ID is assigned, and no notifications are sent. If you
augment the test program a little so that it reads its own ID on the bus,
you'll see similar behavior on DBus1 as well, but the bottleneck in that
case is the daemon, which significantly mitigates the load caused by
other tasks.

So, to wrap it up: you've triggered an existing userspace bug. The
userspace components under our control have now been fixed, and we'll
talk to other people to make them aware of the issue too. However, these
issues are not caused by kdbus itself; they merely show more impact
there as a side effect.

You've raised a valid point here. Thanks a lot for providing this test,
much appreciated!


Daniel
