Re: kdbus refactoring?

From: Kalle A. Sandstrom
Date: Thu Nov 12 2015 - 00:44:47 EST



[unrelated quotes trimmed, attribution preserved.]

> >> > On Sun, Nov 8, 2015 at 3:30 PM, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >> >> On Sun, Nov 08, 2015 at 10:39:43PM +0100, Richard Weinberger wrote:
> >> >>>
> >> >>> If you rework/redesign something you have to know what you want to change.
> >> >>> That's why I was asking for the plan...
> >> >>
> >> >> Since when do people post "plans" or "design documents" on lkml without
> >> >> real code? Again, code will be posted when it's ready, like any other
> >> >> kernel submission.
> >> >

tl;dr: perhaps they should start doing that.

In the case of kdbus' 4.1 iteration, several of its defects could've been
spotted from its design alone. For example: the expected userspace
behaviour when clients and servers notice that a message wasn't delivered
(which was underspecified to say the least); difficulty in guaranteeing
forward progress in the face of e.g. surprising scheduling behaviour and the
dropped_msgs field; the O(n) nature of the broadcast filtering bitmap
construct wrt # of connections on the bus[0]; and the feature that permits
opaque falsification of sender credentials by the bus owner[1].

Each of these has a significant real-world impact on designs built atop
kdbus, regardless of whether such things are closer to a layman's
approximation of formal engineering or green-field hack-job proofs of
concept. Literally each must be accounted for in userspace applications that
even as much as breathe in kdbus' direction. I'd have hated to run into the
credential-faking feature if I'd already been sack-deep into a derivative
that relied on the integrity of kdbus' metadata; and as of the most recent
version, the effects of a broadcast storm during high scheduling latency
(load, memory pressure, block device lag, w/e) were still very difficult to
parlay into a predictable design that left no dangling wires.

I don't mean to suggest that the defects cited were due to an incomplete
understanding of kdbus (or indeed IPC) on the part of its authorship.
However, there's a very strong argument that these aspects weren't considered
when kdbus was submitted for inclusion, and then given a hard shove.


Moreover, even a semi-formal requirements document would've made reviewing
kdbus much easier without compromising quality of review. As it stood, the
things that would be reasoned about during review had to be sussed out from
kdbus' API documentation, the comments of its developers elsewhere, various
forms of PR surrounding the topic, header files, and from existing knowledge
of things that really must be in there somewhere (e.g. locking).

Similarly, knowing the (implicit, patchwork, _anything_ really) arguments for
why kdbus' design meets those requirements, how its implementation corresponds
to the design, and how its test suite verifies that the design's properties
are present in the implementation, would've permitted kinds of review other
than the "off-road" style, and would've let review start sooner. That's to
say: there'd have been less of the cranium-oriented demolitions on both
sides of the fence, if any kind or quality of design document had been
available.


Considering that a req spec would've led to a design spec, in turn leading
to impl and test plans, each subject to review, the utility that could've
been had _at 0 SLOC_ would've definitely been significant. Also, their
existence would help manage long-term rot of the implementation and its test
suite by making both unambiguously remediable where rot's effects were
discovered[2]. Further progress could be built on that foundation instead of
hacks upon hacks, ever-mounting technical debt, and eventual CADT.

For instance: who's had a poke at Linux mm in the past two years? Or the
scheduler? Who even could, and where would they start? Both appear as
interlocking mishmashes of subtle oft-historical concerns ranging from the
humdrum to "must have been employed at SGI in the early aughties to
understand" grade NUMA, making each alteration unverifiable outside of test
environments the hacker has access to -- i.e. VMs and maybe a handful of
off-the-shelf microcomputers. IIRC the last major scheduling change was a
nigh-complete rewrite that had CFS emerge from failings indicated by Con
Kolivas' interactivity work.

I'm sure there are people who're well savvy to mm, sched, and maybe even both
at once. To the rest of us it might as well be opaque as far as
non-regressive modification is concerned. kdbus is certainly big enough to
suffer a similar fate, given time.


On Mon, Nov 09, 2015 at 09:23:34AM -0800, Andy Lutomirski wrote:
> On Mon, Nov 9, 2015 at 9:07 AM, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > On Mon, Nov 09, 2015 at 05:02:45PM +0000, Måns Rullgård wrote:
> >> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
> >>
[quote moved up top]
> >> > I ask for feedback on ideas and designs on a fairly regular basis. I
> >> > even frequently get valuable feedback :)
> >> >
> >> > I would like to think that the kernel community would have something
> >> > of value to add to the process of designing and implementing a major
> >> > new IPC mechanism.
> >>
> >> The "trust us, we'll show it when it's ready" attitude reminds me of the
> >> controversial TPP and TTIP negotiations.
> >
> > Ok, that's just trolling, cut it out.
> >
> > When something is even in the "hey look, it works, here's the big
> > changes from last time", we will of course post it, but right now,
> > things are being totally revisited based on the feedback we have
> > received so far. Give people a chance to recover from conferences and
> > then get back to work...
> >
>
> I hate to say this, but this approach to receiving feedback makes me
> really dislike the process.
>
> I read a fairly large fraction of the kdbus code. I found what I
> perceived to be issues, and I spoke up. I was told for quite a while
> that the authors disagreed that the issues I found were issues and
> that my assessment of the security aspects of the code was correct.
>
> Now the submission has been withdrawn (because of feedback received so
> far? from me?) and there will apparently be a new submission out of
> the blue, allegedly based on feedback.
>

This serves to discourage my review as well. What do we have to go on
besides diffs? The submitter (and his/her organization) is privy to all the
knowledge on what changed, and we are not. There's not even clarity on which
parts of the review so far are being accounted for.

It's like the KHTML guys and A###e's source code dumps: sync up all over
again or flippity-flapping fudge off.


[rest of Andy's post snipped. hopefully I got the CC's right this time
around.]


-KS


[0] an alternative design would associate every bus-connection with a number
of arbitrary 32-bit identifiers and filter broadcast messages according to a
disjunction of conjunctions of their presence, giving exact filtering and
time complexity linear in the # of connections that genuinely need to receive a
given message. With a bit of storage cleverness, the required merge/uniq
algorithms could be implemented in a branch-free fashion, yielding extreme
micro performance and a predictable upper bound for space during
filter-clause evaluation.
[1] IPC mechanisms exist that permit sender-faking ("propagation")
transparently. Some allow fake receivers ("redirection") as well. Whether
either is necessary under monolithic Unix is anybody's guess.
[2] as for documentation rot, no comment.
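
To make footnote [0] a little more concrete, here is a minimal sketch of one
way to read it, not anything from kdbus itself. The Bus class, its method
names, and the use of an inverted index (identifier -> set of connections) are
all assumptions made for illustration. A broadcast carries a filter in
disjunctive normal form: a list of clauses, each clause a list of 32-bit
identifiers that must all be present on a connection. Python sets stand in for
the sorted-array merge/uniq passes, so this shows the filtering semantics only
and makes no attempt at the branch-free part.

```python
from collections import defaultdict


class Bus:
    """Hypothetical bus holding an inverted index: identifier -> connections."""

    def __init__(self):
        self.index = defaultdict(set)

    def connect(self, conn, identifiers):
        """Register connection `conn` under each of its 32-bit identifiers."""
        for ident in identifiers:
            if not 0 <= ident < 2**32:
                raise ValueError("identifiers are 32-bit")
            self.index[ident].add(conn)

    def recipients(self, dnf):
        """Return the connections matching a disjunction of conjunctions.

        `dnf` is a list of clauses; a clause is a list of identifiers that
        must ALL be present on a connection for that clause to match.
        Empty clauses are skipped (an assumption of this sketch).
        """
        out = set()
        for clause in dnf:
            if not clause:
                continue
            # Conjunction: intersect the posting sets, smallest first, so
            # the work is bounded by the rarest identifier in the clause.
            postings = sorted((self.index.get(i, set()) for i in clause),
                              key=len)
            matched = set(postings[0])
            for p in postings[1:]:
                matched &= p
            # Disjunction: union across clauses; duplicates collapse here.
            out |= matched
        return sorted(out)


bus = Bus()
bus.connect(1, [10, 20])
bus.connect(2, [10])
bus.connect(3, [30])

# (10 AND 20) OR 30: connection 2 lacks identifier 20 and is never delivered to.
print(bus.recipients([[10, 20], [30]]))  # prints [1, 3]
```

The point of the index is that the bus never scans every connection on a
broadcast: it only touches the posting sets of the identifiers the filter
names, and filtering is exact rather than probabilistic.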