Re: [PATCH v3 1/6] ANDROID: binder: add support for RT prio inheritance.
From: Martijn Coenen
Date: Fri Nov 17 2017 - 05:23:49 EST
On Thu, Nov 16, 2017 at 4:10 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> Well, I go by the one described in all the real-time computing texts;
> also found on Wikipedia FWIW:
>
> https://en.wikipedia.org/wiki/Priority_inheritance
Guess I was taking inheritance too literally :-)
>
>> This behavior is also related to binder's threadpool - all those
>> threads run at a default prio of 120 (nice 0), and call into the
>> kernel to wait for work. If somebody then makes a call with a lower
>> prio (>120), we do change the binder thread that handles that call to
>> use the same low prio as well. Why run at nice 0 if the caller runs at
>> nice 10?
>
> Not immediately saying it doesn't make sense in the sync-rpc scenario;
> just saying that calling it PI is terribly confusing.
>
>> Yes, in case of asynchronous transactions it's not PI at all.
>
> It might be good to untangle some of that.
Yeah, I could probably separate out the part that deals with
synchronous transactions, since that is "true PI".
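To make that distinction concrete, the synchronous half boils down to
something like this (a rough sketch of the idea only, not the actual
patch code; binder_set_priority() stands in for a helper that wraps
sched_setscheduler_nocheck()):

  /*
   * Sketch: on a synchronous transaction, the thread picked to handle
   * it temporarily adopts the caller's scheduling policy/prio, and
   * remembers its own so it can be restored when the reply is sent.
   */
  static void binder_transaction_priority(struct task_struct *task,
                                          struct binder_transaction *t)
  {
          /* save what to restore once the reply comes back */
          t->saved_priority.sched_policy = task->policy;
          t->saved_priority.prio = task->normal_prio;

          /* run the handler at the caller's prio ("true PI") */
          binder_set_priority(task, t->priority);
  }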
> I'm still struggling with the fact that _binder_ has threads and not the
> apps themselves. Totally weird that.
Just to clarify - the threads do belong to the app; they are spawned
from the app process in userspace (sometimes at the request of the
kernel), and they call into the kernel driver for work. But the app
itself is only responsible for starting the threadpool; the libbinder
userspace library takes care of the rest. Beyond that, apps just
receive incoming calls on one of these threads, and that's it.
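For a concrete picture, a looper thread is roughly this at the raw
UAPI level (simplified sketch; libbinder hides all of this, and error
handling is omitted):

  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <linux/android/binder.h>

  static void looper_thread(int binder_fd)
  {
          uint32_t cmd = BC_ENTER_LOOPER;
          char rbuf[256];
          struct binder_write_read bwr = {
                  .write_size = sizeof(cmd),
                  .write_buffer = (binder_uintptr_t)(uintptr_t)&cmd,
          };

          /* tell the driver this thread joined the pool */
          ioctl(binder_fd, BINDER_WRITE_READ, &bwr);

          for (;;) {
                  struct binder_write_read in = {
                          .read_size = sizeof(rbuf),
                          .read_buffer = (binder_uintptr_t)(uintptr_t)rbuf,
                  };
                  /* blocks in the driver until there is work */
                  ioctl(binder_fd, BINDER_WRITE_READ, &in);
                  /* ... dispatch BR_TRANSACTION etc. to the app ... */
          }
  }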
> Right; but real-time systems are about guarantees, not about mostly and
> on average.
Agreed. I wouldn't go so far as to claim Android is RT, but we'll
take any gains we can get.
> But yes, you touch upon one way to have mixed priority thread pools. You
> need a max prio receiver thread that is basically always runnable, this
> is the one thread doing poll(). It is the thread you can immediately
> assign your rt_mutex ownership to etc..
This is an interesting design; I'll think about whether I can make it
work with synchronous transactions. One concern is that running such
a thread at max prio means it would always preempt other work,
regardless of the importance of the incoming binder transaction.
Though I guess if you assign rt_mutex ownership to that thread before
even waking it up, it doesn't need to run at max prio - it will be at
whatever prio it needs to be at.
Also, I'm not sure how to make this work with async transactions that
need to be handled at a certain prio: either we already have a thread
lying around running at that prio, or we have to change an idle
thread's prio before waking it up (which is more or less what this
series does).
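In kernel terms the async path then amounts to roughly this (again
just a sketch, not the actual patch code; binder_set_priority() is
the same stand-in helper as above):

  /*
   * Sketch: change the chosen idle thread's prio *before* the wakeup,
   * so the work never starts running at the wrong prio.
   */
  static void binder_wakeup_at_prio(struct binder_thread *thread,
                                    struct binder_priority desired)
  {
          binder_set_priority(thread->task, desired);
          wake_up_interruptible(&thread->wait);
  }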
> Sounds really weird to me though; I'd be curious to know why this
> 'feature' was created.
I think it may be because binder tries to abstract away whether a
node is remote - to userspace a binder object can be either local or
remote, and it shouldn't have to care. If you have a local function
call chain A() -> B() -> C(), then these functions obviously all run
on the same thread. But if A() and C() live in process P1 and B()
lives in process P2, having A() and C() run on the same thread in
process P1 preserves those semantics (rough flow sketched below).
Arve (on CC) probably knows more about this.
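That cross-process chain, roughly (P1/P2 and the thread names are
just for illustration):

  P1, thread T1: A() calls B()  --txn-->  P2, thread T2 runs B()
  P2, thread T2: B() calls C()  --txn-->  P1: C() runs on T1, which is
                                          parked waiting for B()'s reply
  T1 replies to T2 with C()'s result; T2 replies to T1 with B()'s result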
> I'm taking it changing this stuff is 'difficult' since much of it has
> been directly exposed to apps? And you'll need to build a parallel
> interface and slowly migrate apps away from it?
I think the whole threadpool design is actually abstracted pretty
nicely from apps, but we do have transaction ordering semantics that
we need to preserve. Also, the binder protocol between the kernel
driver and userspace is UAPI, and it is hard to make large changes to
it while retaining support for an old Android userspace.
> Yeah I suppose so. Also I think the comments could be made clearer to
> avoid some of the confusion we've had here.
A lot of what we talked about is also about how binder works in
general. I'm writing more documentation about that internally, and I
hope to some day add Documentation/ipc/binder.txt or something like
that :-)