Re: [PATCH v11 2/2] binder: report txn errors via generic netlink
From: Carlos Llamas
Date: Wed Jan 08 2025 - 14:08:00 EST
On Tue, Jan 07, 2025 at 04:00:39PM -0800, Li Li wrote:
> On Tue, Jan 7, 2025 at 1:41 PM Carlos Llamas <cmllamas@xxxxxxxxxx> wrote:
> >
> > On Tue, Jan 07, 2025 at 09:29:08PM +0000, Carlos Llamas wrote:
> > > On Wed, Dec 18, 2024 at 12:37:40PM -0800, Li Li wrote:
> > > > From: Li Li <dualli@xxxxxxxxxx>
> > >
> > > > @@ -6137,6 +6264,11 @@ static int binder_release(struct inode *nodp, struct file *filp)
> > > >
> > > > binder_defer_work(proc, BINDER_DEFERRED_RELEASE);
> > > >
> > > > + if (proc->pid == proc->context->report_portid) {
> > > > + proc->context->report_portid = 0;
> > > > + proc->context->report_flags = 0;
> > >
> > > Isn't ->portid the pid from the netlink report manager? How is this ever
> > > going to match a certain proc->pid here? Is this manager supposed to
> > > _also_ open a regular binder fd?
> > >
> > > It seems we are tying the cleanup of the netlink interface to the exit
> > > of the regular binder device, correct? This seems unfortunate as using
> > > the netlink interface should be independent.
> > >
> > > I was playing around with this patch with my own PoC and now I'm stuck:
> > > root@debian:~# ./binder-netlink
> > > ./binder-netlink: nlmsgerr No permission to set flags from 1301: Unknown error -1
> > >
> > > Is there a different way to reset the protid?
> > >
> >
> > Furthermore, this seems to be a problem when the report manager exits
> > without a binder instance, we still think the report is enabled:
> >
> > [ 202.821346] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821421] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821304] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821306] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821387] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821464] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821467] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821344] binder: Failed to send binder netlink message to 597: -111
> > [ 202.822513] binder: Failed to send binder netlink message to 597: -111
> > [ 202.822152] binder: Failed to send binder netlink message to 597: -111
> > [ 202.822683] binder: Failed to send binder netlink message to 597: -111
> > [ 202.822629] binder: Failed to send binder netlink message to 597: -111
>
> As the file path (linux/drivers/android/binder.c) suggested,
> binder driver is designed to work as the essential IPC in the
> Android OS, where binder is used by all system and user apps.
>
> So the binder netlink is designed to be used with binder IPC.

Ok, I assume this decision was made because no better alternative was
found. Otherwise it would be best to avoid the dependency. This could
become an issue e.g. if the admin process were to be split in the future
or some other restructuring happens.

That's why I ask if there is a way to clean up the netlink info without
relying on the binder fd closing. Something cleaner; there might be some
callback we can install on the netlink infra? I could look into this
later.

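To make the idea concrete, here is a rough user-space model of the
cleanup I have in mind. It is not kernel code and is untested against
this patch. The struct, macro and function names are made up for
illustration; only report_portid and report_flags come from the patch.
In the kernel I would expect this to hang off a notifier registered with
netlink_register_notifier() and to react to NETLINK_URELEASE, but whether
that fits here is an assumption on my side.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields the patch adds to the context. */
struct report_ctx {
	uint32_t report_portid;
	uint32_t report_flags;
};

/* Modeled on NETLINK_URELEASE ("unicast netlink socket released"). */
#define MODEL_URELEASE 0x0001

/*
 * What a netlink release notifier could do: drop the report state of
 * every context owned by the socket that just went away. Returns the
 * number of contexts that were reset.
 */
static int report_ctx_release(struct report_ctx *ctxs, size_t n,
			      unsigned long event, uint32_t portid)
{
	int reset = 0;
	size_t i;

	/* Ignore unrelated events and the "no owner" portid. */
	if (event != MODEL_URELEASE || !portid)
		return 0;

	for (i = 0; i < n; i++) {
		if (ctxs[i].report_portid != portid)
			continue;
		ctxs[i].report_portid = 0;
		ctxs[i].report_flags = 0;
		reset++;
	}
	return reset;
}
```

The point is that the reset is keyed on the netlink socket going away,
so it works whether or not the manager ever opened a binder fd.
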
> The manager service also uses the binder interface to communicate
> to all other processes. When it exits, the binder file is closed,
> where the netlink interface is reset.

Again, communicating with other processes via binder and setting up a
transaction report should be separate functionalities that don't rely on
each other.

Also, it seems the admin process would have to initially bind() to all
binder contexts, preventing others from doing so? Sounds like this should
be restricted to a certain capability or maybe via selinux (if possible)
instead of relying on the portid. Thoughts?

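As a sketch of the capability idea, again a user-space model with
made-up names (report_state, report_set_flags, the privileged argument)
and not what the patch does today: the decision depends on the caller's
privilege rather than on whether its portid matches the stored one. On
the genetlink side I believe the GENL_ADMIN_PERM op flag is the usual
way to require CAP_NET_ADMIN, but I haven't checked how that would fit
this patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-context report configuration. */
struct report_state {
	uint32_t report_portid;
	uint32_t report_flags;
};

/*
 * Set the report configuration, or clear it when flags == 0. Access is
 * decided by the caller's privilege, not by whether its portid matches
 * the one already stored, so a stale owner cannot lock everyone out.
 */
static int report_set_flags(struct report_state *st, bool privileged,
			    uint32_t portid, uint32_t flags)
{
	if (!privileged)
		return -EPERM;

	st->report_portid = flags ? portid : 0;
	st->report_flags = flags;
	return 0;
}
```

With something along these lines a privileged process can always take
over or reset the report, which would also have gotten me out of the
"No permission to set flags" state above.
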
--
Carlos Llamas