Re: [PATCH v11 2/2] binder: report txn errors via generic netlink
From: Carlos Llamas
Date: Thu Jan 09 2025 - 13:52:15 EST
On Tue, Jan 07, 2025 at 04:00:39PM -0800, Li Li wrote:
> On Tue, Jan 7, 2025 at 1:41 PM Carlos Llamas <cmllamas@xxxxxxxxxx> wrote:
> >
> > On Tue, Jan 07, 2025 at 09:29:08PM +0000, Carlos Llamas wrote:
> > > On Wed, Dec 18, 2024 at 12:37:40PM -0800, Li Li wrote:
> > > > From: Li Li <dualli@xxxxxxxxxx>
> > >
> > > > @@ -6137,6 +6264,11 @@ static int binder_release(struct inode *nodp, struct file *filp)
> > > >
> > > > binder_defer_work(proc, BINDER_DEFERRED_RELEASE);
> > > >
> > > > + if (proc->pid == proc->context->report_portid) {
> > > > + proc->context->report_portid = 0;
> > > > + proc->context->report_flags = 0;
> > >
> > > Isn't ->portid the pid from the netlink report manager? How is this ever
> > > going to match a certain proc->pid here? Is this manager supposed to
> > > _also_ open a regular binder fd?
> > >
> > > It seems we are tying the cleanup of the netlink interface to the exit
> > > of the regular binder device, correct? This seems unfortunate as using
> > > the netlink interface should be independent.
> > >
> > > I was playing around with this patch with my own PoC and now I'm stuck:
> > > root@debian:~# ./binder-netlink
> > > ./binder-netlink: nlmsgerr No permission to set flags from 1301: Unknown error -1
> > >
> > > Is there a different way to reset the portid?
> > >
> >
> > Furthermore, this seems to be a problem when the report manager exits
> > without a binder instance, we still think the report is enabled:
> >
> > [ 202.821346] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821421] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821304] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821306] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821387] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821464] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821467] binder: Failed to send binder netlink message to 597: -111
> > [ 202.821344] binder: Failed to send binder netlink message to 597: -111
> > [ 202.822513] binder: Failed to send binder netlink message to 597: -111
> > [ 202.822152] binder: Failed to send binder netlink message to 597: -111
> > [ 202.822683] binder: Failed to send binder netlink message to 597: -111
> > [ 202.822629] binder: Failed to send binder netlink message to 597: -111
>
> As the file path (linux/drivers/android/binder.c) suggests, the
> binder driver is designed to be the essential IPC mechanism in
> Android, where binder is used by all system and user apps.
>
> So the binder netlink is designed to be used with binder IPC.
>
> The manager service also uses the binder interface to communicate
> to all other processes. When it exits, the binder file is closed,
> where the netlink interface is reset.

Did you happen to look into netlink_register_notifier()? That seems like
an option to keep the binder device and the netlink socket interfaces
from being tied together. I believe we could check for NETLINK_URELEASE
events and do the cleanup then. I'll give it a quick try.
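
Roughly what I have in mind, as a userspace model rather than real
kernel code. The struct and constant definitions below are simplified
stand-ins for the kernel ones so the snippet compiles on its own, and
binder_ctx / struct binder_context_model are hypothetical names standing
in for the report_portid/report_flags fields the patch adds. In the
driver the callback would be hooked up with netlink_register_notifier()
at init time; whether this is sufficient for all cases is still to be
confirmed.

```c
#include <stdint.h>

/* Simplified stand-ins for kernel definitions, only so this compiles in userspace. */
#define NETLINK_URELEASE	0x0001	/* netlink notifier event: a socket was released */
#define NETLINK_GENERIC		16	/* generic netlink protocol number */
#define NOTIFY_DONE		0x0000	/* notifier return value: nothing more to do */

struct netlink_notify {		/* the kernel struct also carries a struct net * */
	uint32_t portid;
	int protocol;
};

struct notifier_block {		/* the kernel struct also has next and priority */
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long event, void *data);
};

/* Hypothetical model of the report state the patch keeps per binder context. */
struct binder_context_model {
	uint32_t report_portid;
	uint32_t report_flags;
};

static struct binder_context_model binder_ctx;

/*
 * Called when a netlink socket goes away. If it was the socket that
 * registered for reports, forget it, independently of any binder fd.
 */
static int binder_netlink_notify(struct notifier_block *nb,
				 unsigned long event, void *data)
{
	struct netlink_notify *n = data;

	(void)nb;

	if (event != NETLINK_URELEASE || n->protocol != NETLINK_GENERIC)
		return NOTIFY_DONE;

	if (n->portid == binder_ctx.report_portid) {
		binder_ctx.report_portid = 0;
		binder_ctx.report_flags = 0;
	}

	return NOTIFY_DONE;
}

/* In the driver this would be passed to netlink_register_notifier(). */
static struct notifier_block binder_netlink_notifier = {
	.notifier_call = binder_netlink_notify,
};
```

With something along these lines the reset would no longer depend on
proc->pid matching the portid in binder_release(), so a report manager
that never opens a binder fd would still get cleaned up when its socket
closes.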