Re: [PATCH net-next v9 1/1] binder: report txn errors via generic netlink

From: Carlos Llamas
Date: Mon Dec 09 2024 - 22:18:16 EST


On Mon, Dec 09, 2024 at 05:26:14PM -0800, Li Li wrote:
> On Mon, Dec 9, 2024 at 4:44 PM Carlos Llamas <cmllamas@xxxxxxxxxx> wrote:
> >
> > On Mon, Dec 09, 2024 at 11:22:47AM -0800, Li Li wrote:
> > > From: Li Li <dualli@xxxxxxxxxx>
> > >
> > > Frozen tasks can't process binder transactions, so sync binder
> > > transactions will fail with BR_FROZEN_REPLY and async binder
> > > transactions will be queued in the kernel async binder buffer.
> > > As these queued async transactions accumulates over time, the async
> > > buffer will eventually be running out, denying all new transactions
> > > after that with BR_FAILED_REPLY.
> > >
> > > In addition to the above cases, different kinds of binder error codes
> > > might be returned to the sender. However, the core Linux, or Android,
> > > system administration process never knows what's actually happening.
> >
> > I don't think the previous two paragraphs provide anything meaninful
> > and the explanation below looks enough IMO. I would just drop the noise.
>
> That makes sense. I'll remove them. Thanks!
>
> >
> > >
> > > Introduce generic netlink messages into the binder driver so that the
> > > Linux/Android system administration process can listen to important
> > > events and take corresponding actions, like stopping a broken app from
> > > attacking the OS by sending huge amount of spamming binder transactions.
> > >
> > > The new binder genl sources and headers are automatically generated from
> > > the corresponding binder_genl YAML spec. Don't modify them directly.
> >
> > I assume "genl" comes from "generic netlink". Did you think about using
> > just "netlink". IMO it provides better context about what this is about.
> >
>
> Yes, "genl" has been widely used in the Linux kernel. But I'm fine to rename
> it to just "netlink". I'll change it in v10 unless there's other opinions.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=genl
> https://man7.org/linux/man-pages/man8/genl.8.html

I am familiar with the "genl" e.g. as in genlmsghdr. However, it wasn't
immediately straight-forward to me the connection as it would have been
with "netlink".

I don't know what the convention is for this type of generic netlink
interfaces. Perhaps "genl" is more appropriate, I just know that using
"netlink" would have been more straight-forward (at least to me).

Maybe most others that are new to this binder interface will go through
the same "discovery" process I went through.

>
> > >
> > > Signed-off-by: Li Li <dualli@xxxxxxxxxx>
> > > ---
> > > Documentation/admin-guide/binder_genl.rst | 96 +++++++
> >
> > We already have a "binderfs" entry. Perhaps, we should just merge your
> > Documentation with that one and call it "binder" instead?
> > You might want to run this by Christian Brauner though.
> >
>
> I'm happy to merge it if the doc maintainers like this idea. Or we can do
> it in a separate patch later.
>
> > > +===========================================================
> > > +Generic Netlink for the Android Binder Driver (Binder Genl)
> > > +===========================================================
> > > +
> > > +The Generic Netlink subsystem in the Linux kernel provides a generic way for
> > > +the Linux kernel to communicate to the user space applications via binder
> >
> > nit: s/communicate to/communicate with/
>
> Would fix it. Thanks!
>
> >
> > > +driver. It is used to report various kinds of binder transactions to user
> > > +space administration process. The driver allows multiple binder devices and
> >
> > The transactions types that I'm familiar with are sync/async. I think
> > you want to say "report transaction errors" or something like that
> > instead?
> >
>
> Yes, it means the transaction error code. I'll make it clearer.
>
> > > +their corresponding binder contexts. Each context has an independent Generic
> > > +Netlink for security reason. To prevent untrusted user applications from
> > > +accessing the netlink data, the kernel driver uses unicast mode instead of
> > > +multicast.
> > > +
> > > +Basically, the user space code uses the "set" command to request what kind
> >
> > Can you use the actual command? e.g. BINDER_GENL_CMD_SET
> >
> > BTW, why set? what are we setting? Would *_REPORT_SETUP be more
> > appropriate?
> >
>
> Hmm, I intentionally make them short in the netlink YAML file.
> Otherwise the generated code/name is quite long. But if this is
> causing confusion, I'm happy to use a more descriptive (and longer)
> name.

Short is fine. However, what is "SET"? In the future we might add more
commands to this interface and I can see this name being a problem at
that point.

>
> > > +of binder transactions should be reported by the kernel binder driver. The
> > > +driver then echoes the attributes in a reply message to acknowledge the
> > > +request. The "set" command also registers the current user space process to
> > > +receive the reports. When the user space process exits, the previous request
> > > +will be reset to prevent any potential leaks.
> > > +
> > > +Currently the driver can report binder transactions that "failed" to reach
> > > +the target process, or that are "delayed" due to the target process being
> >
> > "Delayed" transaction is an entirely new concept. I suppose it means
> > async + frozen. Why not use that?
> >
> > Also, per this logic it seems that a "delayed" transaction could also be
> > "spam" correct? e.g. the flags are not mutually exclusive.
>
> It depends on the actual implementation. Currently each binder
> transaction only returns one single error code. A "spam" one also
> indicates it's a "delayed" one.

I'm absolutely confused as to what is "delayed" then? Isn't it async &&
frozen?

A spam transaction can be if the caller has >=25% of the target's buffer
capacity or over 50 transactions, regardless of whether the target is
frozen or not.

It seems to me these are two independent scenarios and a transaction can
be either one or both.

>
> >
> > > +frozen by cgroup freezer, or that are considered "spam" according to existing
> > > +logic in binder_alloc.c.
> > > +
> > > +When the specified binder transactions happen, the driver uses the "report"
> > > +command to send a generic netlink message to the registered process,
> > > +containing the payload struct binder_report.
> > > +
> > > +More details about the flags, attributes and operations can be found at the
> > > +the doc sections in Documentations/netlink/specs/binder_genl.yaml and the
> > > +kernel-doc comments of the new source code in binder.{h|c}.
> > > +
> > > +Using Binder Genl
> > > +-----------------
> > > +
> > > +The Binder Genl can be used in the same way as any other generic netlink
> > > +drivers. Userspace application uses a raw netlink socket to send commands
> > > +to and receive packets from the kernel driver.
> > > +
> > > +.. note::
> > > + If the userspace application that talks to the driver exits, the kernel
> > > + driver will automatically reset the configuration to the default and
> > > + stop sending more reports to prevent leaking memory.
> >
> > I'm not sure what you mean by preventing memory leaks. What happens when
> > userspace setups the report and doesn't call "recv()"? Is that what we
> > are worried about?
> >
>
> Probably "leaking memory" isn't accurate here. If the user app
> doesn't call recv(), the netlink message would just fail to send.
> There's no memleak. But I think it's a good idea to reset the
> configuration. Let me describe it in a better way.
>
> > > +
> > > +Usage example (user space pseudo code):
> > > +
> > > +::
> > > +
> > > + // open netlink socket
> > > + int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
> > > +
> > > + // bind netlink socket
> > > + bind(fd, struct socketaddr);
> > > +
> > > + // get the family id of the binder genl
> > > + send(fd, CTRL_CMD_GETFAMILY, CTRL_ATTR_FAMILY_NAME,
> > > + BINDER_GENL_FAMILY_NAME);
> >
> > ok, what is happening here? this is not a regular send(). Is this
> > somehow an overloaded send()? If so, I had a really hard time trying to
> > figuring that out so might be best to rename this.
> >
>
> This pseudo code means a few attributes are sent by a single send().
>
> > > + if (flags != (flags & (BINDER_GENL_FLAG_OVERRIDE
> > > + | BINDER_GENL_FLAG_FAILED
> > > + | BINDER_GENL_FLAG_DELAYED
> > > + | BINDER_GENL_FLAG_SPAM))) {
> > > + pr_err("Invalid binder report flags: %u\n", flags);
> > > + return -EINVAL;
> > > + }
> >
> > didn't Jakub mentioned this part wasn't needed?
> >
>
> Good catch! I removed them but somehow didn't commit the change.
>
>
> > > +int binder_genl_nl_set_doit(struct sk_buff *skb, struct genl_info *info)
> > > +{
> > > + int portid;
> > > + u32 pid;
> > > + u32 flags;
> > > + void *hdr;
> > > + struct binder_device *device;
> > > + struct binder_context *context = NULL;
> >
> > nit: would you mind using reverse christmas tree for this variables
> > and also in other functions too?
> >
>
> Sure.
>
> > > +
> > > + hlist_for_each_entry(device, &binder_devices, hlist) {
> > > + if (!nla_strcmp(info->attrs[BINDER_GENL_A_CMD_CONTEXT],
> > > + device->context.name)) {
> > > + context = &device->context;
> > > + break;
> > > + }
> > > + }
> > > +
> > > + if (!context) {
> > > + NL_SET_ERR_MSG(info->extack, "Unknown binder context\n");
> > > + return -EINVAL;
> > > + }
> > > +
> > > + portid = nlmsg_hdr(skb)->nlmsg_pid;
> > > + pid = nla_get_u32(info->attrs[BINDER_GENL_A_CMD_PID]);
> > > + flags = nla_get_u32(info->attrs[BINDER_GENL_A_CMD_FLAGS]);
> > > +
> > > + if (context->report_portid && context->report_portid != portid) {
> > > + NL_SET_ERR_MSG_FMT(info->extack,
> > > + "No permission to set flags from %d\n",
> > > + portid);
> > > + return -EPERM;
> > > + }
> > > +
> > > + if (binder_genl_set_report(context, pid, flags) < 0) {
> > > + pr_err("Failed to set report flags %u for %u\n", flags, pid);
> > > + return -EINVAL;
> > > + }
> >
> > With the flags check being unnecessary you probably want to fold
> > binder_genl_set_report() here instead.
> >
>
> Sorry, I don't quite understand this request. Can you please explain it?

I mean that without the flags check binder_genl_set_report() is small
enough and the only caller is this. So you could just delete that
function and write the code here in binder_genl_nl_set_doit().

>
> > > +/**
> > > + * Add a binder device to binder_devices
> > > + * @device: the new binder device to add to the global list
> > > + *
> > > + * Not reentrant as the list is not protected by any locks
> > > + */
> > > +void binder_add_device(struct binder_device *device)
> > > +{
> > > + hlist_add_head(&device->hlist, &binder_devices);
> > > +}
> >
> > nit: would you mind separating the binder_add_device() logic into a
> > separate "prep" commit?
> >
>
> Sure.
>
> > > +
> > > static int __init init_binder_device(const char *name)
> > > {
> > > int ret;
> > > @@ -6953,6 +7217,7 @@ static int __init init_binder_device(const char *name)
> > > }
> > >
> > > hlist_add_head(&binder_device->hlist, &binder_devices);
> > > + binder_device->context.report_seq = (atomic_t)ATOMIC_INIT(0);
> >
> > I don't think this is meant to be used like this.
> >
> > Also, binder_device is kzalloc'ed so no need to init report_seq at all.
> >
>
> I'll remove this unnecessary code.
>
> > >
> >
> >
> > Also, how is userspace going to determine that this new interface is
> > available? Do we need a new entry under binder features? Or is this not
> > a problem?
>
> It's not a problem. The generic netlink command "getfamily" will fail.

Cool, and this wouldn't be retried after it has failed right?