Re: selinux_netlink_send changes program behavior

From: Dmitry Vyukov
Date: Sat Apr 25 2020 - 08:01:12 EST


On Sat, Apr 25, 2020 at 1:42 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> >> On Fri, Apr 24, 2020 at 4:27 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> >>> Hi SELinux maintainers,
> >>>
> >>> We've hit a case where a developer wasn't able to reproduce a kernel
> >>> bug; it turned out to be a difference in behavior between SELinux and
> >>> non-SELinux kernels.
> >>> Condensed version: a program calls sendmmsg on a netlink socket with 2
> >>> mmsghdrs; the first is completely empty (all zeros), the second
> >>> contains an actual payload. Without SELinux the first mmsghdr is
> >>> treated as a no-op and the kernel processes the second one (which
> >>> triggers the bug). However, the SELinux hook does:
> >>>
> >>> static int selinux_netlink_send(struct sock *sk, struct sk_buff *skb)
> >>> {
> >>> 	if (skb->len < NLMSG_HDRLEN) {
> >>> 		err = -EINVAL;
> >>> 		goto out;
> >>> 	}
> >>>
> >>> and fails processing on the first empty mmsghdr (does not happen
> >>> without SELinux).
> >>>
> >>> Is this difference in behavior intentional, acceptable, or something
> >>> that should be fixed?
> >>
> >> From a practical perspective, SELinux is always going to need to do a
> >> length check as it needs to peek into the netlink message header for
> >> the message type so it can map that to the associated SELinux
> >> permissions. So in that sense, the behavior is intentional and
> >> desired; however from a bug-for-bug compatibility perspective ... not
> >> so much.
> >>
> >> Ultimately, my it's-Friday-and-it's-been-a-long-week-ending-in-a-long-day
> >> thought is that this was a buggy operation to begin with and the bug
> >> was just caught in different parts of the kernel, depending on how it
> >> was configured. It may not be ideal, but I can think of worse things
> >> (and arguably SELinux is doing the Right Thing).
> >
> > +netlink maintainers for intended semantics of empty netlink messages
> >
> > Whether it's a bug or intended behavior depends on the intended
> > semantics... which I assume are not documented anywhere officially.
>
> Your original email gave the impression that there was a bug in the non-SELinux case; if that is not the case my response changes.


There is no bug... Well, there is a crash, but it is somewhere in the
routing subsystem and is caused by the contents of the second netlink
message. This is totally unrelated to this SELinux check and that
crash is totally reproducible with SELinux as well if we just don't
send the first empty message.
The crux is really the difference in behavior between the SELinux and
non-SELinux cases.
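
For anyone who wants to see the difference locally, here is a minimal
user-space sketch of the pattern described above: one all-zero mmsghdr
followed by one well-formed request. The names demo_req, build_pair and
send_pair are made up for illustration, and the RTM_GETLINK dump is only a
harmless stand-in; it is not the payload from the original report and does
not trigger the routing crash.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Stand-in request (assumption): an RTM_GETLINK dump, NOT the payload
 * from the original report. */
struct demo_req {
	struct nlmsghdr nlh;
	struct ifinfomsg ifi;
};

/* Leave msgs[0] as an all-zero mmsghdr (no iovec, zero bytes) and point
 * msgs[1] at a single well-formed netlink request. */
void build_pair(struct mmsghdr msgs[2], struct iovec *iov,
		struct demo_req *req)
{
	memset(msgs, 0, 2 * sizeof(msgs[0]));
	memset(req, 0, sizeof(*req));

	req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(req->ifi));
	req->nlh.nlmsg_type = RTM_GETLINK;
	req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req->ifi.ifi_family = AF_UNSPEC;

	iov->iov_base = req;
	iov->iov_len = req->nlh.nlmsg_len;
	msgs[1].msg_hdr.msg_iov = iov;
	msgs[1].msg_hdr.msg_iovlen = 1;
}

/* Send both mmsghdrs on an already-open netlink socket and return
 * whatever sendmmsg() returns (messages sent, or -1 with errno set). */
int send_pair(int fd)
{
	struct mmsghdr msgs[2];
	struct iovec iov;
	struct demo_req req;

	build_pair(msgs, &iov, &req);
	return sendmmsg(fd, msgs, 2, 0);
}
```

Calling send_pair() on a socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE) fd
should, going by the behavior described in this thread, get past the empty
first entry on a non-SELinux kernel and fail on it with EINVAL when the
SELinux hook is active. I have not checked this across kernel versions.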



> > However, most of the netlink families use netlink_rcv_skb, which does:
> >
> > int netlink_rcv_skb(struct sk_buff *skb, int (*cb)(struct sk_buff *,
> >                                                    struct nlmsghdr *,
> >                                                    struct netlink_ext_ack *))
> > {
> > 	...
> > 	while (skb->len >= nlmsg_total_size(0)) {
> > 		...
> > 		skb_pull(skb, msglen);
> > 	}
> > 	return 0;
> > }
> >
> > 1. How intentional is this while loop logic vs sloppy error checking?
> > 2. netlink_rcv_skb seems to be able to handle 2+ messages in the same
> > skb, while selinux_netlink_send only checks the first one... so can I
> > skip SELinux checks by putting a malicious message after a permitted
> > one?..
>
>
>
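
To make the two questions above concrete, here is a simplified user-space
model of the two checks. It is not kernel code: first_header_check and
count_dispatched are hypothetical names, the first models only the length
check quoted from selinux_netlink_send (the real hook goes on to inspect
the first message's type), and the second is my reading of the
netlink_rcv_skb loop operating on a plain byte buffer instead of an skb.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Model of the quoted SELinux length check: it looks only at the total
 * length, so it rejects anything shorter than one netlink header. */
int first_header_check(size_t len)
{
	return len < (size_t)NLMSG_HDRLEN ? -1 : 0;
}

/* Simplified model of the netlink_rcv_skb() loop: walk every message in
 * buf and count the ones that would be handed to the callback. */
int count_dispatched(const unsigned char *buf, size_t len)
{
	int n = 0;

	while (len >= (size_t)NLMSG_HDRLEN) {
		struct nlmsghdr nlh;
		size_t msglen;

		memcpy(&nlh, buf, sizeof(nlh)); /* avoid unaligned reads */
		if (nlh.nlmsg_len < (size_t)NLMSG_HDRLEN ||
		    len < nlh.nlmsg_len)
			break;

		/* Only requests with a non-control type reach the callback. */
		if ((nlh.nlmsg_flags & NLM_F_REQUEST) &&
		    nlh.nlmsg_type >= NLMSG_MIN_TYPE)
			n++;

		/* Advance to the next message, as skb_pull() does. */
		msglen = NLMSG_ALIGN(nlh.nlmsg_len);
		if (msglen > len)
			msglen = len;
		buf += msglen;
		len -= msglen;
	}
	return n;
}
```

For an empty buffer the first function reports an error while the second
quietly dispatches nothing, which is the behavior difference from the
start of this thread. For a buffer holding two header-only requests the
loop dispatches both, while the length check has only looked at the total
size.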