Re: kernel BUG at net/phonet/socket.c:LINE!
From: Rémi Denis-Courmont
Date: Sat Apr 11 2020 - 03:37:06 EST
Hi,
Le lauantaina 11. huhtikuuta 2020, 3.43.20 EEST Vito Caputo a Ãcrit :
> I stared a bit at the code surrounding this report, and maybe someone more
> familiar with the network stack can clear something up for me real quick:
> > RIP: 0010:pn_socket_autobind net/phonet/socket.c:213 [inline]
>
> net/phonet/socket.c:
> 202 static int pn_socket_autobind(struct socket *sock)
> 203 {
> 204 struct sockaddr_pn sa;
> 205 int err;
> 206
> 207 memset(&sa, 0, sizeof(sa));
> 208 sa.spn_family = AF_PHONET;
> 209 err = pn_socket_bind(sock, (struct sockaddr *)&sa,
> 210 sizeof(struct sockaddr_pn));
> 211 if (err != -EINVAL)
> 212 return err;
> 213 BUG_ON(!pn_port(pn_sk(sock->sk)->sobject));
> 214 return 0; /* socket was already bound */
> 215 }
>
> line 213 is the BUG_ON for !sock->sk->sobject, a phonet-specific member:
>
> include/net/phonet/phonet.h:
> 23 struct pn_sock {
> 24 struct sock sk;
> 25 u16 sobject;
> 26 u16 dobject;
> 27 u8 resource;
> 28 };
>
> pn_socket_autobind() expects that to be non-null whenever pn_socket_bind()
> returns -EINVAL, which seems odd, but the comment claims it's already bound,
> let's look at pn_socket_bind():
>
> net/phonet/socket.c:
> 156 static int pn_socket_bind(struct socket *sock, struct sockaddr *addr,
> int len) 157 {
> 158 struct sock *sk = sock->sk;
> 159 struct pn_sock *pn = pn_sk(sk);
> 160 struct sockaddr_pn *spn = (struct sockaddr_pn *)addr;
> 161 int err;
> 162 u16 handle;
> 163 u8 saddr;
> 164
> 165 if (sk->sk_prot->bind)
> 166 return sk->sk_prot->bind(sk, addr, len);
> 167
> 168 if (len < sizeof(struct sockaddr_pn))
> 169 return -EINVAL;
> 170 if (spn->spn_family != AF_PHONET)
> 171 return -EAFNOSUPPORT;
> 172
> 173 handle = pn_sockaddr_get_object((struct sockaddr_pn *)addr);
> 174 saddr = pn_addr(handle);
> 175 if (saddr && phonet_address_lookup(sock_net(sk), saddr))
> 176 return -EADDRNOTAVAIL;
> 177
> 178 lock_sock(sk);
> 179 if (sk->sk_state != TCP_CLOSE || pn_port(pn->sobject)) {
> 180 err = -EINVAL; /* attempt to rebind */
> 181 goto out;
> 182 }
> 183 WARN_ON(sk_hashed(sk));
> 184 mutex_lock(&port_mutex);
> 185 err = sk->sk_prot->get_port(sk, pn_port(handle));
> 186 if (err)
> 187 goto out_port;
> 188
> 189 /* get_port() sets the port, bind() sets the address if
> applicable */ 190 pn->sobject = pn_object(saddr,
> pn_port(pn->sobject));
> 191 pn->resource = spn->spn_resource;
> 192
> 193 /* Enable RX on the socket */
> 194 err = sk->sk_prot->hash(sk);
> 195 out_port:
> 196 mutex_unlock(&port_mutex);
> 197 out:
> 198 release_sock(sk);
> 199 return err;
> 200 }
>
>
> The first return branch in there simply hands off the bind to and
> indirection sk->sk_prot->bind() if present. This smells ripe for breaking
> the assumptions of that BUG_ON().
I believe that this is in line with the design of the socket stack within the
Linux kernel. 'struct proto_ops' carries the protocol family operations, then
'struct proto' carries the protocol operations.
Admittedly, Phonet only had one datagram and one stream protocol ever written,
as the hardware development ceased. So in practice, there is a 1:1 mapping
between the two, and sk_prot.bind is always NULL.
> I'm assuming there's no point for such an indirection if not to enable a
> potentially non-phonet-ops hook, otherwise we'd just be do the bind.
In my understanding, that's *not* what sk_prot is for, no. It's rather meant
to specialize the socket calls on a per-protocol basis.
For instance, UDP and UDP-Lite share their address family 'struct proto_ops'
(either inet_dgram_ops or inet6_dgram_ops) but they have different
'struct proto'.
> If not, isn't this plain recursive? Color me confused. Will other bind()
> implementations return -EINVAL when already bound with sobject set? no.
As far as I can find, there are no cases where the bind pointer would not be
NULL in upstream kernel. This can only happen if an out-of-tree module defines
its own protocol within the Phonet family.
> Furthermore, -EINVAL is also returned when len < sizeof(struct sockaddr_pn),
> maybe the rebind case should return -EADDRINUSE instead of -EINVAL?
bind() semantics require returning EINVAL if the socket address size is, well,
invalid.
If we are to distinguish the two error scenarii, then it's the rebind case
that needs a different error, but EINVAL is consistent with INET.
> I must be missing some things.
--
RÃmi Denis-Courmont
Tapiolan uusi kaupunki, Uudenmaan tasavalta