Re: [PATCH bpf-next] xsk: proper socket state check in xsk_poll

From: Daniel Borkmann
Date: Tue Aug 20 2019 - 17:24:33 EST

On 8/20/19 5:29 PM, BjÃrn TÃpel wrote:
On Tue, 20 Aug 2019 at 16:30, Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
On 8/20/19 12:04 PM, BjÃrn TÃpel wrote:
From: BjÃrn TÃpel <bjorn.topel@xxxxxxxxx>

The poll() implementation for AF_XDP sockets did not perform the
proper state checks, prior accessing the socket umem. This patch fixes
that by performing a xsk_is_bound() check.

Suggested-by: Hillf Danton <hdanton@xxxxxxxx>
Reported-by: syzbot+c82697e3043781e08802@xxxxxxxxxxxxxxxxxxxxxxxxx
Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings")
Signed-off-by: BjÃrn TÃpel <bjorn.topel@xxxxxxxxx>
net/xdp/xsk.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index ee4428a892fa..08bed5e92af4 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -356,13 +356,20 @@ static int xsk_generic_xmit(struct sock *sk, struct msghdr *m,
return err;

+static bool xsk_is_bound(struct xdp_sock *xs)
+ struct net_device *dev = READ_ONCE(xs->dev);
+ return dev && xs->state == XSK_BOUND;
static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
bool need_wait = !(m->msg_flags & MSG_DONTWAIT);
struct sock *sk = sock->sk;
struct xdp_sock *xs = xdp_sk(sk);

- if (unlikely(!xs->dev))
+ if (unlikely(!xsk_is_bound(xs)))
return -ENXIO;
if (unlikely(!(xs->dev->flags & IFF_UP)))
return -ENETDOWN;
@@ -383,6 +390,9 @@ static unsigned int xsk_poll(struct file *file, struct socket *sock,
struct net_device *dev = xs->dev;
struct xdp_umem *umem = xs->umem;

+ if (unlikely(!xsk_is_bound(xs)))
+ return mask;
if (umem->need_wakeup)
dev->netdev_ops->ndo_xsk_wakeup(dev, xs->queue_id,
@@ -417,7 +427,7 @@ static void xsk_unbind_dev(struct xdp_sock *xs)
struct net_device *dev = xs->dev;

- if (!dev || xs->state != XSK_BOUND)
+ if (!xsk_is_bound(xs))

I think I'm a bit confused by your READ_ONCE() usage. ;-/ I can see why you're
using it in xsk_is_bound() above, but then at the same time all the other callbacks
like xsk_poll() or xsk_unbind_dev() above have a struct net_device *dev = xs->dev
right before the test. Could you elaborate?

Yes, now I'm confused as well! Digging deeper... I believe there are a
couple of places in xsk.c that do not have
READ_ONCE/WRITE_ONCE-correctness. Various xdp_sock members are read
lock-less outside the control plane mutex (mutex member of struct
xdp_sock). This needs some re-work. I'll look into using the newly

Right, so even in above two cases, the compiler could have refetched, e.g.
dev variable could have first been NULL, but xsk_is_bound() later returns

introduced state member (with corresponding read/write barriers) for

I'll cook some patch(es) that address this, but first it sounds like I
need to reread [1] two, or three times. At least. ;-)