Re: INFO: rcu detected stall in xfrm_confirm_neigh

From: Steffen Klassert
Date: Tue Feb 20 2018 - 04:53:33 EST


On Mon, Feb 19, 2018 at 11:05:38AM +0100, Dmitry Vyukov wrote:
> On Mon, Feb 19, 2018 at 8:22 AM, Steffen Klassert
> <steffen.klassert@xxxxxxxxxxx> wrote:
> >> > <syzbot+7d03c810e50aaedef98a@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >> >> Hello,
> >> >>
> >> >> syzbot hit the following crash on net-next commit
> >> >> 9515a2e082f91457db0ecff4b65371d0fb5d9aad (Thu Jan 25 03:37:38 2018 +0000)
> >> >> net/ipv4: Allow send to local broadcast from a socket bound to a VRF
> >> >>
> >> >> So far this crash happened 6 times on net-next.
> >> >> Unfortunately, I don't have any reproducer for this crash yet.
> >> >> Raw console output is attached.
> >> >> compiler: gcc (GCC) 7.1.1 20170620
> >> >> .config is attached.
> >> >
> >> >
> >> > +xfrm maintainers
> >>
> >> Here is a C repro:
> >> https://gist.githubusercontent.com/dvyukov/92c67ba9afaaa960bcfbdc6ef549ac10/raw/786f9221c1d707c7f4a15effcb1d5997dd4f8638/gistfile1.txt
> >
> > Seems like syzbot does not know about this reproducer.
> >
> > I've send a patch to test and got this as the reply:
> >
> > This crash does not have a reproducer. I cannot test it.
>
> Yes, it does not know about the reproducer. I've extracted it
> manually, these hangs are sometimes hard to reproduce. For syzbot this
> bug does not have a reproducer.
> Have you tried to run the reproducer? For me it reproduced the bug
> quite reliably.

I've tried the reproducer, it does not trigger with my kernel
config and with your config my machine does not boot. So it
was just easy to ask syzbot for a test.

Anyway, Florian Westphal did a test and confirmed that
the patch fixes the issue. I've just applied the fix
below to the ipsec tree.


Subject: [PATCH] xfrm: Fix infinite loop in xfrm_get_dst_nexthop with transport mode.

On transport mode we forget to fetch the child dst_entry
before we continue the while loop, this leads to an infinite
loop. Fix this by fetching the child dst_entry before we
continue the while loop.

Fixes: 0f6c480f23f4 ("xfrm: Move dst->path into struct xfrm_dst")
Reported-by: syzbot+7d03c810e50aaedef98a@xxxxxxxxxxxxxxxxxxxxxxxxx
Tested-by: Florian Westphal <fw@xxxxxxxxx>
Signed-off-by: Steffen Klassert <steffen.klassert@xxxxxxxxxxx>
---
net/xfrm/xfrm_policy.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 150d46633ce6..625b3fca5704 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2732,14 +2732,14 @@ static const void *xfrm_get_dst_nexthop(const struct dst_entry *dst,
while (dst->xfrm) {
const struct xfrm_state *xfrm = dst->xfrm;

+ dst = xfrm_dst_child(dst);
+
if (xfrm->props.mode == XFRM_MODE_TRANSPORT)
continue;
if (xfrm->type->flags & XFRM_TYPE_REMOTE_COADDR)
daddr = xfrm->coaddr;
else if (!(xfrm->type->flags & XFRM_TYPE_LOCAL_COADDR))
daddr = &xfrm->id.daddr;
-
- dst = xfrm_dst_child(dst);
}
return daddr;
}
--
2.14.1