Re: INFO: rcu detected stall in kfree_skbmem

From: Marcelo Ricardo Leitner
Date: Sun May 13 2018 - 12:02:51 EST


On Sun, May 13, 2018 at 03:52:01PM +0200, Dmitry Vyukov wrote:
> On Fri, May 11, 2018 at 10:42 PM, Marcelo Ricardo Leitner
> <marcelo.leitner@xxxxxxxxx> wrote:
> > On Fri, May 11, 2018 at 12:08:33PM -0700, Eric Dumazet wrote:
> >>
> >>
> >> On 05/11/2018 11:41 AM, Marcelo Ricardo Leitner wrote:
> >>
> >> > But calling ip6_xmit with rcu_read_lock is expected. tcp stack also
> >> > does it.
> >> > Thus I think this is more of an issue with IPv6 stack. If a host has
> >> > an extensive ip6tables ruleset, it probably generates this more
> >> > easily.
> >> >
> >> >>> sctp_v6_xmit+0x4a5/0x6b0 net/sctp/ipv6.c:225
> >> >>> sctp_packet_transmit+0x26f6/0x3ba0 net/sctp/output.c:650
> >> >>> sctp_outq_flush+0x1373/0x4370 net/sctp/outqueue.c:1197
> >> >>> sctp_outq_uncork+0x6a/0x80 net/sctp/outqueue.c:776
> >> >>> sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1820 [inline]
> >> >>> sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
> >> >>> sctp_do_sm+0x596/0x7160 net/sctp/sm_sideeffect.c:1191
> >> >>> sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406
> >> >>> call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
> >> >>> expire_timers kernel/time/timer.c:1363 [inline]
> >> >
> >> > Having this call from a timer means it wasn't processing sctp stack
> >> > for too long.
> >> >
> >>
> >> I feel the problem is that this part is looping, in some infinite loop.
> >>
> >> I have seen these stack traces in other reports.
> >
> > Checked mail history now, seems at least two other reports on RCU
> > stalls had sctp_generate_heartbeat_event involved.
> >
> >>
> >> Maybe some kind of list corruption.
> >
> > Could be.
> > Do we know if it generated a flood of packets?
>
> We only know what's in the bug reports. Do the other ones have

Ok.

> reproducers? It can make sense to mark them as duplicates to not have

No.

> a plethora of open bugs about the same root cause.

They may have the same root cause, but right now I cannot tell for
sure.