Re: INFO: rcu detected stall in llcp_sock_sendmsg

From: Petr Mladek
Date: Tue Jul 10 2018 - 06:56:01 EST


On Mon 2018-07-09 14:05:08, Eric Dumazet wrote:
>
>
> On 07/09/2018 01:50 PM, Dmitry Vyukov wrote:
> > On Mon, Jul 9, 2018 at 10:34 PM, syzbot
> > <syzbot+e9f364d3b15ce41d8451@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >> Hello,
> >>
> >> syzbot found the following crash on:
> >>
> >> HEAD commit: 1e4b044d2251 Linux 4.18-rc4
> >> git tree: upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=1414c2c2400000
> >> kernel config: https://syzkaller.appspot.com/x/.config?x=25856fac4e580aa7
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=e9f364d3b15ce41d8451
> >> compiler: gcc (GCC) 8.0.1 20180413 (experimental)
> >>
> >> Unfortunately, I don't have any reproducer for this crash yet.
> >>
> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> Reported-by: syzbot+e9f364d3b15ce41d8451@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > Looks like the problem is actually in nfc, so +nfc maintainers.
>
> Note this issue was discussed before, maybe we should patch NFC without waiting for nfc maintainer.

Do you have any particular solution in mind, please? See below.

> ----------------------------------------------------
>
> On 06/25/2018 10:12 PM, Sergey Senozhatsky wrote:
> > On (06/26/18 07:07), Dmitry Vyukov wrote:
> > [..]
> >>> #include <net/nfc/nfc.h>
> >>> @@ -755,7 +756,8 @@ int nfc_llcp_send_ui_frame(struct nfc_llcp_sock *sock, u8 ssap, u8 dsap,
> >>>          pdu = nfc_alloc_send_skb(sock->dev, &sock->sk, MSG_DONTWAIT,
> >>>                                   frag_len + LLCP_HEADER_SIZE, &err);
> >>>          if (pdu == NULL) {
> >>> -                pr_err("Could not allocate PDU\n");
> >>> +                pr_err_ratelimited("Could not allocate PDU\n");
> >>> +                cond_resched();
> >>>                  continue;
> >>>          }
> >>
> >>
> >> But this thread is still in an infinite (unkillable?) loop? If yes, we
> >> are waiting for the next syzbot report
> >
> > The loop is still infinite, correct, but we have a preemption point now.
> > Sure, net people can come with a much better solution, I'll be happy to
> > scratch my patch.
> >
>
> This can not be the right solution, think about current thread being real time,
> cond_resched() might be a nop.
>
> We should probably not loop at all, or not use MSG_DONTWAIT.

Both of these solutions look promising, but they need to be
reviewed by someone familiar with the code.

On one hand, nfc_llcp_send_ui_frame() already returns some errors
before sending anything, so bailing out on the first allocation failure
would fit in. On the other hand, I am not sure how to deal with the
situation when a fragment of the message has already been sent.
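
Just to illustrate the "do not loop at all" variant, here is a small
user-space sketch, not a patch. Everything in it is hypothetical:
fake_alloc_pdu() only stands in for nfc_alloc_send_skb(), and
send_fragments() only mimics the shape of the fragment loop. It gives up
on the first allocation failure. Reporting a short write when some
fragments have already gone out is an assumption on my side; whether that
is acceptable here is exactly the open question.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the skb allocator: it succeeds only
 * while the budget of allocations has not been used up. */
struct fake_allocator {
	int remaining;	/* number of allocations that will still succeed */
};

static int fake_alloc_pdu(struct fake_allocator *a)
{
	if (a->remaining <= 0)
		return 0;	/* simulated allocation failure */
	a->remaining--;
	return 1;
}

/*
 * Send len bytes in fragments of at most frag_max bytes.
 *
 * Returns the number of bytes handed over, -ENOMEM when the very first
 * allocation fails, or -EINVAL for a zero fragment size.
 * The loop never retries a failed allocation, so it cannot spin forever.
 */
static long send_fragments(struct fake_allocator *a, size_t len,
			   size_t frag_max)
{
	size_t sent = 0;

	if (frag_max == 0)
		return -EINVAL;

	while (sent < len) {
		size_t frag_len = len - sent;

		if (frag_len > frag_max)
			frag_len = frag_max;

		if (!fake_alloc_pdu(a)) {
			/* Assumption: short write if something was sent. */
			return sent ? (long)sent : -ENOMEM;
		}

		/* Real code would copy the data and queue the PDU here. */
		sent += frag_len;
	}

	return (long)sent;
}
```

With a 100 byte message and 30 byte fragments, an allocator that fails
immediately yields -ENOMEM, while one that fails on the third allocation
yields a short count of 60.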

Best Regards,
Petr