Re: [PATCH 1/7] nfc: llcp: fix NULL error pointer dereference on sendmsg() after failed bind()

From: Krzysztof Kozlowski
Date: Sat Jan 15 2022 - 07:31:33 EST


On 15/01/2022 13:26, Krzysztof Kozlowski wrote:
> Syzbot detected a NULL pointer dereference of nfc_llcp_sock->dev pointer
> (which is a 'struct nfc_dev *') with calls to llcp_sock_sendmsg() after
> a failed llcp_sock_bind(). The message being sent is a SOCK_DGRAM.
>
> KASAN report:
>
> BUG: KASAN: null-ptr-deref in nfc_alloc_send_skb+0x2d/0xc0
> Read of size 4 at addr 00000000000005c8 by task llcp_sock_nfc_a/899
>
> CPU: 5 PID: 899 Comm: llcp_sock_nfc_a Not tainted 5.16.0-rc6-next-20211224-00001-gc6437fbf18b0 #125
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl+0x45/0x59
> ? nfc_alloc_send_skb+0x2d/0xc0
> __kasan_report.cold+0x117/0x11c
> ? mark_lock+0x480/0x4f0
> ? nfc_alloc_send_skb+0x2d/0xc0
> kasan_report+0x38/0x50
> nfc_alloc_send_skb+0x2d/0xc0
> nfc_llcp_send_ui_frame+0x18c/0x2a0
> ? nfc_llcp_send_i_frame+0x230/0x230
> ? __local_bh_enable_ip+0x86/0xe0
> ? llcp_sock_connect+0x470/0x470
> ? llcp_sock_connect+0x470/0x470
> sock_sendmsg+0x8e/0xa0
> ____sys_sendmsg+0x253/0x3f0
> ...
>
> The issue was visible only with multiple simultaneous calls to bind() and
> sendmsg(), which resulted in most of the bind() calls to fail. The
> bind() was failing on checking if there is available WKS/SDP/SAP
> (respective bit in 'struct nfc_llcp_local' fields). When there was no
> available WKS/SDP/SAP, the bind returned error but the sendmsg() to such
> socket was able to trigger mentioned NULL pointer dereference of
> nfc_llcp_sock->dev.
>
> The code looks simply racy and currently it protects several paths
> against race with checks for (!nfc_llcp_sock->local) which is NULL-ified
> in error paths of bind(). The llcp_sock_sendmsg() did not have such
> check but called function nfc_llcp_send_ui_frame() had, although not
> protected with lock_sock().
>
> Therefore the race could look like (same socket is used all the time):
> CPU0 CPU1
> ==== ====
> llcp_sock_bind()
> - lock_sock()
> - success
> - release_sock()
> - return 0
> llcp_sock_sendmsg()
> - lock_sock()
> - release_sock()
> llcp_sock_bind(), same socket
> - lock_sock()
> - error
> - nfc_llcp_send_ui_frame()
> - if (!llcp_sock->local)
> - llcp_sock->local = NULL
> - nfc_put_device(dev)
> - dereference llcp_sock->dev
> - release_sock()
> - return -ERRNO
>
> The nfc_llcp_send_ui_frame() checked llcp_sock->local outside of the
> lock, which is racy and ineffective check. Instead, its caller
> llcp_sock_sendmsg(), should perform the check inside lock_sock().
>
> Reported-by: syzbot+7f23bcddf626e0593a39@xxxxxxxxxxxxxxxxxxxxxxxxx

Syzbot confirmed fix, so this could be replaced with:

Reported-and-tested-by: syzbot+7f23bcddf626e0593a39@xxxxxxxxxxxxxxxxxxxxxxxxx


Best regards,
Krzysztof