Re: [PATCH] nfc: pn533: Fix null-ptr-deref in pn533_recv_frame()

From: Krzysztof Kozlowski
Date: Wed Apr 24 2024 - 01:37:19 EST


On 22/04/2024 10:04, Yuxuan Hu wrote:
> From: Yuxuan Hu <yuxuanhu@xxxxxxxxxxx>
>
> Our fuzzing tool found a null-ptr-deref in function pn533_recv_frame
> (/drivers/nfc/pn533/pn533.c) in kernel 6.8.
>
> (1.1) When the NFC_CMD_START_POLL command is executed via netlink, the
> pn533_send_cmd_async function (/drivers/nfc/pn533/pn533.c: 1714) is
> called, which sends the PN533_CMD_IN_AUTOPOLL command packet.
>
> (2.1) If a pn533 response frame that does not match the command is
> received, the following call sequence is executed:
> pn533_recv_frame (/drivers/nfc/pn533/pn533.c: 2165)
> pn533_rx_frame_is_cmd_response (/drivers/nfc/pn533/pn533.c: 2194)
> pn533_wq_cmd_complete (/drivers/nfc/pn533/pn533.c: 2022)
> pn533_send_async_complete (/drivers/nfc/pn533/pn533.c: 547)
> pn533_autopoll_complete (/drivers/nfc/pn533/pn533.c: 414)
>
> (2.2) After completing (2.1), dev->cmd is freed and set to null
> (/drivers/nfc/pn533/pn533.c: 432-433).
>
> (3.1) If another incorrect pn533 response frame is received during
> the above process, (2.1) and (2.2) will be executed concurrently, and
> the initial process setting dev->cmd to null causes the concurrent
> process to trigger a null-ptr-deref in pn533_recv_frame.
>
> Although pn533_recv_frame checks for dev->cmd at the beginning, it is
> possible that dev->cmd is set to null after the check.

That sounds reasonable... but the solution does not.

>
> Through our verification, this concurrent vulnerability has a high
> probability of occurrence and needs to be fixed.
>
> Kernel print messages when null-ptr-deref is triggered (including PN533
> packets, PN533 module errors, and KASAN reports) are as follows.
> We added printk of the data packets, and printk before the relevant steps
> in pn533_send_async_complete and pn533_recv_frame.
>
> -> 00 00 FF 08 F8 D4 60 FF 03 00 11 12 04 A3 00
> <- 00 00 FF 00 FF 00
> <- 00 00 FF 0E F2 D5 86 01 10 09 01 00 20 08 04 9B 2C EE 9F 0A 00
> tty tty60: NFC: It it not the response to the last command
> arted polling nfc device
> <- 00 00 FF 03 FD D5 41 00 EA 00
> tty tty60: NFC: pn533_autopoll_complete autopoll complete error -5
> tty tty60: NFC: It it not the response to the last command
> tty tty60: NFC: Error -5 when running autopoll
> tty tty60: NFC: autopoll operation has been stopped
> pn533_send_async_complete: set dev->cmd to null!!!
> pn533_recv_frame: dev->cmd is null!!!
> BUG: kernel NULL pointer dereference, address: 0000000000000014
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0002) - not-present page
> PGD 0 P4D 0
> Oops: 0002 [#1] PREEMPT SMP KASAN PTI
> CPU: 0 PID: 5541 Comm: kworker/0:0 Tainted: G O 6.8.0 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> Workqueue: events nfcvirt_recv_work [nfcvirt]
> RIP: 0010:pn533_recv_frame+0x18a/0x1e0 [pn533]
> Code: 43 ff ff ff 48 8b bb 80 02 00 00 48 c7 c6 0b 02 46 c0 31 c0 e8 97 64 4f c4 48 83 bb b0 01 00 00 00 74 3f 48 8b 83 b0 01 00 00 <c7> 40 14 fb ff ff ff 48 8b 83 b0 01 00 00 48 85 c0 0f 85 3b ff ff
> RSP: 0018:ffff88802665fc68 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff88804a13b800 RCX: ad381de3b3cd5e00
> RDX: 1ffff11004ccbf38 RSI: 0000000000000008 RDI: ffff88802665f9e0
> RBP: ffff88804fb25000 R08: ffff88802665f9e7 R09: 1ffff11004ccbf3c
> R10: dffffc0000000000 R11: ffffed1004ccbf3d R12: 0000000000001950
> R13: ffff88804ab80000 R14: ffff888021d22640 R15: ffff88802665fcb0
> FS: 0000000000000000(0000) GS:ffff88806d200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000014 CR3: 0000000004ebc000 CR4: 00000000000006f0

Please trim the irrelevant register dumps from the log.

> Call Trace:
> <TASK>
> ? __die_body+0x62/0xb0
> ? page_fault_oops+0x421/0x740
> ? kernelmode_fixup_or_oops+0x1d0/0x1d0
> ? asan.module_ctor+0x10/0x10
> ? vprintk_emit+0x3f0/0x3f0
> ? kernelmode_fixup_or_oops+0x163/0x1d0
> ? do_user_addr_fault+0xb6c/0xde0
> ? irq_work_queue+0x54/0xa0
> ? do_kern_addr_fault+0x160/0x160
> ? __call_rcu_common+0x518/0xc30
> ? _dev_err+0x106/0x150
> ? exc_page_fault+0x66/0x1a0
> ? asm_exc_page_fault+0x22/0x30
> ? pn533_recv_frame+0x18a/0x1e0 [pn533]
> ? pn533_recv_frame+0x1d7/0x1e0 [pn533]
> nfcvirt_recv_work+0x24e/0x320 [nfcvirt]
> ? wake_bit_function+0x230/0x230
> process_one_work+0x4f0/0xab0
> worker_thread+0x8af/0xee0
> ? process_one_work+0xab0/0xab0
> kthread+0x275/0x300
> ? process_one_work+0xab0/0xab0
> ? kthread_blkcg+0xa0/0xa0
> ret_from_fork+0x30/0x60
> ? kthread_blkcg+0xa0/0xa0
> ret_from_fork_asm+0x11/0x20
> </TASK>
> Modules linked in: nfcvirt(O) pn533(O) nfc(O) ki_coverage(O) [last unloaded: pn533(O)]
> CR2: 0000000000000014
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:pn533_recv_frame+0x18a/0x1e0 [pn533]
> Code: 43 ff ff ff 48 8b bb 80 02 00 00 48 c7 c6 0b 02 46 c0 31 c0 e8 97 64 4f c4 48 83 bb b0 01 00 00 00 74 3f 48 8b 83 b0 01 00 00 <c7> 40 14 fb ff ff ff 48 8b 83 b0 01 00 00 48 85 c0 0f 85 3b ff ff
> RSP: 0018:ffff88802665fc68 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff88804a13b800 RCX: ad381de3b3cd5e00
> RDX: 1ffff11004ccbf38 RSI: 0000000000000008 RDI: ffff88802665f9e0
> RBP: ffff88804fb25000 R08: ffff88802665f9e7 R09: 1ffff11004ccbf3c
> R10: dffffc0000000000 R11: ffffed1004ccbf3d R12: 0000000000001950
> R13: ffff88804ab80000 R14: ffff888021d22640 R15: ffff88802665fcb0
> FS: 0000000000000000(0000) GS:ffff88806d200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000014 CR3: 0000000004ebc000 CR4: 00000000000006f0

Same here.

This just makes the commit log less readable.

>
> Signed-off-by: Yuxuan Hu <yuxuanhu@xxxxxxxxxxx>
> ---
> drivers/nfc/pn533/pn533.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/nfc/pn533/pn533.c b/drivers/nfc/pn533/pn533.c
> index b19c39dcfbd9..a80a23332f59 100644
> --- a/drivers/nfc/pn533/pn533.c
> +++ b/drivers/nfc/pn533/pn533.c
> @@ -2190,9 +2190,13 @@ void pn533_recv_frame(struct pn533 *dev, struct sk_buff *skb, int status)
>
> if (!dev->ops->rx_is_frame_valid(skb->data, dev)) {
> nfc_err(dev->dev, "Received an invalid frame\n");

Imagine here dev->cmd != NULL...

> + if (!dev->cmd)
> + goto sched_wq;

... but here it is being NULL-ified by pn533_send_async_complete(). How
does your solution prevent anything? I assume pn533_recv_frame() will be
executed in parallel to the workqueue.

A slightly better solution would be to NULL-ify dev->cmd at the beginning
of pn533_send_async_complete(), because that seems logical. The complete
callback takes ownership of dev->cmd, so why does it perform the
assignment at the end?
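
To illustrate the ownership idea, here is a rough user-space model, not
driver code and not tested against the kernel. The demo_* names and the
struct layouts are made up for the example, and the C11 atomic_exchange()
only stands in for whatever primitive the driver would really use
(presumably xchg() or a lock).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; this is not the real struct pn533 layout. */
struct demo_cmd {
	int status;
};

struct demo_dev {
	_Atomic(struct demo_cmd *) cmd;	/* models dev->cmd */
};

/*
 * Models the completion callback. The command is detached from the
 * device as the very first step, so everything after that works on a
 * private pointer that no other context can see through dev->cmd.
 * Returns false if no command was pending.
 */
static bool demo_send_async_complete(struct demo_dev *dev, int *status_out)
{
	struct demo_cmd *cmd;

	assert(dev != NULL && status_out != NULL);

	/* Take ownership: from here on dev->cmd reads as NULL elsewhere. */
	cmd = atomic_exchange(&dev->cmd, (struct demo_cmd *)NULL);
	if (!cmd)
		return false;	/* somebody else already completed it */

	*status_out = cmd->status;
	free(cmd);
	return true;
}
```

Only one caller can win the exchange, so the command is never freed
twice. A receive path that loaded dev->cmd just before the exchange
still holds a pointer to memory that is about to be freed.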

However, even the above change will keep the race open for a short
period. Probably some locking would solve it, or checking dev->cmd in a
few places with barriers.
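
For the locking variant, a minimal user-space sketch could look like the
following. Again this is only a model under assumptions: the demo_* names
and the cmd_lock field are hypothetical, a pthread mutex stands in for
whichever kernel lock fits the calling contexts, and -EIO is used because
it matches the -5 in your log.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; this is not the real struct pn533 layout. */
struct demo_cmd {
	int status;
};

struct demo_dev {
	pthread_mutex_t cmd_lock;	/* hypothetical lock guarding cmd */
	struct demo_cmd *cmd;		/* models dev->cmd */
};

/*
 * Receive path for a bad frame: the NULL check and the write to
 * cmd->status sit in one critical section, so dev->cmd cannot be
 * cleared between them. Returns true if a command was marked failed.
 */
static bool demo_recv_invalid_frame(struct demo_dev *dev)
{
	bool marked = false;

	assert(dev != NULL);

	pthread_mutex_lock(&dev->cmd_lock);
	if (dev->cmd) {
		dev->cmd->status = -EIO;
		marked = true;
	}
	pthread_mutex_unlock(&dev->cmd_lock);

	return marked;
}

/*
 * Completion path: detach the command under the same lock. The caller
 * owns the returned pointer and is responsible for freeing it.
 */
static struct demo_cmd *demo_detach_cmd(struct demo_dev *dev)
{
	struct demo_cmd *cmd;

	assert(dev != NULL);

	pthread_mutex_lock(&dev->cmd_lock);
	cmd = dev->cmd;
	dev->cmd = NULL;
	pthread_mutex_unlock(&dev->cmd_lock);

	return cmd;
}
```

Once demo_detach_cmd() returns, the receive path can only observe NULL,
so the detached command can be freed without the receive side touching
it.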

Best regards,
Krzysztof