Re: [syzbot] [net?] KASAN: slab-use-after-free Read in unix_stream_read_actor (2)

From: Shoaib Rao
Date: Fri Sep 06 2024 - 12:49:48 EST



On 9/6/2024 5:37 AM, Eric Dumazet wrote:
On Thu, Sep 5, 2024 at 10:48 PM Shoaib Rao <rao.shoaib@xxxxxxxxxx> wrote:

On 9/5/2024 1:35 PM, Kuniyuki Iwashima wrote:
From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
Date: Thu, 5 Sep 2024 13:15:18 -0700
On 9/5/2024 12:46 PM, Kuniyuki Iwashima wrote:
From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
Date: Thu, 5 Sep 2024 00:35:35 -0700
Hi All,

I am not able to reproduce the issue. I have run the C program at least
100 times in a loop. I do get an EFAULT (not sure whether that is
intentional), but no panic. Should I be doing something
differently? The kernel version I am using is
v6.11-rc6-70-gc763c4339688. Later I can try with the exact version.
The -EFAULT is the bug, meaning that we were trying to read a consumed skb.

But the first bug is in recvfrom(), which shouldn't be able to read the
OOB skb without MSG_OOB. It doesn't clear unix_sk(sk)->oob_skb, and later
something bad happens.

socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0
sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\333", iov_len=1}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_OOB|MSG_DONTWAIT) = 1
recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=NULL, msg_iovlen=0, msg_controllen=0, msg_flags=MSG_OOB}, MSG_OOB|MSG_WAITFORONE) = 1
sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\21", iov_len=1}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_OOB|MSG_NOSIGNAL|MSG_MORE) = 1
recvfrom(3, "\21", 125, MSG_DONTROUTE|MSG_TRUNC|MSG_DONTWAIT, NULL, NULL) = 1
recvmsg(3, {msg_namelen=0}, MSG_OOB|MSG_ERRQUEUE) = -1 EFAULT (Bad address)
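
To make the expected semantics concrete (a plain recv() must not hand
back an OOB byte), here is a minimal C sketch. It is not the syzbot
reproducer and does not replay the trace above: it only sends one OOB
byte and checks what a non-blocking recv() without MSG_OOB returns. The
names check_plain_recv_oob() and enum oob_result are made up for
illustration, and the sketch assumes the kernel supports MSG_OOB on
AF_UNIX stream sockets (otherwise it reports OOB_UNSUPPORTED).

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes, for illustration only. */
enum oob_result {
    OOB_UNSUPPORTED = -1, /* socketpair() failed or send(MSG_OOB) was rejected */
    OOB_LEAKED = 0,       /* plain recv() returned the OOB byte */
    OOB_HELD_BACK = 1     /* plain recv() did not return the OOB byte */
};

/*
 * Send a single OOB byte over an AF_UNIX stream socketpair, then try to
 * read it with a non-blocking recv() that does NOT pass MSG_OOB.
 */
enum oob_result check_plain_recv_oob(void)
{
    int sv[2];
    char c = 0;
    enum oob_result res;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return OOB_UNSUPPORTED;

    if (send(sv[1], "x", 1, MSG_OOB | MSG_DONTWAIT) != 1) {
        res = OOB_UNSUPPORTED;
    } else {
        /* MSG_DONTWAIT so that an empty queue yields -1 instead of blocking. */
        ssize_t n = recv(sv[0], &c, 1, MSG_DONTWAIT);
        res = (n == 1) ? OOB_LEAKED : OOB_HELD_BACK;
    }

    close(sv[0]);
    close(sv[1]);
    return res;
}
```

The sequence in the trace differs in that an earlier recvmsg(MSG_OOB)
has already consumed one OOB byte before the second sendmsg(), which is
the state in which recvfrom() returned "\21" without MSG_OOB.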

I posted a fix officially:
https://lore.kernel.org/netdev/20240905193240.17565-5-kuniyu@xxxxxxxxxx/
Thanks, that is great. Isn't EFAULT normally indicative of an issue
with the user-provided address of the buffer, not the kernel buffer?
Normally, it's used when copy_to_user() or copy_from_user() or
something similar failed.

But this time, if you turn KASAN off, you'll see the last recvmsg()
returns 1-byte garbage instead of -EFAULT, so actually KASAN worked
on your host, I guess.
No, it did not work. As soon as KASAN detected the read after free it
should have panicked as it did in the report, and I have been running the
syzbot's C program in a continuous loop. I would like to reproduce the
issue before we can accept the fix -- if that is alright with you. I
will try your new test case later and report back. Thanks for the patch
though.
KASAN does not panic unless you request it.

Documentation/dev-tools/kasan.rst

KASAN is affected by the generic ``panic_on_warn`` command line parameter.
When it is enabled, KASAN panics the kernel after printing a bug report.

By default, KASAN prints a bug report only for the first invalid memory access.
With ``kasan_multi_shot``, KASAN prints a report on every invalid access. This
effectively disables ``panic_on_warn`` for KASAN reports.

Alternatively, independent of ``panic_on_warn``, the ``kasan.fault=`` boot
parameter can be used to control panic and reporting behaviour:

- ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether
to only print a KASAN report, panic the kernel, or panic the kernel on
invalid writes only (default: ``report``). The panic happens even if
``kasan_multi_shot`` is enabled. Note that when using asynchronous mode of
Hardware Tag-Based KASAN, ``kasan.fault=panic_on_write`` always panics on
asynchronously checked accesses (including reads).
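
Given the interactions quoted above, it can help to check what the
running kernel was actually booted with. The following is a rough C
sketch, not an existing tool: cmdline_has_param() and read_int_file()
are hypothetical helper names. It assumes the boot parameters are
visible in /proc/cmdline and that the sysctl lives at
/proc/sys/kernel/panic_on_warn.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * True if `name` appears in `cmdline` as a whole space-separated token,
 * either bare ("kasan_multi_shot") or followed by '=' ("kasan.fault=panic").
 */
bool cmdline_has_param(const char *cmdline, const char *name)
{
    size_t len = strlen(name);
    const char *p = cmdline;

    if (len == 0)
        return false;

    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == cmdline) || p[-1] == ' ';
        char end = p[len];
        bool ends_token = end == '\0' || end == ' ' || end == '\n' || end == '=';

        if (starts_token && ends_token)
            return true;
        p += len; /* keep scanning after this partial match */
    }
    return false;
}

/* Read a small integer from a procfs-style file; returns -1 if unreadable. */
int read_int_file(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A caller would read /proc/cmdline into a buffer, pass it to
cmdline_has_param() with "kasan_multi_shot" and "kasan.fault", and
compare that with read_int_file("/proc/sys/kernel/panic_on_warn"). If
kasan_multi_shot is present, a panic_on_warn value of 1 is not expected
to make KASAN reports panic, per the documentation above.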

Hi Eric,

Thanks for the update. I forgot to mention that I did set
/proc/sys/kernel/panic_on_warn to 1. I ran the program overnight in two
separate windows; there are no reports and no panic. I want to reproduce
the issue first, because if I cannot, how can I be sure that I have
fixed that bug? I may find another issue and fix it, but not the one
that I was trying to fix. Please be assured that I am not done; I
continue to investigate the issue.

If someone has a way of reproducing the failure please kindly let me know.

Kind regards,

Shoaib