Re: [syzbot] [net?] KASAN: slab-use-after-free Read in unix_stream_read_actor (2)

From: Shoaib Rao
Date: Sat Sep 07 2024 - 01:06:59 EST



On 9/6/2024 9:48 AM, Shoaib Rao wrote:

On 9/6/2024 5:37 AM, Eric Dumazet wrote:
On Thu, Sep 5, 2024 at 10:48 PM Shoaib Rao <rao.shoaib@xxxxxxxxxx> wrote:

On 9/5/2024 1:35 PM, Kuniyuki Iwashima wrote:
From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
Date: Thu, 5 Sep 2024 13:15:18 -0700
On 9/5/2024 12:46 PM, Kuniyuki Iwashima wrote:
From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
Date: Thu, 5 Sep 2024 00:35:35 -0700
Hi All,

I am not able to reproduce the issue. I have run the C program at least
100 times in a loop. In the end I do get an EFAULT (not sure whether that
is intentional), but no panic. Should I be doing something
differently? The kernel version I am using is
v6.11-rc6-70-gc763c4339688. Later I can try with the exact version.
The -EFAULT is the bug: it means we were trying to read a consumed skb.

But the first bug is in recvfrom(), which shouldn't be able to read the OOB
skb without MSG_OOB. That read doesn't clear unix_sk(sk)->oob_skb, and later
something bad happens.

     socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0
     sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\333", iov_len=1}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_OOB|MSG_DONTWAIT) = 1
     recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=NULL, msg_iovlen=0, msg_controllen=0, msg_flags=MSG_OOB}, MSG_OOB|MSG_WAITFORONE) = 1
     sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\21", iov_len=1}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_OOB|MSG_NOSIGNAL|MSG_MORE) = 1
     recvfrom(3, "\21", 125, MSG_DONTROUTE|MSG_TRUNC|MSG_DONTWAIT, NULL, NULL) = 1
     recvmsg(3, {msg_namelen=0}, MSG_OOB|MSG_ERRQUEUE) = -1 EFAULT (Bad address)
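
For anyone who wants to step through the traced sequence by hand, here is a minimal sketch in C. It is not the syzbot reproducer. The helper names (ret_or_errno, replay_oob_sequence) are hypothetical, the flags are reduced to MSG_OOB and MSG_DONTWAIT, and the two MSG_OOB reads use a plain 1-byte buffer instead of the msghdr layout in the trace. All of those are assumptions made to keep the example short.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: report a syscall result as "bytes" or "-errno". */
static long ret_or_errno(ssize_t ret)
{
	return ret < 0 ? -(long)errno : (long)ret;
}

/*
 * Replay the traced calls on a fresh AF_UNIX stream pair and store each
 * result in res[0..4]. Returns 0 if the pair was created, -1 otherwise.
 * Assumption: only MSG_OOB and MSG_DONTWAIT are kept from the trace.
 */
int replay_oob_sequence(long res[5])
{
	char buf[125];
	char oob;
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* 1. Send the first OOB byte. */
	res[0] = ret_or_errno(send(sv[1], "\333", 1, MSG_OOB | MSG_DONTWAIT));
	/* 2. Consume it with MSG_OOB. */
	res[1] = ret_or_errno(recv(sv[0], &oob, 1, MSG_OOB));
	/* 3. Send a second OOB byte. */
	res[2] = ret_or_errno(send(sv[1], "\21", 1, MSG_OOB | MSG_DONTWAIT));
	/* 4. Read WITHOUT MSG_OOB: this is the read that should not see it. */
	res[3] = ret_or_errno(recv(sv[0], buf, sizeof(buf), MSG_DONTWAIT));
	/* 5. Read with MSG_OOB again: the call that failed in the trace. */
	res[4] = ret_or_errno(recv(sv[0], &oob, 1, MSG_OOB));

	close(sv[0]);
	close(sv[1]);
	return 0;
}
```

Calling replay_oob_sequence() and printing res[3] and res[4] is enough to compare a host against the trace, where step 4 returned 1 and step 5 returned -EFAULT. The results depend on the kernel under test, so none are asserted here.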

I posted a fix officially:
https://lore.kernel.org/netdev/20240905193240.17565-5-kuniyu@xxxxxxxxxx/
Thanks, that is great. Isn't EFAULT normally indicative of an issue
with the user-provided buffer address, not the kernel buffer?
Normally, it's used when copy_to_user() or copy_from_user() or
something similar failed.
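
To illustrate the usual meaning of EFAULT from user space, here is a small sketch that reads from a pipe into a page mapped with PROT_NONE, so the kernel's copy to user memory cannot succeed. The function name is hypothetical and the 4096-byte mapping length is an assumption (mmap rounds it up to a page anyway).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns the errno produced by read() into an inaccessible user buffer,
 * 0 if the read unexpectedly succeeded, or -1 if the setup failed.
 */
int errno_for_bad_user_buffer(void)
{
	void *bad;
	ssize_t n;
	int err = 0;
	int p[2];

	if (pipe(p) < 0)
		return -1;

	/* A mapping with no access rights: any copy into it must fault. */
	bad = mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (bad == MAP_FAILED || write(p[1], "x", 1) != 1) {
		if (bad != MAP_FAILED)
			munmap(bad, 4096);
		close(p[0]);
		close(p[1]);
		return -1;
	}

	n = read(p[0], bad, 1);		/* kernel cannot copy the byte out */
	if (n < 0)
		err = errno;

	munmap(bad, 4096);
	close(p[0]);
	close(p[1]);
	return err;
}
```

This is the ordinary user-buffer case. The EFAULT in the trace above is different in that the user buffer is not the problem.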

But this time, if you turn KASAN off, you'll see the last recvmsg()
returns 1-byte garbage instead of -EFAULT, so actually KASAN worked
on your host, I guess.
No, it did not work. As soon as KASAN detected the read after free it should
have panicked, as it did in the report, and I have been running
syzbot's C program in a continuous loop. I would like to reproduce the
issue before we accept the fix, if that is alright with you. I
will try your new test case later and report back. Thanks for the patch
though.
KASAN does not panic unless you request it.

Documentation/dev-tools/kasan.rst

KASAN is affected by the generic ``panic_on_warn`` command line parameter.
When it is enabled, KASAN panics the kernel after printing a bug report.

By default, KASAN prints a bug report only for the first invalid memory access.
With ``kasan_multi_shot``, KASAN prints a report on every invalid access. This
effectively disables ``panic_on_warn`` for KASAN reports.

Alternatively, independent of ``panic_on_warn``, the ``kasan.fault=`` boot
parameter can be used to control panic and reporting behaviour:

- ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether
   to only print a KASAN report, panic the kernel, or panic the kernel on
   invalid writes only (default: ``report``). The panic happens even if
   ``kasan_multi_shot`` is enabled. Note that when using asynchronous mode of
   Hardware Tag-Based KASAN, ``kasan.fault=panic_on_write`` always panics on
   asynchronously checked accesses (including reads).
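
When comparing hosts, it can help to record which of these settings is actually in effect. The sketch below is one way to do that, with hypothetical helper names. It extracts the kasan.fault= value from a command-line string (for example the contents of /proc/cmdline) and reads /proc/sys/kernel/panic_on_warn. Taking the last occurrence of the parameter is an assumption about how repeated parameters are resolved.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the value of "kasan.fault=" from cmdline into out. Falls back to
 * "report", the documented default, when the parameter is absent.
 * Assumption: if the parameter is repeated, the last occurrence wins.
 */
void kasan_fault_mode(const char *cmdline, char *out, size_t outlen)
{
	static const char key[] = "kasan.fault=";
	const size_t keylen = sizeof(key) - 1;
	const char *p = cmdline;

	if (outlen == 0)
		return;
	snprintf(out, outlen, "report");

	while ((p = strstr(p, key)) != NULL) {
		/* Accept the key only at the start of a token. */
		if (p == cmdline || p[-1] == ' ') {
			size_t n = strcspn(p + keylen, " \n");

			if (n >= outlen)
				n = outlen - 1;
			memcpy(out, p + keylen, n);
			out[n] = '\0';
		}
		p += keylen;
	}
}

/* Returns the panic_on_warn sysctl value, or -1 if it cannot be read. */
int read_panic_on_warn(void)
{
	FILE *f = fopen("/proc/sys/kernel/panic_on_warn", "r");
	int v = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &v) != 1)
		v = -1;
	fclose(f);
	return v;
}
```

Logging both values next to each test run makes it easier to tell "KASAN reported but did not panic" apart from "KASAN never fired".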

Hi Eric,

Thanks for the update. I forgot to mention that I did set /proc/sys/kernel/panic_on_warn to 1. I ran the program overnight in two separate windows; there are no reports and no panic. I want to reproduce the issue first, because if I cannot, how can I be sure that I have fixed that bug? I may find and fix another issue, but not the one I was trying to fix. Please be assured that I am not done; I continue to investigate the issue.

If someone has a way of reproducing the failure please kindly let me know.

Kind regards,

Shoaib

I have tried to reproduce it using the newly added tests, but no luck. I will keep trying, but if there is another occurrence please let me know. I am using an AMD system, but that should not have any impact.

Shoaib

[root@turbo-2 af_unix]# git diff msg_oob.c
diff --git a/tools/testing/selftests/net/af_unix/msg_oob.c b/tools/testing/selftests/net/af_unix/msg_oob.c
index 535eb2c3d7d1..5fedb55adcf2 100644
--- a/tools/testing/selftests/net/af_unix/msg_oob.c
+++ b/tools/testing/selftests/net/af_unix/msg_oob.c
@@ -525,6 +525,30 @@ TEST_F(msg_oob, ex_oob_drop_2)
      }
 }
+TEST_F(msg_oob, ex_oob_oob)
+{
+       sendpair("x", 1, MSG_OOB);
+       epollpair(true);
+       siocatmarkpair(true);
+
+       recvpair("x", 1, 1, MSG_OOB);
+       epollpair(false);
+       siocatmarkpair(true);
+
+       sendpair("y", 1, MSG_OOB);
+       epollpair(true);
+       siocatmarkpair(true);
+
+       recvpair("", -EAGAIN, 1, 0);
+       epollpair(false);
+       siocatmarkpair(false);
+
+       recvpair("", -EINVAL, 1, MSG_OOB);
+       epollpair(false);
+       siocatmarkpair(false);
+}
+
+
 TEST_F(msg_oob, ex_oob_ahead_break)
 {
      sendpair("hello", 5, MSG_OOB);

[root@turbo-2 af_unix]# rm msg_oob
rm: remove regular file 'msg_oob'? y
[root@turbo-2 af_unix]# make msg_oob
gcc -isystem /home/rshoaib/debug_pnic/linux/tools/testing/selftests/../../../usr/include -D_GNU_SOURCE=     msg_oob.c   -o msg_oob

[root@turbo-2 af_unix]# echo 1 > /proc/sys/kernel/panic_on_warn

./msg_oob
TAP version 13
1..40
# Starting 40 tests from 2 test cases.
#  RUN           msg_oob.no_peek.non_oob ...
#            OK  msg_oob.no_peek.non_oob
ok 1 msg_oob.no_peek.non_oob
#  RUN           msg_oob.no_peek.oob ...
#            OK  msg_oob.no_peek.oob
ok 2 msg_oob.no_peek.oob
#  RUN           msg_oob.no_peek.oob_drop ...
#            OK  msg_oob.no_peek.oob_drop
ok 3 msg_oob.no_peek.oob_drop
#  RUN           msg_oob.no_peek.oob_ahead ...
#            OK  msg_oob.no_peek.oob_ahead
ok 4 msg_oob.no_peek.oob_ahead
#  RUN           msg_oob.no_peek.oob_break ...
#            OK  msg_oob.no_peek.oob_break
ok 5 msg_oob.no_peek.oob_break
#  RUN           msg_oob.no_peek.oob_ahead_break ...
#            OK  msg_oob.no_peek.oob_ahead_break
ok 6 msg_oob.no_peek.oob_ahead_break
#  RUN           msg_oob.no_peek.oob_break_drop ...
#            OK  msg_oob.no_peek.oob_break_drop
ok 7 msg_oob.no_peek.oob_break_drop
#  RUN           msg_oob.no_peek.ex_oob_break ...
#            OK  msg_oob.no_peek.ex_oob_break
ok 8 msg_oob.no_peek.ex_oob_break
#  RUN           msg_oob.no_peek.ex_oob_drop ...
# msg_oob.c:242:ex_oob_drop:AF_UNIX :x
# msg_oob.c:243:ex_oob_drop:TCP     :Resource temporarily unavailable
# msg_oob.c:242:ex_oob_drop:AF_UNIX :y
# msg_oob.c:243:ex_oob_drop:TCP     :Invalid argument
#            OK  msg_oob.no_peek.ex_oob_drop
ok 9 msg_oob.no_peek.ex_oob_drop
#  RUN           msg_oob.no_peek.ex_oob_drop_2 ...
# msg_oob.c:242:ex_oob_drop_2:AF_UNIX :x
# msg_oob.c:243:ex_oob_drop_2:TCP     :Resource temporarily unavailable
#            OK  msg_oob.no_peek.ex_oob_drop_2
ok 10 msg_oob.no_peek.ex_oob_drop_2
#  RUN           msg_oob.no_peek.ex_oob_oob ...
# msg_oob.c:305:ex_oob_oob:Expected answ[0] (0) == oob_head (1)
# ex_oob_oob: Test terminated by assertion
#          FAIL  msg_oob.no_peek.ex_oob_oob
<...>
ok 38 msg_oob.peek.inline_ex_oob_no_drop
#  RUN           msg_oob.peek.inline_ex_oob_drop ...
# msg_oob.c:267:inline_ex_oob_drop:AF_UNIX :x
# msg_oob.c:268:inline_ex_oob_drop:TCP     :y
# msg_oob.c:267:inline_ex_oob_drop:AF_UNIX :x
# msg_oob.c:268:inline_ex_oob_drop:TCP     :y
# msg_oob.c:242:inline_ex_oob_drop:AF_UNIX :y
# msg_oob.c:243:inline_ex_oob_drop:TCP     :Resource temporarily unavailable
# msg_oob.c:242:inline_ex_oob_drop:AF_UNIX :y
# msg_oob.c:243:inline_ex_oob_drop:TCP     :Resource temporarily unavailable
#            OK  msg_oob.peek.inline_ex_oob_drop
ok 39 msg_oob.peek.inline_ex_oob_drop
#  RUN           msg_oob.peek.inline_ex_oob_siocatmark ...
#            OK  msg_oob.peek.inline_ex_oob_siocatmark
ok 40 msg_oob.peek.inline_ex_oob_siocatmark
# FAILED: 38 / 40 tests passed.
# Totals: pass:38 fail:2 xfail:0 xpass:0 skip:0 error:0

[root@turbo-2 af_unix]# uname -r
6.11.0-rao-rc6-gc763c4339688-dirty

[root@turbo-2 af_unix]# journalctl -r | grep -i kasan
Sep  6 21:15:25 turbo-2 kernel: kasan: KernelAddressSanitizer initialized