Re: [PATCH net] net/smc: fix general protection fault in __smc_diag_dump

From: Wang Liang
Date: Thu Apr 03 2025 - 03:09:41 EST



在 2025/4/2 15:20, D. Wythe 写道:
On Wed, Apr 02, 2025 at 10:37:24AM +0800, Wang Liang wrote:
在 2025/4/1 19:01, Paolo Abeni 写道:
On 3/31/25 10:10 AM, Wang Liang wrote:
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 3e6cb35baf25..454801188514 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -371,6 +371,7 @@ void smc_sk_init(struct net *net, struct sock *sk, int protocol)
sk->sk_protocol = protocol;
WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
+ smc->clcsock = NULL;
INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
INIT_WORK(&smc->connect_work, smc_connect_work);
INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);
The syzkaller report has a few reproducers, have you tested this? AFAICS
the smc socket is already zeroed on allocation by sk_alloc().

Yes, I test it by the C repro:
https://syzkaller.appspot.com/text?tag=ReproC&x=13d2dc98580000

The C repro is provided by the 2025/02/27 15:16 crash from
  https://syzkaller.appspot.com/bug?extid=271fed3ed6f24600c364

After apply my patch, the crash no longer happens when running the C repro.

"the smc socket is already zeroed on allocation by sk_alloc()", That
is right.
However, smc->clcsock may be modified indirectly in inet6_create().
The process like this:

  __sys_socket
    __sys_socket_create
      sock_create
        __sock_create
          # pf->create
          inet6_create
            // init smc->clcsock = 0
            sk = sk_alloc()

            // set smc->clcsock to invalid address
            inet = inet_sk(sk);
            inet_assign_bit(IS_ICSK, sk, INET_PROTOSW_ICSK & answer_flags);
            inet6_set_bit(MC6_LOOP, sk);
            inet6_set_bit(MC6_ALL, sk);

            smc_inet_init_sock
              smc_sk_init
                // add sk to smc_hash
                smc_hash_sk
                  sk_add_node(sk, head);
              smc_create_clcsk
                // set smc->clcsock
                sock_create_kern(..., &smc->clcsock);)

So initialize smc->clcsock to NULL explicitly in smc_sk_init() can fix
this crash scene. If the problem can be reproduced after this patch, I
guess it is not the same reason, and fix it by another patch is more
appropriate.

This is actually because the current smc_sock is not an inet_sock,
leading to two modules simultaneously modifying the same offset in
memory but interpreting its structure differently. I previously proposed
embedding an inet(6)_sock at the beginning of smc_sock, but the
community had some objections...

I'm not sure on the community's current stance on this matter, but if a
fix is absolutely necessary, my recommendation would still be to embed
an inet(6)_sock within the smc_sock structure

D.

At present, I think initializing the smc in smc_sk_init() may be the most simple and effective method. :P


/P