Re: [PATCH net] net/smc: Avoid setting clcsock options after clcsock released
From: Karsten Graul
Date: Tue Jan 11 2022 - 05:04:07 EST
On 10/01/2022 10:38, Wen Gu wrote:
> We encountered a crash in smc_setsockopt() and it is caused by
> accessing smc->clcsock after clcsock was released.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000020
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E 5.16.0-rc4+ #53
> RIP: 0010:smc_setsockopt+0x59/0x280 [smc]
> Call Trace:
> <TASK>
> __sys_setsockopt+0xfc/0x190
> __x64_sys_setsockopt+0x20/0x30
> do_syscall_64+0x34/0x90
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f16ba83918e
> </TASK>
>
> This patch tries to fix it by holding clcsock_release_lock and
> checking whether clcsock has already been released. To guard
> against a crash with the same cause in smc_getsockopt(), this patch
> also checks smc->clcsock in smc_getsockopt().
>
> Signed-off-by: Wen Gu <guwen@xxxxxxxxxxxxxxxxx>
> ---
> net/smc/af_smc.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index 1c9289f..af423f4 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -2441,6 +2441,11 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
> /* generic setsockopts reaching us here always apply to the
> * CLC socket
> */
> + mutex_lock(&smc->clcsock_release_lock);
> + if (!smc->clcsock) {
> + mutex_unlock(&smc->clcsock_release_lock);
> + return -EBADF;
> + }
> if (unlikely(!smc->clcsock->ops->setsockopt))
> rc = -EOPNOTSUPP;
> else
> @@ -2450,6 +2455,7 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
> sk->sk_err = smc->clcsock->sk->sk_err;
> sk_error_report(sk);
> }
> + mutex_unlock(&smc->clcsock_release_lock);

In the switch(), smc_switch_to_fallback() might be called, and it also
accesses smc->clcsock without any further check. Shouldn't that be
protected as well, and for all callers of smc_switch_to_fallback()?

There are more uses of smc->clcsock (e.g. smc_bind(), ...), so why does
this problem happen only in setsockopt() for you? I suspect it depends
on the test case.

I wonder whether it makes sense to check and protect smc->clcsock
everywhere in the code where it is used. So far we have not seen races
like the one you encountered, but in theory the same problem could
occur in other code paths.

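To make the "protect every use" idea concrete, here is a minimal userspace
sketch, not kernel code. It assumes a pthread mutex standing in for
clcsock_release_lock, and every name in it (fake_smc_sock, with_clcsock(),
release_clcsock(), set_opt()) is hypothetical. It only shows the shape of
one helper that centralizes the lock plus NULL check, so that call sites
cannot forget it. Whether something like this fits the real locking rules
(lock ordering against the sock lock, callers that already hold the mutex)
would still need to be checked.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real ones. */
struct fake_clcsock {
	int last_opt;
};

struct fake_smc_sock {
	pthread_mutex_t clcsock_release_lock;
	struct fake_clcsock *clcsock;	/* NULL once released */
};

typedef int (*clcsock_fn)(struct fake_clcsock *clc, int arg);

/* Run fn on the inner socket only while the release lock is held and the
 * pointer is still valid; otherwise return -EBADF, as the patch does.
 */
static int with_clcsock(struct fake_smc_sock *smc, clcsock_fn fn, int arg)
{
	int rc;

	assert(fn != NULL);
	pthread_mutex_lock(&smc->clcsock_release_lock);
	if (!smc->clcsock)
		rc = -EBADF;
	else
		rc = fn(smc->clcsock, arg);
	pthread_mutex_unlock(&smc->clcsock_release_lock);
	return rc;
}

/* Release path: clear the pointer under the same lock. */
static void release_clcsock(struct fake_smc_sock *smc)
{
	pthread_mutex_lock(&smc->clcsock_release_lock);
	smc->clcsock = NULL;
	pthread_mutex_unlock(&smc->clcsock_release_lock);
}

/* Example operation standing in for setsockopt(), bind(), ... */
static int set_opt(struct fake_clcsock *clc, int arg)
{
	clc->last_opt = arg;
	return 0;
}
```

A caller would then use with_clcsock(smc, set_opt, value) instead of
dereferencing smc->clcsock directly, and would see -EBADF once
release_clcsock() has run.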
>
> if (optlen < sizeof(int))
> return -EINVAL;
> @@ -2509,13 +2515,21 @@ static int smc_getsockopt(struct socket *sock, int level, int optname,
> char __user *optval, int __user *optlen)
> {
> struct smc_sock *smc;
> + int rc;
>
> smc = smc_sk(sock->sk);
> + mutex_lock(&smc->clcsock_release_lock);
> + if (!smc->clcsock) {
> + mutex_unlock(&smc->clcsock_release_lock);
> + return -EBADF;
> + }
> /* socket options apply to the CLC socket */
> if (unlikely(!smc->clcsock->ops->getsockopt))
> return -EOPNOTSUPP;
> - return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
> + rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
> optval, optlen);
> + mutex_unlock(&smc->clcsock_release_lock);
> + return rc;
> }
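Note that the early "return -EOPNOTSUPP" in this hunk is reached after
mutex_lock(), so it would leave clcsock_release_lock held. A single exit
label avoids that. The following is a userspace sketch of the pattern
only: the pthread mutex and all fake_* names are assumptions made for
illustration and are not the kernel definitions.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real structs look different. */
struct fake_ops {
	int (*getsockopt)(int optname, int *val);
};

struct fake_clcsock {
	const struct fake_ops *ops;
};

struct fake_smc_sock {
	pthread_mutex_t clcsock_release_lock;
	struct fake_clcsock *clcsock;	/* NULL once released */
};

/* Every return path goes through "out", so the lock is always dropped. */
static int fake_smc_getsockopt(struct fake_smc_sock *smc, int optname,
			       int *val)
{
	int rc;

	pthread_mutex_lock(&smc->clcsock_release_lock);
	if (!smc->clcsock) {
		rc = -EBADF;
		goto out;
	}
	if (!smc->clcsock->ops->getsockopt) {
		rc = -EOPNOTSUPP;
		goto out;
	}
	rc = smc->clcsock->ops->getsockopt(optname, val);
out:
	pthread_mutex_unlock(&smc->clcsock_release_lock);
	return rc;
}

/* Dummy handler for the sketch: reports twice the option number. */
static int fake_get(int optname, int *val)
{
	*val = optname * 2;
	return 0;
}
```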
>
> static int smc_ioctl(struct socket *sock, unsigned int cmd,
--
Karsten