Re: [PATCH] net: do not release sk in sk_wait_event

From: Paolo Abeni
Date: Thu Aug 15 2024 - 06:40:28 EST




On 8/15/24 10:49, sunyiqi wrote:
While investigating the kcm socket UAF, which was also found by syzbot,
I found that the root cause of this problem is actually in
sk_wait_event.

sk_wait_event, which is called by sk_stream_wait_memory, releases and
relocks the sk. Protocols like tcp, kcm, etc. call it from ops
functions like *sendmsg, which lock the sk at the beginning.
But sk_stream_wait_memory releases the sk unexpectedly and destroys
the thread safety. In the end this causes the kcm sk UAF.

If one thread (thread A) calls sk_stream_wait_memory while another
thread (thread B) is waiting for the lock in lock_sock, thread B will
successfully get the sk lock when thread A releases it in
sk_wait_event.

Thread B may then change the sk in a way that thread A does not expect.

As a result, the kernel can end up with unexpected behavior, just like
the kcm sk UAF, which is actually caused by sk_wait_event in
sk_stream_wait_memory.

The previous commit d9dc8b0f8b4e ("net: fix sleeping for sk_wait_event()")
from 2016 does not seem to have solved this problem. Is it necessary to
release the sock in sk_wait_event? Or can the release simply be deleted
to make the protocol ops thread-safe?

As I wrote previously, please describe the suspected race more clearly, with the exact call sequence that leads to the UAF.

Releasing the socket lock is not enough to cause UAF.

Removing the release/lock pair in sk_wait_event() will break many protocols (e.g. TCP), as the stack will not be able to land packets in the receive queue while the socket lock is owned.
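
To illustrate the general point, here is a minimal user-space sketch. It is only a loose analogy, not kernel code: `fake_sock`, `free_memory()`, `wait_for_memory()` and `run_demo()` are hypothetical names, and a pthread mutex plus condition variable stand in for the socket lock and wait queue. It does not model the kernel's actual locking or backlog handling. The waiter has to give up the lock while it sleeps, otherwise the other side can never make the condition true, and it has to re-check the state once it holds the lock again.

```c
#include <pthread.h>

/* Hypothetical user-space stand-in for a socket; NOT the kernel's struct sock. */
struct fake_sock {
    pthread_mutex_t lock;     /* plays the role of the socket lock */
    pthread_cond_t  wake;     /* plays the role of the wait queue */
    int             free_mem; /* plays the role of available send memory */
};

/* Analogue of the path that frees memory: it needs the lock to make progress. */
static void *free_memory(void *arg)
{
    struct fake_sock *sk = arg;

    pthread_mutex_lock(&sk->lock);
    sk->free_mem = 1;
    pthread_cond_signal(&sk->wake);
    pthread_mutex_unlock(&sk->lock);
    return NULL;
}

/* Analogue of a sendmsg-like path waiting for memory. Returns 0 on success. */
static int wait_for_memory(struct fake_sock *sk)
{
    pthread_mutex_lock(&sk->lock);
    /*
     * pthread_cond_wait() drops the lock while sleeping and re-takes it
     * before returning. Anything may have changed in between, so the
     * condition is re-checked in a loop after the lock is held again.
     */
    while (sk->free_mem == 0)
        pthread_cond_wait(&sk->wake, &sk->lock);
    sk->free_mem = 0; /* consume the memory while holding the lock */
    pthread_mutex_unlock(&sk->lock);
    return 0;
}

/* Runs one waiter and one producer; returns 0 if the waiter completed. */
int run_demo(void)
{
    struct fake_sock sk = { .free_mem = 0 };
    pthread_t producer;
    int rc;

    pthread_mutex_init(&sk.lock, NULL);
    pthread_cond_init(&sk.wake, NULL);

    if (pthread_create(&producer, NULL, free_memory, &sk) != 0)
        return -1;

    rc = wait_for_memory(&sk);
    pthread_join(producer, NULL);

    pthread_cond_destroy(&sk.wake);
    pthread_mutex_destroy(&sk.lock);
    return rc;
}
```

If the waiter slept while still holding the lock, `free_memory()` would block forever in `pthread_mutex_lock()` and `run_demo()` would never return. Calling `run_demo()` from a small `main()` (built with `-pthread`) is enough to try it.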

Cheers,

Paolo