Re: [PATCH net v3] mptcp: pm: Fix uaf in __timer_delete_sync

From: Paolo Abeni
Date: Mon Sep 09 2024 - 09:07:39 EST


On 9/5/24 14:27, Edward Adam Davis wrote:
There are two paths that reach mptcp_pm_del_add_timer(), which results in a
race condition:

CPU1                            CPU2
====                            ====
net_rx_action
napi_poll                       netlink_sendmsg
__napi_poll                     netlink_unicast
process_backlog                 netlink_unicast_kernel
__netif_receive_skb             genl_rcv
__netif_receive_skb_one_core    netlink_rcv_skb
NF_HOOK                         genl_rcv_msg
ip_local_deliver_finish         genl_family_rcv_msg
ip_protocol_deliver_rcu         genl_family_rcv_msg_doit
tcp_v4_rcv                      mptcp_pm_nl_flush_addrs_doit
tcp_v4_do_rcv                   mptcp_nl_remove_addrs_list
tcp_rcv_established             mptcp_pm_remove_addrs_and_subflows
tcp_data_queue                  remove_anno_list_by_saddr
mptcp_incoming_options          mptcp_pm_del_add_timer
mptcp_pm_del_add_timer          kfree(entry)

In remove_anno_list_by_saddr() (running on CPU2), after leaving the critical
section protected by "pm.lock", the entry is freed, which leads to a
use-after-free (UAF) in mptcp_pm_del_add_timer() (running on CPU1).

Keep a reference to add_timer inside the lock, and call
sk_stop_timer_sync() with this reference instead of "entry->add_timer".

Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Cc: stable@xxxxxxxxxxxxxxx
Reported-and-tested-by: syzbot+f3a31fb909db9b2a5c4d@xxxxxxxxxxxxxxxxxxxxxxxxx
Closes: https://syzkaller.appspot.com/bug?extid=f3a31fb909db9b2a5c4d
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@xxxxxxxxxx>
Signed-off-by: Edward Adam Davis <eadavis@xxxxxx>
---
net/mptcp/pm_netlink.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 3e4ad801786f..7ddb373cc6ad 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -329,17 +329,21 @@ struct mptcp_pm_add_entry *
 mptcp_pm_del_add_timer(struct mptcp_sock *msk,
 		       const struct mptcp_addr_info *addr, bool check_id)
 {
-	struct mptcp_pm_add_entry *entry;
 	struct sock *sk = (struct sock *)msk;
+	struct timer_list *add_timer = NULL;
+	struct mptcp_pm_add_entry *entry;

 	spin_lock_bh(&msk->pm.lock);
 	entry = mptcp_lookup_anno_list_by_saddr(msk, addr);
-	if (entry && (!check_id || entry->addr.id == addr->id))
+	if (entry && (!check_id || entry->addr.id == addr->id)) {
 		entry->retrans_times = ADD_ADDR_RETRANS_MAX;
+		add_timer = &entry->add_timer;
+	}
 	spin_unlock_bh(&msk->pm.lock);

-	if (entry && (!check_id || entry->addr.id == addr->id))
-		sk_stop_timer_sync(sk, &entry->add_timer);
+	/* no lock, because sk_stop_timer_sync() is calling del_timer_sync() */
+	if (add_timer)
+		sk_stop_timer_sync(sk, add_timer);

 	return entry;
 }
@@ -1430,8 +1434,10 @@ static bool remove_anno_list_by_saddr(struct mptcp_sock *msk,
 	entry = mptcp_pm_del_add_timer(msk, addr, false);
 	if (entry) {
+		spin_lock_bh(&msk->pm.lock);
 		list_del(&entry->list);
 		kfree(entry);
+		spin_unlock_bh(&msk->pm.lock);

I'm sorry for the late feedback.

I think this is not enough to fix races for good, i.e.

mptcp_nl_remove_subflow_and_signal_addr() -> mptcp_pm_remove_anno_addr()
-> remove_anno_list_by_saddr()

could race with:

mptcp_pm_remove_addrs() -> remove_anno_list_by_saddr()

and both CPUs could see the same 'entry' returned by
mptcp_pm_del_add_timer().

I think the list_del() in remove_anno_list_by_saddr() should be moved under
the pm lock protection inside mptcp_pm_del_add_timer(), and there is no need
to add spin_lock_bh() around the kfree() call.
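
A minimal userspace sketch of that idea, not the actual kernel change: a
pthread mutex stands in for pm.lock, and struct pm_model, take_entry(),
remove_entry() and demo_remove_twice() are hypothetical names used only for
illustration. The lookup and the unlink happen in one critical section, so a
second remover for the same address should find nothing and only one caller
ends up freeing the entry.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these names do not exist in the kernel. */
struct add_entry {
	struct add_entry *next;
	int addr_id;
};

struct pm_model {
	pthread_mutex_t lock;		/* plays the role of msk->pm.lock */
	struct add_entry *anno_list;
};

/* Look up the entry and unlink it while still holding the lock. */
static struct add_entry *take_entry(struct pm_model *pm, int addr_id)
{
	struct add_entry **pp, *entry = NULL;

	pthread_mutex_lock(&pm->lock);
	for (pp = &pm->anno_list; *pp; pp = &(*pp)->next) {
		if ((*pp)->addr_id == addr_id) {
			entry = *pp;
			*pp = entry->next;	/* unlink under the lock */
			entry->next = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&pm->lock);

	/* In the kernel, the timer would be stopped here, outside the lock. */
	return entry;
}

/* Model of a remover: only the caller that obtained the entry frees it. */
static int remove_entry(struct pm_model *pm, int addr_id)
{
	struct add_entry *entry = take_entry(pm, addr_id);

	if (!entry)
		return 0;
	free(entry);	/* no lock needed: the entry is no longer reachable */
	return 1;
}

/* Build a one-entry list, then run two removers for the same address. */
static int demo_remove_twice(void)
{
	struct pm_model pm;
	struct add_entry *e = calloc(1, sizeof(*e));
	int freed;

	if (!e)
		return -1;
	e->addr_id = 7;
	pthread_mutex_init(&pm.lock, NULL);
	pm.anno_list = e;

	/* Returns the number of removers that actually freed the entry. */
	freed = remove_entry(&pm, 7) + remove_entry(&pm, 7);
	pthread_mutex_destroy(&pm.lock);
	return freed;
}
```

Here the two removers run back to back rather than on two CPUs, but the
ownership rule is the same: whoever unlinks the entry under the lock is the
only one allowed to free it, so the free itself needs no extra locking.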

Thanks,

Paolo