Re: [PATCH] ath11k: workaround firmware bug where peer_id=0

From: Matthew Leach

Date: Mon Mar 30 2026 - 03:58:18 EST


Hello,

Matthew Leach <matthew.leach@xxxxxxxxxxxxx> writes:

> This patch caches the peer enctype during the MSDU processing loop,
> caching it on the first AMSDU sub-frame (is_first_msdu=1
> is_last_msdu=0) and setting the correct enctype for any subsequent
> sub-MSDUs.

I've been working on a patch that addresses the root cause, rather than
patching the flags of incoming frames:

--8<---------------cut here---------------start------------->8---
diff --git a/drivers/net/wireless/ath/ath11k/peer.c b/drivers/net/wireless/ath/ath11k/peer.c
index 6d0126c39301..98348ccfdfbe 100644
--- a/drivers/net/wireless/ath/ath11k/peer.c
+++ b/drivers/net/wireless/ath/ath11k/peer.c
@@ -347,7 +347,7 @@ static int __ath11k_peer_delete(struct ath11k *ar, u32 vdev_id, const u8 *addr)
 	return 0;
 }

-int ath11k_peer_delete(struct ath11k *ar, u32 vdev_id, u8 *addr)
+int ath11k_peer_delete(struct ath11k *ar, u32 vdev_id, const u8 *addr)
 {
 	int ret;

@@ -372,7 +372,7 @@ int ath11k_peer_create(struct ath11k *ar, struct ath11k_vif *arvif,
 {
 	struct ath11k_peer *peer;
 	struct ath11k_sta *arsta;
-	int ret, fbret;
+	int ret, fbret, retries = 3;

 	lockdep_assert_held(&ar->conf_mutex);

@@ -400,6 +400,8 @@ int ath11k_peer_create(struct ath11k *ar, struct ath11k_vif *arvif,
 	spin_unlock_bh(&ar->ab->base_lock);
 	mutex_unlock(&ar->ab->tbl_mtx_lock);

+retry:
+
 	ret = ath11k_wmi_send_peer_create_cmd(ar, param);
 	if (ret) {
 		ath11k_warn(ar->ab,
@@ -427,6 +429,18 @@ int ath11k_peer_create(struct ath11k *ar, struct ath11k_vif *arvif,
 		goto cleanup;
 	}

+	if (!peer->peer_id) {
+		if (retries--) {
+			spin_unlock_bh(&ar->ab->base_lock);
+			mutex_unlock(&ar->ab->tbl_mtx_lock);
+			ath11k_peer_delete(ar, param->vdev_id, param->peer_addr);
+			goto retry;
+		} else {
+			ath11k_warn(ar->ab, "Null peer workaround failed for peer %pM, adding anyway",
+				    param->peer_addr);
+		}
+	}
+
 	ret = ath11k_peer_rhash_add(ar->ab, peer);
 	if (ret) {
 		spin_unlock_bh(&ar->ab->base_lock);
diff --git a/drivers/net/wireless/ath/ath11k/peer.h b/drivers/net/wireless/ath/ath11k/peer.h
index 3ad2f3355b14..6325c4d157c7 100644
--- a/drivers/net/wireless/ath/ath11k/peer.h
+++ b/drivers/net/wireless/ath/ath11k/peer.h
@@ -47,7 +47,7 @@ struct ath11k_peer *ath11k_peer_find_by_addr(struct ath11k_base *ab,
const u8 *addr);
struct ath11k_peer *ath11k_peer_find_by_id(struct ath11k_base *ab, int peer_id);
void ath11k_peer_cleanup(struct ath11k *ar, u32 vdev_id);
-int ath11k_peer_delete(struct ath11k *ar, u32 vdev_id, u8 *addr);
+int ath11k_peer_delete(struct ath11k *ar, u32 vdev_id, const u8 *addr);
int ath11k_peer_create(struct ath11k *ar, struct ath11k_vif *arvif,
struct ieee80211_sta *sta, struct peer_create_params *param);
int ath11k_wait_for_peer_delete_done(struct ath11k *ar, u32 vdev_id,
--8<---------------cut here---------------end--------------->8---

This patch detects the error condition at the point where the firmware's
reply to a peer map request is received. If the firmware maps the peer
with peer_id=0, we ask it to unmap that peer and map it again, hoping it
selects a non-zero peer_id. We retry up to three times, after which we
give up and let the peer be mapped with an ID of 0.
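
For illustration only, the retry policy described above can be sketched
as a standalone C function. This is not driver code: struct fake_fw,
fake_fw_map(), fake_fw_unmap() and map_peer_avoiding_zero() are
hypothetical names that stand in for the firmware map/unmap exchange,
and the locking around the real peer lookup is left out.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the firmware: replays a scripted list of IDs. */
struct fake_fw {
	const int *ids;	/* peer_id returned by each successive map request */
	int n_ids;	/* number of entries in ids */
	int next;	/* index of the next ID to hand out */
	int unmaps;	/* number of unmap requests seen so far */
};

/* Simulated peer map request: returns the next scripted peer_id. */
static int fake_fw_map(struct fake_fw *fw)
{
	assert(fw->next < fw->n_ids);
	return fw->ids[fw->next++];
}

/* Simulated peer unmap (delete) request: only counts the call. */
static void fake_fw_unmap(struct fake_fw *fw)
{
	fw->unmaps++;
}

/*
 * Map a peer. While the firmware answers with peer_id 0, unmap and map
 * again, at most max_retries times. Returns the final peer_id, which is
 * still 0 if every attempt came back as 0.
 */
static int map_peer_avoiding_zero(struct fake_fw *fw, int max_retries)
{
	int retries = max_retries;
	int peer_id;

	assert(max_retries >= 0);

	for (;;) {
		peer_id = fake_fw_map(fw);
		if (peer_id != 0)
			return peer_id;

		/* Same budget as "if (retries--)" in the patch. */
		if (retries-- == 0)
			break;

		fake_fw_unmap(fw);
	}

	fprintf(stderr, "null peer workaround failed, adding anyway\n");
	return 0;
}
```

With a budget of three retries, the worst case in this sketch is four
map requests and three unmap requests before the peer is accepted with
an ID of 0.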

This patch addresses the root cause, but I think it's more invasive. I'd
appreciate comments on which approach upstream would prefer. If the
preference is for the approach above, I'll send out a v2.

Regards,
--
Matt