Re: [PATCH] ath11k: workaround firmware bug where peer_id=0

From: Baochen Qiang

Date: Tue Apr 14 2026 - 03:07:58 EST

On 3/30/2026 3:57 PM, Matthew Leach wrote:
> Hello,
>
> Matthew Leach <matthew.leach@xxxxxxxxxxxxx> writes:
>
>> This patch caches the peer enctype during the MSDU processing loop,
>> caching it on the first AMSDU sub-frame (is_first_msdu=1
>> is_last_msdu=0) and setting the correct enctype for any subsequent
>> sub-MSDUs.
>
> I've been looking at creating a patch that addresses the root cause,
> rather than patching incoming frame's flags:
>
> --8<---------------cut here---------------start------------->8---
> diff --git a/drivers/net/wireless/ath/ath11k/peer.c b/drivers/net/wireless/ath/ath11k/peer.c
> index 6d0126c39301..98348ccfdfbe 100644
> --- a/drivers/net/wireless/ath/ath11k/peer.c
> +++ b/drivers/net/wireless/ath/ath11k/peer.c
> @@ -347,7 +347,7 @@ static int __ath11k_peer_delete(struct ath11k *ar, u32 vdev_id, const u8 *addr)
>  	return 0;
>  }
>
> -int ath11k_peer_delete(struct ath11k *ar, u32 vdev_id, u8 *addr)
> +int ath11k_peer_delete(struct ath11k *ar, u32 vdev_id, const u8 *addr)
>  {
>  	int ret;
>
> @@ -372,7 +372,7 @@ int ath11k_peer_create(struct ath11k *ar, struct ath11k_vif *arvif,
>  {
>  	struct ath11k_peer *peer;
>  	struct ath11k_sta *arsta;
> -	int ret, fbret;
> +	int ret, fbret, retries = 3;
>
>  	lockdep_assert_held(&ar->conf_mutex);
>
> @@ -400,6 +400,8 @@ int ath11k_peer_create(struct ath11k *ar, struct ath11k_vif *arvif,
>  	spin_unlock_bh(&ar->ab->base_lock);
>  	mutex_unlock(&ar->ab->tbl_mtx_lock);
>
> +retry:
> +
>  	ret = ath11k_wmi_send_peer_create_cmd(ar, param);
>  	if (ret) {
>  		ath11k_warn(ar->ab,
> @@ -427,6 +429,18 @@ int ath11k_peer_create(struct ath11k *ar, struct ath11k_vif *arvif,
>  		goto cleanup;
>  	}
>
> +	if (!peer->peer_id) {
> +		if (retries--) {
> +			spin_unlock_bh(&ar->ab->base_lock);
> +			mutex_unlock(&ar->ab->tbl_mtx_lock);
> +			ath11k_peer_delete(ar, param->vdev_id, param->peer_addr);
> +			goto retry;
> +		} else {
> +			ath11k_warn(ar->ab, "Null peer workaround failed for peer %pM, adding anyway",
> +				    param->peer_addr);
> +		}
> +	}
> +
>  	ret = ath11k_peer_rhash_add(ar->ab, peer);
>  	if (ret) {
>  		spin_unlock_bh(&ar->ab->base_lock);
> diff --git a/drivers/net/wireless/ath/ath11k/peer.h b/drivers/net/wireless/ath/ath11k/peer.h
> index 3ad2f3355b14..6325c4d157c7 100644
> --- a/drivers/net/wireless/ath/ath11k/peer.h
> +++ b/drivers/net/wireless/ath/ath11k/peer.h
> @@ -47,7 +47,7 @@ struct ath11k_peer *ath11k_peer_find_by_addr(struct ath11k_base *ab,
>  					     const u8 *addr);
> struct ath11k_peer *ath11k_peer_find_by_id(struct ath11k_base *ab, int peer_id);
> void ath11k_peer_cleanup(struct ath11k *ar, u32 vdev_id);
> -int ath11k_peer_delete(struct ath11k *ar, u32 vdev_id, u8 *addr);
> +int ath11k_peer_delete(struct ath11k *ar, u32 vdev_id, const u8 *addr);
> int ath11k_peer_create(struct ath11k *ar, struct ath11k_vif *arvif,
>  		       struct ieee80211_sta *sta, struct peer_create_params *param);
> int ath11k_wait_for_peer_delete_done(struct ath11k *ar, u32 vdev_id,
> --8<---------------cut here---------------end--------------->8---
>
> This patch detects the error condition at the point where a peer map
> request reply is received from the firmware. If the firmware maps with
> peer_id=0, we request that the firmware unmap that peer and map again,
> hoping it selects a peer_id!=0. We attempt this up to three times, at
> which point we give up and let the peer be mapped with an ID of 0.
>
> This patch addresses the root cause, but I think it's more invasive. I'd
> appreciate some comments as to which approach upstream would prefer. If
> the preference is for the above, I'll send out a v2.

For chips like QCA2066 and WCN6855, 0 is a valid peer_id. For chips like QCN9074 it is
not.

So a possible fix would be to add a per-chip hardware op: for QCN9074 the op keeps the
existing validation against 0, while for QCA2066 the op is a null func. Or, even
simpler, we can remove the validation for all chips.
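
A rough sketch of the ops-based option, written as standalone C rather than as a patch
against ath11k. All names here (sketch_hw_ops, peer_id_valid, sketch_check_peer_id and
the two ops tables) are made up for illustration and are not existing ath11k symbols.
The u16 peer_id type and the single call site are assumptions as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-chip ops table; NOT the real struct ath11k_hw_ops. */
struct sketch_hw_ops {
	/* Returns true if the firmware-assigned peer_id is acceptable. */
	bool (*peer_id_valid)(uint16_t peer_id);
};

/* QCN9074-like chips: keep the existing check, peer_id 0 is rejected. */
static bool sketch_peer_id_valid_nonzero(uint16_t peer_id)
{
	return peer_id != 0;
}

/* QCA2066/WCN6855-like chips: 0 is a valid peer_id, so accept any value. */
static bool sketch_peer_id_valid_any(uint16_t peer_id)
{
	(void)peer_id;
	return true;
}

static const struct sketch_hw_ops sketch_qcn9074_ops = {
	.peer_id_valid = sketch_peer_id_valid_nonzero,
};

static const struct sketch_hw_ops sketch_wcn6855_ops = {
	.peer_id_valid = sketch_peer_id_valid_any,
};

/* Hypothetical call site, standing in for wherever the 0 check lives today. */
static bool sketch_check_peer_id(const struct sketch_hw_ops *ops,
				 uint16_t peer_id)
{
	return ops->peer_id_valid(peer_id);
}
```

With this shape, a frame carrying peer_id=0 would still be rejected when the QCN9074
table is selected and accepted when the WCN6855 table is selected. Removing the
validation for all chips would amount to using the "any" variant everywhere.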

>
> Regards,