Re: [PATCH] wifi: ath11k: fix rx completion meta data corruption

From: Johan Hovold
Date: Mon Mar 24 2025 - 03:50:01 EST


On Sun, Mar 23, 2025 at 11:15:54PM -0700, Clayton Craft wrote:
> On 3/21/25 07:53, Johan Hovold wrote:
> > Add the missing memory barrier to make sure that the REO dest ring
> > descriptor is read after the head pointer to avoid using stale data on
> > weakly ordered architectures like aarch64.
> >
> > This may fix the ring-buffer corruption worked around by commit
> > f9fff67d2d7c ("wifi: ath11k: Fix SKB corruption in REO destination
> > ring") by silently discarding data, and may possibly also address user
> > reported errors like:
> >
> > ath11k_pci 0006:01:00.0: msdu_done bit in attention is not set
> >
> > Tested-on: WCN6855 hw2.1 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
> >
> > Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
> > Cc: stable@xxxxxxxxxxxxxxx # 5.6
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218005
> > Signed-off-by: Johan Hovold <johan+linaro@xxxxxxxxxx>
> > ---
> >
> > As I reported here:
> >
> > https://lore.kernel.org/lkml/Z9G5zEOcTdGKm7Ei@xxxxxxxxxxxxxxxxxxxx/
> >
> > the ath11k and ath12k drivers appear to be missing a number of memory
> > barriers that are required on weakly ordered architectures like
> > aarch64 to avoid memory corruption issues.
> >
> > Here's a fix for one more such case which people already seem to be
> > hitting.
> >
> > Note that I've also seen one "msdu_done bit not set" warning with
> > this patch applied, so whether it helps with that at all remains to
> > be seen. I'm CCing Jens and Steev, who see these warnings frequently
> > and may be able to help out with testing.
>
> Before this patch I was seeing this "msdu_done bit" message about 40
> times per hour on average, e.g. a recent 43-hour uptime logged 1600 of
> them. I've been testing this patch for about 10 hours now, connected
> to the same network etc., and haven't seen the "msdu_done bit" message
> once. So, even if it's not completely resolving this for everyone, it
> seems to be a huge improvement for me.
>
> 0006:01:00.0 Network controller: Qualcomm Technologies, Inc QCNFA765 Wireless Network Adapter (rev 01)
> ath11k_pci 0006:01:00.0: chip_id 0x2 chip_family 0xb board_id 0x8c soc_id 0x400c0210
> ath11k_pci 0006:01:00.0: fw_version 0x11088c35 fw_build_timestamp 2024-04-17 08:34 fw_build_id WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
>
> Tested-by: Clayton Craft <clayton@xxxxxxxxxxxxx>

Thanks for testing and confirming my suspicion.

Johan
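
For anyone following along, the ordering requirement described in the
commit message (read the descriptor only after the head pointer) can be
sketched in userspace with C11 atomics. This is an analogy under stated
assumptions, not the patch itself: struct ring, struct desc, RING_SIZE,
ring_init(), ring_publish() and ring_reap() are all hypothetical names,
and the acquire load stands in for the read barrier a driver would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical ring depth */

static_assert((RING_SIZE & (RING_SIZE - 1)) == 0,
	      "RING_SIZE must be a power of two");

/* Hypothetical completion descriptor; the fields are illustrative only. */
struct desc {
	uint32_t cookie;
	uint32_t len;
};

struct ring {
	struct desc entries[RING_SIZE];
	_Atomic uint32_t head;	/* advanced by the producer ("device") */
	_Atomic uint32_t tail;	/* advanced by the consumer ("driver") */
};

void ring_init(struct ring *r)
{
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
}

/* Producer: fill the descriptor first, then publish it via head. */
int ring_publish(struct ring *r, uint32_t cookie, uint32_t len)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail >= RING_SIZE)
		return -1;	/* ring full */

	r->entries[head % RING_SIZE].cookie = cookie;
	r->entries[head % RING_SIZE].len = len;

	/* Release: the descriptor writes are visible before the new head. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return 0;
}

/* Consumer: returns 1 and copies a descriptor to *out, or 0 if empty. */
int ring_reap(struct ring *r, struct desc *out)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/*
	 * Acquire: the descriptor read below cannot be reordered before
	 * this head read. With a relaxed load here, a weakly ordered CPU
	 * could observe the new head but stale descriptor contents.
	 */
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (tail == head)
		return 0;

	*out = r->entries[tail % RING_SIZE];

	/* Release: hand the slot back only after the copy is done. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return 1;
}
```

In kernel code the same "head pointer first, descriptor second" ordering
is normally expressed with an explicit read barrier such as dma_rmb()
between the two reads rather than with C11 atomics.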