Re: [PATCH net-next] net: mana: Force full-page RX buffers for 4K page size on specific systems.
From: Paolo Abeni
Date: Tue Mar 03 2026 - 06:01:24 EST
On 2/27/26 11:15 AM, Dipayaan Roy wrote:
> On certain systems configured with 4K PAGE_SIZE, utilizing page_pool
> fragments for RX buffers results in a significant throughput regression.
> Profiling reveals that this regression correlates with high overhead in the
> fragment allocation and reference counting paths on these specific
> platforms, rendering the multi-buffer-per-page strategy counterproductive.
>
> To mitigate this, bypass the page_pool fragment path and force a single RX
> packet per page allocation when all the following conditions are met:
> 1. The system is configured with a 4K PAGE_SIZE.
> 2. A processor-specific quirk is detected via SMBIOS Type 4 data.
>
> This approach restores expected line-rate performance by ensuring
> predictable RX refill behavior on affected hardware.
>
> There is no behavioral change for systems using larger page sizes
> (16K/64K), or platforms where this processor-specific quirk does not
> apply.
>
> Signed-off-by: Dipayaan Roy <dipayanroy@xxxxxxxxxxxxxxxxxxx>
> ---
> .../net/ethernet/microsoft/mana/gdma_main.c | 120 ++++++++++++++++++
> drivers/net/ethernet/microsoft/mana/mana_en.c | 23 +++-
> include/net/mana/gdma.h | 10 ++
> 3 files changed, 151 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 0055c231acf6..26bbe736a770 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -9,6 +9,7 @@
> #include <linux/msi.h>
> #include <linux/irqdomain.h>
> #include <linux/export.h>
> +#include <linux/dmi.h>
>
> #include <net/mana/mana.h>
> #include <net/mana/hw_channel.h>
> @@ -1955,6 +1956,115 @@ static bool mana_is_pf(unsigned short dev_id)
> return dev_id == MANA_PF_DEVICE_ID;
> }
>
> +/*
> + * Table of Processor Version strings, as reported in SMBIOS Type 4
> + * information, for processors that need the single-RX-buffer-per-page
> + * quirk to meet line-rate performance with ARM64 + 4K pages.
> + * Note: These strings are matched exactly against the version fetched from SMBIOS.
> + */
> +static const char * const mana_single_rxbuf_per_page_quirk_tbl[] = {
> + "Cobalt 200",
> +};
> +
> +static const char *smbios_get_string(const struct dmi_header *hdr, u8 idx)
> +{
> + const u8 *start, *end;
> + u8 i;
> +
> + /* Indexing starts from 1. */
> + if (!idx)
> + return NULL;
> +
> + start = (const u8 *)hdr + hdr->length;
> + end = start + SMBIOS_STR_AREA_MAX;
> +
> + for (i = 1; i < idx; i++) {
> + while (start < end && *start)
> + start++;
> + if (start < end)
> + start++;
> + if (start + 1 < end && start[0] == 0 && start[1] == 0)
> + return NULL;
> + }
> +
> + if (start >= end || *start == 0)
> + return NULL;
> +
> + return (const char *)start;

If I read correctly, the above more or less duplicates dmi_decode_table().
I think you are better off:
- using the mana_get_proc_ver_from_smbios() decoder to store the
  SMBIOS_TYPE4_PROC_VERSION_OFFSET index into gd
- doing a 2nd walk with a different decoder to fetch the string at the
  specified index.
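
Roughly the split I have in mind, as a standalone userspace sketch rather
than driver code (untested against the driver). The function names, the
0x1a record length and the sample strings are made up for illustration;
only the 0x10 Processor Version offset comes from the SMBIOS Type 4
layout. In the driver each pass would presumably be its own dmi_walk()
callback:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of the Processor Version string-index byte in a Type 4 record. */
#define TYPE4_PROC_VERSION_OFFSET 0x10

/* Pass 1: only record WHICH string number holds the Processor Version. */
static uint8_t type4_proc_version_index(const uint8_t *rec)
{
	assert(rec);
	/* rec[0] is the structure type, rec[1] the formatted-area length. */
	if (rec[0] != 4 || rec[1] <= TYPE4_PROC_VERSION_OFFSET)
		return 0;	/* not Type 4, or formatted area too short */
	return rec[TYPE4_PROC_VERSION_OFFSET];
}

/*
 * Pass 2: return string number idx (1-based) from the string-set that
 * follows the formatted area. 'end' bounds the whole record, so the walk
 * never reads past the data that was actually provided.
 */
static const char *smbios_string_at(const uint8_t *rec, const uint8_t *end,
				    uint8_t idx)
{
	const uint8_t *s, *nul;

	assert(rec && end);
	if (!idx || end - rec < 4 || rec[1] > end - rec)
		return NULL;

	s = rec + rec[1];
	while (--idx) {
		/* An empty string marks the end of the string-set. */
		if (s >= end || !*s)
			return NULL;
		nul = memchr(s, 0, (size_t)(end - s));
		if (!nul)
			return NULL;
		s = nul + 1;
	}

	if (s >= end || !*s || !memchr(s, 0, (size_t)(end - s)))
		return NULL;
	return (const char *)s;
}

/* Hypothetical sample record: Type 4, three strings, version is string 3. */
static size_t build_sample_type4(uint8_t *buf, size_t cap)
{
	/* sizeof() counts the implicit final NUL, giving the double-NUL end. */
	static const char strings[] = "CPU0\0VendorX\0Cobalt 200\0";
	size_t total = 0x1a + sizeof(strings);

	if (cap < total)
		return 0;
	memset(buf, 0, total);
	buf[0] = 4;				/* type */
	buf[1] = 0x1a;				/* formatted-area length */
	buf[TYPE4_PROC_VERSION_OFFSET] = 3;	/* string number */
	memcpy(buf + 0x1a, strings, sizeof(strings));
	return total;
}
```

The point of the split is that pass 1 stays trivial (one byte read), and
the bounded string lookup lives in one place instead of being open-coded
next to the quirk table.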
/P