Re: [PATCH net-next] net: mana: Force full-page RX buffers for 4K page size on specific systems.

From: Dipayaan Roy

Date: Fri Mar 06 2026 - 08:13:31 EST


On Tue, Mar 03, 2026 at 11:56:29AM +0100, Paolo Abeni wrote:
> On 2/27/26 11:15 AM, Dipayaan Roy wrote:
> > On certain systems configured with 4K PAGE_SIZE, utilizing page_pool
> > fragments for RX buffers results in a significant throughput regression.
> > Profiling reveals that this regression correlates with high overhead in the
> > fragment allocation and reference counting paths on these specific
> > platforms, rendering the multi-buffer-per-page strategy counterproductive.
> >
> > To mitigate this, bypass the page_pool fragment path and force a single RX
> > packet per page allocation when all the following conditions are met:
> > 1. The system is configured with a 4K PAGE_SIZE.
> > 2. A processor-specific quirk is detected via SMBIOS Type 4 data.
> >
> > This approach restores expected line-rate performance by ensuring
> > predictable RX refill behavior on affected hardware.
> >
> > There is no behavioral change for systems using larger page sizes
> > (16K/64K), or platforms where this processor-specific quirk do not
> > apply.
> >
> > Signed-off-by: Dipayaan Roy <dipayanroy@xxxxxxxxxxxxxxxxxxx>
> > ---
> > .../net/ethernet/microsoft/mana/gdma_main.c | 120 ++++++++++++++++++
> > drivers/net/ethernet/microsoft/mana/mana_en.c | 23 +++-
> > include/net/mana/gdma.h | 10 ++
> > 3 files changed, 151 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > index 0055c231acf6..26bbe736a770 100644
> > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > @@ -9,6 +9,7 @@
> > #include <linux/msi.h>
> > #include <linux/irqdomain.h>
> > #include <linux/export.h>
> > +#include <linux/dmi.h>
> >
> > #include <net/mana/mana.h>
> > #include <net/mana/hw_channel.h>
> > @@ -1955,6 +1956,115 @@ static bool mana_is_pf(unsigned short dev_id)
> > return dev_id == MANA_PF_DEVICE_ID;
> > }
> >
> > +/*
> > + * Table for Processor Version strings found from SMBIOS Type 4 information,
> > + * for processors that needs to force single RX buffer per page quirk for
> > + * meeting line rate performance with ARM64 + 4K pages.
> > + * Note: These strings are exactly matched with version fetched from SMBIOS.
> > + */
> > +static const char * const mana_single_rxbuf_per_page_quirk_tbl[] = {
> > + "Cobalt 200",
> > +};
> > +
> > +static const char *smbios_get_string(const struct dmi_header *hdr, u8 idx)
> > +{
> > + const u8 *start, *end;
> > + u8 i;
> > +
> > + /* Indexing starts from 1. */
> > + if (!idx)
> > + return NULL;
> > +
> > + start = (const u8 *)hdr + hdr->length;
> > + end = start + SMBIOS_STR_AREA_MAX;
> > +
> > + for (i = 1; i < idx; i++) {
> > + while (start < end && *start)
> > + start++;
> > + if (start < end)
> > + start++;
> > + if (start + 1 < end && start[0] == 0 && start[1] == 0)
> > + return NULL;
> > + }
> > +
> > + if (start >= end || *start == 0)
> > + return NULL;
> > +
> > + return (const char *)start;
>
> If I read correctly, the above sort of duplicate dmi_decode_table().
>
Yes, it's not exported.

> I think you are better of:
> - use the mana_get_proc_ver_from_smbios() decoder to store the
> SMBIOS_TYPE4_PROC_VERSION_OFFSET index into gd
> - do a 2nd walk with a different decoder to fetch the string at the
> specified index.
Sure, will implement the 2nd walk for fetching the string in v2.
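
For illustration only, the string lookup that the second walk needs can be
sketched in plain user-space C as below. The helper name smbios_string_at()
and its caller-supplied end bound are hypothetical and are not taken from the
patch or from the DMI core; the sketch assumes the usual SMBIOS layout, where
the string-set following the formatted area is a sequence of NUL-terminated
strings closed by one extra NUL byte, with indices starting at 1.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): return the idx-th (1-based)
 * string of an SMBIOS string-set starting at 'strs' and bounded by 'end'
 * (exclusive). Returns NULL if idx is 0, if idx is past the last string,
 * or if a string is not terminated before 'end'.
 */
static const char *smbios_string_at(const uint8_t *strs, const uint8_t *end,
				    uint8_t idx)
{
	const uint8_t *p = strs;
	const uint8_t *q;

	/* Index 0 means "no string" in SMBIOS. */
	if (!idx)
		return NULL;

	/* An empty string (*p == 0) marks the end of the string-set. */
	while (p < end && *p) {
		/* Locate the terminator of the current string within bounds. */
		q = p;
		while (q < end && *q)
			q++;
		if (q == end)
			return NULL;	/* unterminated: treat as malformed */

		if (--idx == 0)
			return (const char *)p;

		p = q + 1;	/* step over the NUL to the next string */
	}

	return NULL;
}
```

Unlike the loop in v1, this checks the terminator of every string against the
bound before returning it, so the caller never gets a pointer to a string
that runs past the end of the area.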

>
> /P

Thank you, Paolo, for the comments, and apologies for the delay in responding; I am on-call this week.
I will send out v2 with the suggested changes.

Regards