Re: [PATCH net] ixgbe: allow to increase MTU to some extent with XDP enabled
From: Jason Xing
Date: Thu Jan 26 2023 - 23:17:00 EST
On Thu, Jan 26, 2023 at 11:56 PM Alexander Lobakin
<alexandr.lobakin@xxxxxxxxx> wrote:
>
> From: Maciej Fijalkowski <maciej.fijalkowski@xxxxxxxxx>
> Date: Thu, 26 Jan 2023 13:17:20 +0100
>
> > On Sat, Jan 21, 2023 at 04:55:21PM +0800, Jason Xing wrote:
> >> From: Jason Xing <kernelxing@xxxxxxxxxxx>
> >>
> >> I encountered a case where I cannot increase the MTU size with XDP
> >> enabled if the server is equipped with an IXGBE card, which happened on
> >> thousands of servers. MTU changes with XDP were prohibited in 2017 [1];
> >> soon afterwards they were allowed again, subject to size checks [2].
> >>
> >> Interesting part goes like this:
> >> 1) Changing MTU directly from 1500 (default value) to 2000 doesn't
> >> work because the driver finds out that 'new_frame_size >
> >> ixgbe_rx_bufsz(ring)' in ixgbe_change_mtu() function.
> >> 2) However, if we change MTU to 1501 and then from 1501 to 2000, it
> >> does work, because the driver sets __IXGBE_RX_3K_BUFFER when the MTU
> >> is changed to 1501, which the later size check then allows.
> >>
> >> The default MTU value for most servers is 1500, which cannot be adjusted
> >> directly to a value larger than IXGBE_MAX_2K_FRAME_BUILD_SKB (1534 or
> >> 1536) when an XDP program is loaded.
> >>
> >> After a quick study of how the i40e driver allows two buffer sizes
> >> (2048 and 3072) to support XDP mode in i40e_max_xdp_frame_size(), I
> >> believe the default MTU size may not be sufficient in XDP mode when the
> >> IXGBE driver is in use, since we sometimes need to insert a new header,
> >> say, a vxlan header. So setting the 3K-buffer flag could solve the
> >> issue.
> >>
> >> [1] commit 38b7e7f8ae82 ("ixgbe: Do not allow LRO or MTU change with XDP")
> >> [2] commit fabf1bce103a ("ixgbe: Prevent unsupported configurations with
> >> XDP")
> >>
> >> Signed-off-by: Jason Xing <kernelxing@xxxxxxxxxxx>
> >> ---
> >> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 +++
> >> 1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> >> index ab8370c413f3..dc016582f91e 100644
> >> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> >> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> >> @@ -4313,6 +4313,9 @@ static void ixgbe_set_rx_buffer_len(struct ixgbe_adapter *adapter)
> >> if (IXGBE_2K_TOO_SMALL_WITH_PADDING ||
> >> (max_frame > (ETH_FRAME_LEN + ETH_FCS_LEN)))
> >> set_bit(__IXGBE_RX_3K_BUFFER, &rx_ring->state);
> >> +
> >> + if (ixgbe_enabled_xdp_adapter(adapter))
> >> + set_bit(__IXGBE_RX_3K_BUFFER, &rx_ring->state);
> >
> > This will result in unnecessary overhead for 1500 MTU because you will
> > be working on order-1 pages. Instead I would focus on fixing
> > ixgbe_change_mtu() and stop relying on ixgbe_rx_bufsz() in there. You can
> > check what we do on ice/i40e sides.
Well, now I see commit 23b44513c3e6f from 2019. Thanks, Maciej.
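
To make the path dependence from my commit message concrete, here is a
small userspace model of the current check. It is only a sketch, not
driver code: all model_* names are made up for illustration, the
constants assume 4K pages with build_skb in use (I picked 1536 for
IXGBE_MAX_2K_FRAME_BUILD_SKB), the 22-byte overhead in the check is my
assumption about what ixgbe_change_mtu() adds, and
IXGBE_2K_TOO_SMALL_WITH_PADDING is taken to be false.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for a 4K-page system; not taken from the driver headers. */
#define MODEL_2K_BUILD_SKB  1536 /* thread says 1534 or 1536 */
#define MODEL_RXBUFFER_3K   3072
#define MODEL_ETH_HLEN      14
#define MODEL_ETH_FCS_LEN   4
#define MODEL_VLAN_HLEN     4
#define MODEL_ETH_FRAME_LEN 1514

/* Hypothetical stand-in for an Rx ring; the bool plays the role of
 * the __IXGBE_RX_3K_BUFFER state bit. */
struct model_ring {
	bool uses_3k_buffer;
};

/* Buffer size as seen by the MTU check: it depends on ring state. */
static int model_rx_bufsz(const struct model_ring *ring)
{
	return ring->uses_3k_buffer ? MODEL_RXBUFFER_3K : MODEL_2K_BUILD_SKB;
}

/* Mirrors 'max_frame > (ETH_FRAME_LEN + ETH_FCS_LEN)' from the hunk. */
static void model_set_rx_buffer_len(struct model_ring *ring, int mtu)
{
	int max_frame = mtu + MODEL_ETH_HLEN + MODEL_ETH_FCS_LEN;

	ring->uses_3k_buffer =
		max_frame > (MODEL_ETH_FRAME_LEN + MODEL_ETH_FCS_LEN);
}

/* Returns 0 on success, -1 if the request is rejected. The check is
 * made against the buffer size chosen for the *old* MTU. */
static int model_change_mtu_with_xdp(struct model_ring *ring, int *mtu,
				     int new_mtu)
{
	int new_frame_size = new_mtu + MODEL_ETH_HLEN + MODEL_ETH_FCS_LEN +
			     MODEL_VLAN_HLEN;

	assert(new_mtu > 0);

	if (new_frame_size > model_rx_bufsz(ring))
		return -1;

	*mtu = new_mtu;
	model_set_rx_buffer_len(ring, new_mtu);
	return 0;
}
```

Starting from MTU 1500 with the flag clear, a request for 2000 is
rejected in this model, while 1500 -> 1501 -> 2000 is accepted, because
the first step flips the ring to 3K buffers. That is the reason not to
rely on ixgbe_rx_bufsz() for this check.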
> >
> > I'm not looking actively into ixgbe internals but I don't think that there
> > is anything that stops us from using 3k buffers with XDP.
>
> I think it uses the same logic as the rest of the drivers: splits a 4k
> page into two 2k buffers when MTU is <= 1536, otherwise uses order-1
> pages and uses 3k buffers.
>
> OTOH ixgbe is not fully correct in terms of how it calculates Rx headroom,
> but the main problem is how it calculates the maximum MTU available when
> XDP is on. Our usual MTU supported when XDP is on is 3046 bytes.
> For MTU <= 1536, 2k buffers are used even for XDP, so the fix is not
> correct. Maciej is right that i40e and ice do that much better and don't
> have such an issue.
Thank you for the detailed explanation. And yes, I checked this part
of the ice/i40e drivers: they introduce ice_max_xdp_frame_size() and
i40e_max_xdp_frame_size() to check whether the MTU can be changed while
an XDP program is loaded.
I will rewrite the patch to follow the i40e/ice approach in the next
submission.
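
Roughly the direction I have in mind, written as a standalone userspace
sketch rather than the actual patch. The sketch_* names are
hypothetical, the "legacy Rx or large pages means 2K" condition is my
reading of the i40e helper and should be treated as an assumption, and
the 26-byte header pad (ETH_HLEN + ETH_FCS_LEN + 2 * VLAN_HLEN) is
assumed as well; it is chosen because it agrees with the 3046-byte
limit Olek mentioned (3072 - 26).

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_RXBUFFER_2K 2048
#define SKETCH_RXBUFFER_3K 3072
/* Assumed: ETH_HLEN + ETH_FCS_LEN + 2 * VLAN_HLEN */
#define SKETCH_PKT_HDR_PAD (14 + 4 + 2 * 4)

/* Largest frame XDP can handle, derived from configuration only,
 * not from the buffer size the rings currently happen to use. */
static int sketch_max_xdp_frame_size(long page_size, bool legacy_rx)
{
	assert(page_size > 0);

	if (page_size >= 8192 || legacy_rx)
		return SKETCH_RXBUFFER_2K;

	return SKETCH_RXBUFFER_3K;
}

/* True if new_mtu would be accepted while an XDP program is loaded. */
static bool sketch_xdp_mtu_ok(int new_mtu, long page_size, bool legacy_rx)
{
	int new_frame_size = new_mtu + SKETCH_PKT_HDR_PAD;

	return new_frame_size <=
	       sketch_max_xdp_frame_size(page_size, legacy_rx);
}
```

With 4K pages and no legacy Rx this accepts 1500 -> 2000 directly and
caps the MTU at 3046, independent of the current MTU, and it leaves the
1500 MTU case on 2K buffers.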
Thanks,
Jason
>
> >
> >> #endif
> >> }
> >> }
> >> --
> >> 2.37.3
> >>
>
> Thanks,
> Olek