bridge:fragmented packets dropped by bridge

From: Vyacheslav Salnikov
Date: Fri Oct 07 2022 - 07:21:38 EST


Hi.

I switched from kernel versions 4.9 to 5.15 and found that the MTU on
the interfaces in the bridge does not change.
For example:
I have the following bridge:
bridge interface
br0 sw1
sw2
sw3

And I change with ifconfig MTU.
I see that br0 sw1..sw3 has changed MTU from 1500 -> 1982.

But if i send a ping through these interfaces, I get 1500(I added
prints for output)
I investigated the code and found the reason:
The following commit came in the new kernel:
https://github.com/torvalds/linux/commit/ac6627a28dbfb5d96736544a00c3938fa7ea6dfb

And the behavior of the MTU setting has changed:
>
> Kernel 4.9:
> if (net->ipv4.sysctl_ip_fwd_use_pmtu ||
> ip_mtu_locked(dst) ||
> !forwarding) <--- True
> return dst_mtu(dst) <--- 1982
>
>
> / 'forwarding = true' case should always honour route mtu /
> mtu = dst_metric_raw(dst, RTAX_MTU);
> if (mtu)
> return mtu;



Kernel 5.15:
>
> if (READ_ONCE(net->ipv4.sysctl_ip_fwd_use_pmtu) ||
> ip_mtu_locked(dst) ||
> !forwarding) { <--- True
> mtu = rt->rt_pmtu; <--- 0
> if (mtu && time_before(jiffies, rt->dst.expires)) <-- False
> goto out;
> }
>
> / 'forwarding = true' case should always honour route mtu /
> mtu = dst_metric_raw(dst, RTAX_MTU); <---- 1500
> if (mtu) <--- True
> goto out;

As I see from the code in the end takes mtu from br_dst_default_metrics
> static const u32 br_dst_default_metrics[RTAX_MAX] = {
> [RTAX_MTU - 1] = 1500,
> };

Why is rt_pmtu now used instead of dst_mtu?
Why is forwarding = False called with dst_metric_raw?
Maybe we should add processing when mtu = rt->rt_pmtu == 0?
Could this be an error?


I found a thread discussing a similar problem. It suggested porting
the next patch:
Signed-off-by: Rundong Ge <rdong.ge@xxxxxxxxx>
---
include/net/ip.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/include/net/ip.h b/include/net/ip.h
index 29d89de..0512de3 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -450,6 +450,8 @@ static inline unsigned int
ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
static inline unsigned int ip_skb_dst_mtu(struct sock *sk,
const struct sk_buff *skb)
{
+ if ((skb_dst(skb)->flags & DST_FAKE_RTABLE) && skb->dev)
+ return min(skb->dev->mtu, IP_MAX_MTU);
if (!sk || !sk_fullsock(sk) || ip_sk_use_pmtu(sk)) {
bool forwarding = IPCB(skb)->flags & IPSKB_FORWARDED;


Why was this patch not accepted in the end?
--
Best regards,
Slava.