Re: [PATCH] ipv6: Honor route mtu if it is within limit of dev mtu
From: David Ahern
Date: Wed Feb 24 2021 - 15:30:24 EST
On 2/22/21 9:32 AM, Kaustubh Pandey wrote:
> When netdevice MTU is increased via sysfs, NETDEV_CHANGEMTU is raised.
>
> addrconf_notify -> rt6_mtu_change -> rt6_mtu_change_route ->
> fib6_nh_mtu_change
>
> As part of handling NETDEV_CHANGEMTU notification we land up on a
> condition where if route mtu is less than dev mtu and route mtu equals
> ipv6_devconf mtu, route mtu gets updated.
>
> Due to this v6 traffic end up using wrong MTU then configured earlier.
> This commit fixes this by removing comparison with ipv6_devconf
> and updating route mtu only when it is greater than incoming dev mtu.
>
> This can be easily reproduced with below script:
> pre-condition:
> device up(mtu = 1500) and route mtu for both v4 and v6 is 1500
>
> test-script:
> ip route change 192.168.0.0/24 dev eth0 src 192.168.0.1 mtu 1400
> ip -6 route change 2001::/64 dev eth0 metric 256 mtu 1400
> echo 1400 > /sys/class/net/eth0/mtu
> ip route change 192.168.0.0/24 dev eth0 src 192.168.0.1 mtu 1500
> echo 1500 > /sys/class/net/eth0/mtu
>
> Signed-off-by: Kaustubh Pandey <kapandey@xxxxxxxxxxxxxx>
> ---
> net/ipv6/route.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 1536f49..653b6c7 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -4813,8 +4813,7 @@ static int fib6_nh_mtu_change(struct fib6_nh *nh, void *_arg)
> struct inet6_dev *idev = __in6_dev_get(arg->dev);
> u32 mtu = f6i->fib6_pmtu;
>
> - if (mtu >= arg->mtu ||
> - (mtu < arg->mtu && mtu == idev->cnf.mtu6))
> + if (mtu >= arg->mtu)
> fib6_metric_set(f6i, RTAX_MTU, arg->mtu);
>
> spin_lock_bh(&rt6_exception_lock);
>
The existing logic mirrors what is done for exceptions, see
rt6_mtu_change_route_allowed and commit e9fa1495d738.
It seems right to me to drop the mtu == idev->cnf.mtu6 comparison in
which case the exceptions should do the same.
Added author of e9fa1495d738 in case I am overlooking something.
Test case should be added to tools/testing/selftests/net/pmtu.sh, and
did you run that script with the proposed change?